CS133 Lab 4: SimpleDB Transactions

Deadlines:

Lab Description

In this lab, you will implement a simple locking-based transaction system in SimpleDB using page-level locking. You will need to add lock and unlock calls at the appropriate places in your code, as well as code to track the locks held by each transaction and grant locks to transactions as they are needed.

The remainder of this document describes what is involved in adding transaction support and provides a basic outline of how you might add this support to your database.

As with the previous lab, we recommend that you start as early as possible. Locking and transactions can be quite tricky to debug!

1. Getting started

You should begin with the code you submitted for Lab 2. (If you did not submit code for Lab 2, or your solution didn't work properly, contact us to discuss options.) Note: you may also add this extra code to your solution for Lab 3 instead of Lab 2. There are two downloads for this Lab:
  1. In the tar file you will download below, we have provided you with extra test cases as well as one new source code file (DeadlockException) for this project that are not in the original code distribution you received. We reiterate that the unit tests we provide are to help guide your implementation along, but they are not intended to be comprehensive or to establish correctness.
  2. You can also download a modified Buffer Pool class: BufferPool.java that contains skeleton code for a Lock Manager class. You may use it to replace your existing Buffer Pool class, use only certain parts, or use none of it if you would rather write your Lock Manager another way. The test cases only assume your Buffer Pool has the same methods.

For the first download, the easiest way to do this is to untar the new code in the same directory as your top-level simpledb directory, as follows:

Now all files from lab 2 and lab 4 will be in the cs133-lab4 directory.

To work in Eclipse, create a new java project named cs133-lab4 like you did for previous labs.

2. Transactions, Locking, and Concurrency Control

Before starting, you should make sure you understand what a transaction is and how strict two-phase locking (which you will use to ensure isolation and atomicity of your transactions) works.

In the remainder of this section, we briefly overview these concepts and discuss how they relate to SimpleDB.

2.1. Transactions

A transaction is a group of database actions (e.g., inserts, deletes, and reads) that are executed atomically; that is, either all of the actions complete or none of them do, and it is not apparent to an outside observer of the database that these actions were not completed as a part of a single, indivisible action.

2.2. The ACID Properties

To help you understand how transaction management works in SimpleDB, we briefly review how it ensures that the ACID properties are satisfied:

2.3. Recovery and Buffer Management

To simplify your job, in this lab you will implement a NO STEAL/FORCE buffer management policy. As we discussed in class, this means that you never evict dirty pages from the buffer pool while the transaction that dirtied them is still running (NO STEAL), and that you force a transaction's updates to disk when it commits (FORCE). To further simplify your life, you may assume that SimpleDB will not crash while processing a transactionComplete command. Note that these three points mean that you do not need to implement log-based recovery in this lab, since you will never need to undo any work (you never evict dirty pages) and you will never need to redo any work (you force updates on commit and will not crash during commit processing).

You will implement NO STEAL in Exercise 3 (Section 2.6) and FORCE in Exercise 4 (Section 2.7).

2.4. Granting Locks

The modified BufferPool.java includes code that allows a caller to request or release a (shared or exclusive) lock on a specific object on behalf of a specific transaction. The code in the modified BufferPool.java that provides this functionality leaves most of the work to a Lock Manager; you will write much of the code in the Lock Manager.

In the LockManager class, you will need to create data structure(s) that keep track of which locks each transaction holds and that check to see if a lock should be granted to a transaction when it is requested.

You will implement shared and exclusive locks; recall that these work as follows:

A transaction will need to acquire a shared lock on any page before it reads it, and an exclusive lock on any page before it writes it. You will notice that we are already passing around Permissions objects in the BufferPool; these objects indicate the type of lock that the caller would like to have on the object being accessed (we have given you the code for the Permissions class). Permissions.READ_ONLY indicates you need a shared lock, while Permissions.READ_WRITE requires an exclusive lock.
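The shared/exclusive rules can be sketched as a small compatibility check. This is only an illustration under stated assumptions: the Perm enum stands in for SimpleDB's Permissions class, transaction ids are plain long values, and canGrant is a hypothetical helper that is not part of the provided code. It also assumes that a transaction that is the sole holder of a page may upgrade its lock.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class LockCompatibilitySketch {
    // Stand-in for SimpleDB's Permissions: READ_ONLY = shared, READ_WRITE = exclusive.
    enum Perm { READ_ONLY, READ_WRITE }

    /**
     * Decides whether transaction tid may be granted `want` on a page that is
     * currently held by `holders` in mode `held` (ignored when holders is empty).
     */
    static boolean canGrant(long tid, Perm want, Set<Long> holders, Perm held) {
        if (holders.isEmpty()) {
            return true; // unlocked page: any request succeeds
        }
        if (holders.size() == 1 && holders.contains(tid)) {
            return true; // sole holder: re-request or upgrade to exclusive
        }
        // Other transactions are involved: only shared-on-shared is compatible.
        return want == Perm.READ_ONLY && held == Perm.READ_ONLY;
    }

    public static void main(String[] args) {
        Set<Long> t2Only = Collections.singleton(2L);
        Set<Long> t1AndT2 = new HashSet<>(Arrays.asList(1L, 2L));
        // T1 asks for a shared lock while T2 holds a shared lock.
        System.out.println(canGrant(1L, Perm.READ_ONLY, t2Only, Perm.READ_ONLY));
        // T1 asks for an exclusive lock while T2 holds a shared lock.
        System.out.println(canGrant(1L, Perm.READ_WRITE, t2Only, Perm.READ_ONLY));
        // T1 tries to upgrade while T2 also holds a shared lock.
        System.out.println(canGrant(1L, Perm.READ_WRITE, t1AndT2, Perm.READ_ONLY));
    }
}
```

Your own locked() method may organize this logic differently; the point is only that every combination of requested and held modes has a definite answer.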

If a transaction requests a lock that it should not be granted, your code should block, waiting for that lock to become available (i.e., be released by another transaction running in a different thread). The code for waiting (a "sleep") is implemented for you in the Lock Manager class.

Exercise 1. The Buffer Pool will be responsible for granting and releasing page-level locks; most of the work for this functionality is done by a helper class called the Lock Manager in the modified BufferPool.java. LockManager is a private inner class within BufferPool.

Take a look at BufferPool.getPage() and note that before getting the page for the caller, getPage() first attempts to acquire the lock on the requested page using lockmgr. You won't need to add more code to acquireLock() until Exercise 5, but you may wish to look at it briefly now as it uses methods you will implement in this exercise.

For now, you will implement the core functionality in the Lock Manager class for getting and releasing locks.

  1. LockManager constructor: Your constructor should create whatever data structure(s) you will be using to represent your lock table. The design of this is entirely up to you. You may decide to use multiple data structures, create a helper class, etc. As a design guideline, you should ensure your data structure(s) allow you to answer these questions:
    • Given a transactionId, which pages does it have locked?
    • Given a page Id, which transactions hold a lock on the page?
    • Given a page, which Permissions is it locked with?
    If you create a helper class for which you will eventually want to check equality of instances, be sure to implement its equals() method.

  2. lock(): To get a lock, this method first checks if the given transaction can acquire a lock using locked(), which you will implement next. Right now you should just add code to lock() to update your lock table assuming it's okay (this is the "else" case in the code).

  3. locked(): This method returns a boolean indicating whether a transaction is "locked out" from acquiring a lock on the given page with the given permissions. Logic for this method appears in the code in comments above the method. Be careful with == vs. equals and be sure to check Java documentation for whatever data structures you are using for your lock table to see when methods can return null.

  4. holdsLock(): Simple method used by Buffer Pool to determine whether the given transaction has any type of lock on the given page.

  5. releaseLock(): Release whatever lock the given transaction has on the given page, updating your lock table. Used for testing (used by BufferPool.releasePage()) and to help at the end of transactions, as you'll see in Exercise 4.
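One possible shape for a lock table that answers the three design questions above is sketched here. It is not the skeleton's LockManager: the class name, the field names, the use of long and int in place of TransactionId and PageId, and the boolean return value of lock() are all assumptions made to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LockTableSketch {
    enum Perm { READ_ONLY, READ_WRITE } // stand-in for Permissions

    private final Map<Long, Set<Integer>> pagesByTxn = new HashMap<>();    // tid -> locked pages
    private final Map<Integer, Set<Long>> holdersByPage = new HashMap<>(); // page -> holders
    private final Map<Integer, Perm> modeByPage = new HashMap<>();         // page -> lock mode

    /** True if tid is "locked out" of acquiring `want` on page pid. */
    public synchronized boolean locked(long tid, int pid, Perm want) {
        Set<Long> holders = holdersByPage.get(pid); // Map.get returns null when absent
        if (holders == null || holders.isEmpty()) return false;
        if (holders.size() == 1 && holders.contains(tid)) return false; // sole holder
        return !(want == Perm.READ_ONLY && modeByPage.get(pid) == Perm.READ_ONLY);
    }

    /** Records the lock if it can be granted; returns false otherwise. */
    public synchronized boolean lock(long tid, int pid, Perm want) {
        if (locked(tid, pid, want)) return false;
        holdersByPage.computeIfAbsent(pid, k -> new HashSet<>()).add(tid);
        pagesByTxn.computeIfAbsent(tid, k -> new HashSet<>()).add(pid);
        // Never downgrade: an exclusive holder that re-requests READ_ONLY keeps READ_WRITE.
        if (want == Perm.READ_WRITE || !modeByPage.containsKey(pid)) {
            modeByPage.put(pid, want);
        }
        return true;
    }

    public synchronized boolean holdsLock(long tid, int pid) {
        Set<Integer> pages = pagesByTxn.get(tid);
        return pages != null && pages.contains(pid);
    }

    public synchronized void releaseLock(long tid, int pid) {
        Set<Long> holders = holdersByPage.get(pid);
        if (holders != null) {
            holders.remove(tid);
            if (holders.isEmpty()) { // last holder gone: forget the page entirely
                holdersByPage.remove(pid);
                modeByPage.remove(pid);
            }
        }
        Set<Integer> pages = pagesByTxn.get(tid);
        if (pages != null) {
            pages.remove(pid);
            if (pages.isEmpty()) pagesByTxn.remove(tid);
        }
    }

    public static void main(String[] args) {
        LockTableSketch table = new LockTableSketch();
        table.lock(1L, 7, Perm.READ_ONLY);
        table.lock(2L, 7, Perm.READ_ONLY);
        System.out.println(table.locked(1L, 7, Perm.READ_WRITE)); // T2 still shares page 7
        table.releaseLock(2L, 7);
        System.out.println(table.locked(1L, 7, Perm.READ_WRITE)); // T1 is now the sole holder
    }
}
```

Keeping both directions of the mapping (transaction to pages, page to transactions) costs some bookkeeping in releaseLock(), but it makes end-of-transaction cleanup a simple lookup.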

You may need to implement the next exercise before your code passes the unit tests in LockingTest.

Note: if it seems like LockingTest is hanging forever before it even runs any of its tests, the problem is likely happening in the setup() for LockingTest! Check out what happens there. In particular, does your implementation of locked() allow a transaction to get a lock it already has? Does it allow a transaction to get a lock on a page if no other transaction holds a lock on that page?

Debugging tip: When running tests with ant, note that the printing of standard output will be delayed until the tests complete. If this is making debugging hard, you could try temporarily commenting out parts of the unit tests.

2.5. Lock Lifetime

Now that you've implemented the core functionality for acquiring and releasing locks, you will need to implement strict two-phase locking. Recall that this means that transactions should acquire the appropriate type of lock on any object before accessing that object and shouldn't release any locks until after the transaction commits (or aborts).

Depending on your implementation, it is possible that you may not have to acquire a lock anywhere besides what you've already implemented in Buffer Pool. It is up to you to verify this in the next exercise!

You will need to think about when to release locks as well. It is clear that you should release all locks associated with a transaction after it has committed or aborted to ensure strict 2PL. You will implement this later in Exercise 4. However, it is possible for there to be other scenarios in which releasing a lock before a transaction ends might be useful. For instance, you may release a shared lock on a page after scanning it to find empty slots (as described in Exercise 2 below).

Exercise 2. Ensure that you acquire and release locks throughout SimpleDB. Some (but not necessarily all) actions that you should verify work properly:
  1. Reading tuples off of pages during a SeqScan (if you implemented locking in BufferPool.getPage(), this should work correctly as long as your HeapFile.iterator() uses BufferPool.getPage().)

  2. Inserting and deleting tuples through BufferPool and HeapFile methods. Note that your implementation of HeapFile.insertTuple() and HeapFile.deleteTuple(), as well as the implementation of the iterator returned by HeapFile.iterator(), should access pages using BufferPool.getPage(). Double check that these different uses of BufferPool.getPage() pass the correct permissions object (e.g., Permissions.READ_WRITE or Permissions.READ_ONLY).

  3. Marking dirty pages. You may also wish to double check that your implementations of BufferPool.insertTuple() and BufferPool.deleteTuple() call markDirty() on any of the pages they access (you should have done this when you implemented this code in Lab 2, but we did not test for this case).

  4. You will also want to think about acquiring and releasing locks in the following situations:
    • Adding a new page to a HeapFile. When do you physically write the page to disk? Are there race conditions with other transactions (on other threads) that might need special attention at the HeapFile level, regardless of page-level locking? You may need to synchronize the part of your code where you write a blank page to disk, e.g.,
      synchronized (this) { // code to write blank page }
    • Looking for an empty slot into which you can insert tuples. Most implementations scan pages looking for an empty slot, and will need a READ_ONLY lock to do this. Surprisingly, however, if a transaction t finds no free slot on a page p, t may immediately release the lock on p. Although this apparently contradicts the rules of two-phase locking, it is ok because t did not use any data from the page, such that a concurrent transaction t' which updated p cannot possibly affect the answer or outcome of t.
At this point, your code should pass the unit tests in LockingTest.
Code hanging? See the notes at the end of Exercise 1.
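The two HeapFile situations described in Exercise 2 (releasing the lock on a full page early, and synchronizing the append of a blank page) can be illustrated with a small single-threaded simulation. Everything here is hypothetical: Page only counts free slots, heldLocks stands in for the locks one transaction holds, and insertTuple does not match the real HeapFile signature. One detail worth noting is that the sketch releases a page's lock early only if the scan itself acquired it, so a lock from an earlier write by the same transaction is kept.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InsertScanSketch {
    static final int SLOTS_PER_PAGE = 4;

    /** Hypothetical page that only tracks its number of free slots. */
    static class Page {
        int freeSlots;
        Page(int freeSlots) { this.freeSlots = freeSlots; }
    }

    final List<Page> pages = new ArrayList<>();
    final Set<Integer> heldLocks = new HashSet<>(); // page numbers locked by this transaction

    /** Inserts one tuple and returns the number of the page that received it. */
    int insertTuple() {
        for (int pgNo = 0; pgNo < pages.size(); pgNo++) {
            // Stands in for BufferPool.getPage(tid, pid, READ_ONLY).
            boolean newlyAcquired = heldLocks.add(pgNo);
            Page p = pages.get(pgNo);
            if (p.freeSlots > 0) {
                p.freeSlots--; // real code would request READ_WRITE before modifying
                return pgNo;   // lock is kept until the transaction ends
            }
            if (newlyAcquired) {
                heldLocks.remove(pgNo); // full page, nothing read from it: release early
            }
        }
        synchronized (this) { // serialize appending a blank page to the file
            pages.add(new Page(SLOTS_PER_PAGE - 1)); // new page already holds our tuple
            int pgNo = pages.size() - 1;
            heldLocks.add(pgNo);
            return pgNo;
        }
    }

    public static void main(String[] args) {
        InsertScanSketch file = new InsertScanSketch();
        file.pages.add(new Page(0)); // full page
        file.pages.add(new Page(1)); // one free slot
        System.out.println("inserted on page " + file.insertTuple());
        System.out.println("locks held: " + file.heldLocks);
    }
}
```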

2.6. Implementing NO STEAL

In a NO STEAL policy, updates from a transaction cannot be written to disk before it commits. This means we must be sure not to evict dirty pages from the buffer pool until commit time.

Note that, in general, evicting a clean page that is locked by a running transaction is OK when using NO STEAL, as long as your lock manager keeps information about evicted pages around, and as long as none of your operator implementations keep references to Page objects which have been evicted. You don't need to do this for the lab.

Exercise 3. Implement the necessary logic for page eviction without evicting dirty pages. You will need to modify the evictPage method in BufferPool. In particular, it must never evict a dirty page. If your eviction policy prefers a dirty page for eviction, you will have to find a way to evict an alternative page. In the case where all pages in the buffer pool are dirty, you should throw a DbException.
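A minimal sketch of an eviction routine that skips dirty pages is shown below. The Page class, its dirty flag, the pool map, and AllPagesDirtyException are stand-ins invented for this example; in your BufferPool you would use the real Page interface and throw DbException as described above. The eviction order here is simply insertion order, which is an arbitrary choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NoStealEvictionSketch {
    /** Hypothetical page with a plain dirty flag. */
    static class Page {
        final int id;
        boolean dirty;
        Page(int id, boolean dirty) { this.id = id; this.dirty = dirty; }
    }

    /** Stand-in for DbException; unchecked here only to keep the sketch short. */
    static class AllPagesDirtyException extends RuntimeException {
        AllPagesDirtyException(String msg) { super(msg); }
    }

    // LinkedHashMap iterates in insertion order, so the oldest clean page is chosen.
    final Map<Integer, Page> pool = new LinkedHashMap<>();

    void evictPage() {
        Integer victim = null;
        for (Page p : pool.values()) {
            if (!p.dirty) { // NO STEAL: only clean pages are candidates
                victim = p.id;
                break;
            }
        }
        if (victim == null) {
            throw new AllPagesDirtyException("every page in the buffer pool is dirty");
        }
        pool.remove(victim); // removal happens after the loop, not during iteration
    }

    public static void main(String[] args) {
        NoStealEvictionSketch bp = new NoStealEvictionSketch();
        bp.pool.put(1, new Page(1, true));
        bp.pool.put(2, new Page(2, false));
        bp.evictPage();
        System.out.println("remaining pages: " + bp.pool.keySet());
    }
}
```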

This functionality is not tested until you've completed Exercise 4.

2.7. Transactions

In SimpleDB, a TransactionId object is created at the beginning of each query. This object is passed to each of the operators involved in the query. When the query is complete, the BufferPool method transactionComplete is called.

Calling transactionComplete either commits or aborts the transaction, as specified by the parameter flag commit. At any point during its execution, an operator may throw a TransactionAbortedException, which indicates an internal error or deadlock has occurred. The test cases we have provided you with create the appropriate TransactionId objects, pass them to your operators in the appropriate way, and invoke transactionComplete when a query is finished. We have also implemented TransactionId.

Exercise 4. You will now implement transactionComplete to finish a transaction, adhering to a FORCE policy and Strict 2PL.
  1. transactionComplete(tid,commit): This method should first deal with the xact's dirty pages in the buffer pool in a way that adheres to a FORCE policy, and then call releaseAllLocks, which you will implement next. If the xact is committing (i.e., commit == true), you should flush dirty pages associated with the xact to disk. Else, if the xact is aborting, you should throw away any changes to pages that it made (this can be done by removing the page from the buffer pool). Note that there is another version of transactionComplete that takes a single argument; you do not need to add any code there.

  2. releaseAllLocks(tid): This method in the Lock Manager should update your lock table data structure(s) to release all locks held by the given xact. Be careful: Java doesn't like it if you are iterating over a collection while you are also removing elements from that collection; you may end up seeing a ConcurrentModificationException.
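One common way to avoid the ConcurrentModificationException mentioned above is to iterate over a snapshot of the collection while removing from the original. The sketch below assumes a lock table made of two hypothetical maps (pagesByTxn and holdersByPage) keyed by long transaction ids and int page ids; your own data structures will likely differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ReleaseAllLocksSketch {
    final Map<Long, Set<Integer>> pagesByTxn = new HashMap<>();    // tid -> locked pages
    final Map<Integer, Set<Long>> holdersByPage = new HashMap<>(); // page -> holders

    void grant(long tid, int pid) {
        pagesByTxn.computeIfAbsent(tid, k -> new HashSet<>()).add(pid);
        holdersByPage.computeIfAbsent(pid, k -> new HashSet<>()).add(tid);
    }

    void releaseLock(long tid, int pid) {
        Set<Long> holders = holdersByPage.get(pid);
        if (holders != null) {
            holders.remove(tid);
            if (holders.isEmpty()) holdersByPage.remove(pid);
        }
        Set<Integer> pages = pagesByTxn.get(tid);
        if (pages != null) {
            pages.remove(pid); // mutates the set that releaseAllLocks walks over
            if (pages.isEmpty()) pagesByTxn.remove(tid);
        }
    }

    void releaseAllLocks(long tid) {
        Set<Integer> pages = pagesByTxn.get(tid);
        if (pages == null) return; // transaction holds no locks
        // Iterate over a copy: looping over `pages` directly while releaseLock()
        // removes from it would throw ConcurrentModificationException.
        for (Integer pid : new ArrayList<>(pages)) {
            releaseLock(tid, pid);
        }
    }

    public static void main(String[] args) {
        ReleaseAllLocksSketch table = new ReleaseAllLocksSketch();
        table.grant(1L, 10);
        table.grant(1L, 11);
        table.grant(2L, 10);
        table.releaseAllLocks(1L);
        System.out.println("holders by page: " + table.holdersByPage);
    }
}
```

An alternative with the same effect is to walk the collection with an explicit Iterator and call its remove() method.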

At this point, your code should pass the TransactionTest unit test and the AbortEvictionTest system test. You may find the TransactionTest system test illustrative, but it will likely fail until you complete the next exercise.
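The commit and abort paths of transactionComplete described in Exercise 4 can be sketched roughly as follows. This is a simplified model, not the BufferPool API: Page, its dirtiedBy field, the pool map, and the flushed list (which stands in for writing pages to disk) are all assumptions made for illustration, and the call to releaseAllLocks appears only as a comment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class TransactionCompleteSketch {
    /** Hypothetical page; dirtiedBy is null when the page is clean. */
    static class Page {
        final int id;
        Long dirtiedBy;
        Page(int id, Long dirtiedBy) { this.id = id; this.dirtiedBy = dirtiedBy; }
    }

    final Map<Integer, Page> pool = new HashMap<>();
    final List<Integer> flushed = new ArrayList<>(); // records "disk writes" for the demo

    void transactionComplete(long tid, boolean commit) {
        Iterator<Page> it = pool.values().iterator();
        while (it.hasNext()) {
            Page p = it.next();
            if (p.dirtiedBy == null || p.dirtiedBy != tid) {
                continue; // clean, or dirtied by some other transaction
            }
            if (commit) {
                flushed.add(p.id);  // FORCE: write the page out at commit time
                p.dirtiedBy = null; // the in-memory copy is clean again
            } else {
                it.remove();        // abort: discard the modified copy from the pool
            }
        }
        // Strict 2PL: only now would the real code call lockmgr.releaseAllLocks(tid).
    }

    public static void main(String[] args) {
        TransactionCompleteSketch bp = new TransactionCompleteSketch();
        bp.pool.put(1, new Page(1, 1L));
        bp.pool.put(2, new Page(2, 2L));
        bp.transactionComplete(1L, true);  // commit T1
        bp.transactionComplete(2L, false); // abort T2
        System.out.println("flushed: " + bp.flushed + ", cached: " + bp.pool.keySet());
    }
}
```

Removing through the iterator, rather than through the map, is what keeps the abort path safe while the loop is still running.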

2.8. Deadlocks and Aborts

It is possible for transactions in SimpleDB to deadlock due to a cycle of transactions waiting for each other to release locks. You will need to detect and resolve this situation!

There are many possible ways to detect deadlock. For example, you may:

After you have detected that a deadlock exists, you must improve the situation. Suppose you have detected a deadlock while transaction t is waiting for a lock; you can decide to abort t to give other transactions a chance to make progress. In the LockManager class, this is most easily done by aborting the xact t when it tries to acquire a lock that will cause a cycle (if you are trying the waits-for-graph approach) or if too much time has elapsed (if using the timeout approach).

Exercise 5. Implement deadlock detection and resolution in BufferPool.java. If you are using the Lock Manager class, you will be checking for deadlocks (and possibly throwing a DeadlockException) in acquireLock().

Most likely, you will want to check for a deadlock whenever a transaction attempts to acquire a lock and finds another transaction is holding the lock (note that this by itself is not a deadlock, but may be symptomatic of one). For the waits-for-graph approach, you could check if a cycle of transactions waiting has formed. You have many decisions for your deadlock resolution system, but it is not necessary to do something complicated. Please describe your choices in the lab writeup.
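For the waits-for-graph approach, the cycle check could look roughly like the sketch below. The class and method names are hypothetical, transaction ids are plain long values, and the graph is kept as an adjacency map from each waiting transaction to the transactions it waits for. Before a transaction starts waiting, wouldDeadlock() asks whether any current lock holder can already reach the waiter through existing edges; if so, adding the new edges would close a cycle.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WaitsForGraphSketch {
    // waiter -> transactions it is currently waiting for
    private final Map<Long, Set<Long>> waitsFor = new HashMap<>();

    void addEdge(long waiter, long holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    /** Call when a waiter obtains its lock or is aborted. */
    void removeWaiter(long waiter) {
        waitsFor.remove(waiter);
    }

    /** True if making `waiter` wait for `holders` would create a cycle. */
    boolean wouldDeadlock(long waiter, Set<Long> holders) {
        Set<Long> visited = new HashSet<>();
        for (long holder : holders) {
            if (holder != waiter && reaches(holder, waiter, visited)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search over existing edges; `visited` prevents revisiting nodes.
    private boolean reaches(long from, long target, Set<Long> visited) {
        if (from == target) return true;
        if (!visited.add(from)) return false;
        for (long next : waitsFor.getOrDefault(from, Collections.emptySet())) {
            if (reaches(next, target, visited)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        WaitsForGraphSketch graph = new WaitsForGraphSketch();
        graph.addEdge(1L, 2L); // T1 is waiting for a lock held by T2
        // T2 now requests a lock held by T1: waiting would close the cycle T2 -> T1 -> T2.
        System.out.println(graph.wouldDeadlock(2L, Collections.singleton(1L)));
        // T3 requesting the same lock only forms the chain T3 -> T1 -> T2.
        System.out.println(graph.wouldDeadlock(3L, Collections.singleton(1L)));
    }
}
```

Aborting the transaction whose request would close the cycle is the simplest resolution, but it is only one of several reasonable policies.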

Aborting a transaction
To abort a transaction in acquireLock, you can simply throw a DeadlockException. This is caught by BufferPool.getPage and converted to a TransactionAbortedException, which in turn will be caught by the code executing the transaction (e.g., TransactionTest.java), which should call transactionComplete() to clean up after the transaction. You are not expected to automatically restart a transaction which fails due to a deadlock -- you can assume that higher level code in the unit tests will take care of this.
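The abort path just described can be traced with a small stand-alone sketch that uses the timeout approach. The two exception classes here are local stand-ins with the same names as the SimpleDB ones, and the method signatures, the BooleanSupplier used to model a lock attempt, and the 20 ms timeout are all assumptions chosen for the example rather than values from the lab code.

```java
import java.util.function.BooleanSupplier;

class AbortPathSketch {
    // Local stand-ins for the SimpleDB exception classes of the same names.
    static class DeadlockException extends Exception {}
    static class TransactionAbortedException extends Exception {}

    /** Timeout-based detection: give up once the lock has been unavailable too long. */
    static void acquireLock(BooleanSupplier tryLock, long timeoutMs) throws DeadlockException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!tryLock.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new DeadlockException(); // presumed deadlock
            }
            try {
                Thread.sleep(1); // wait briefly before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new DeadlockException(); // treat interruption as an abort
            }
        }
    }

    /** Mirrors getPage(): converts the lock manager's exception for callers. */
    static void getPage(BooleanSupplier tryLock, long timeoutMs)
            throws TransactionAbortedException {
        try {
            acquireLock(tryLock, timeoutMs);
        } catch (DeadlockException e) {
            throw new TransactionAbortedException();
        }
    }

    /** Mirrors the calling code: returns true on commit, false on abort. */
    static boolean runTransaction(BooleanSupplier tryLock) {
        try {
            getPage(tryLock, 20);
            return true;  // real code: transactionComplete(tid, true)
        } catch (TransactionAbortedException e) {
            return false; // real code: transactionComplete(tid, false) to clean up
        }
    }

    public static void main(String[] args) {
        System.out.println(runTransaction(() -> true));  // lock granted immediately
        System.out.println(runTransaction(() -> false)); // lock never granted
    }
}
```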

Unit testing
We have provided some (not-so-unit) tests in DeadlockTest. They are actually a bit involved, so they may take more than a few seconds to run (depending on your policy). If they seem to hang indefinitely, then you probably have an unresolved deadlock. These tests construct simple deadlock situations that your code should be able to escape. The tests will print to the console the TransactionAbortedExceptions corresponding to the deadlocks they successfully resolved.

Note that there are two timing parameters near the top of DeadlockTest.java; these determine the frequency at which the test checks if locks have been acquired and the waiting time before an aborted transaction is restarted. You may observe different performance characteristics by tweaking these parameters if you use a timeout-based detection method.

In addition to DeadlockTest, your code should now pass the TransactionTest system test (which may also run for quite a long time, but times out at 10 minutes).

Debugging tip: TransactionTest runs actual queries--see the run method in XactionTester. So if DeadlockTest passes but TransactionTest does not, the issue may lie with the query plan operators that are used by this test, namely, Insert.java and Delete.java. If your implementation of the operators suppresses TransactionAbortedExceptions because it catches all exceptions, that could be the issue. Feel free to consult Lab 2 solution code.

3. Logistics

You must submit your code (see below) as well as a short (3 page, maximum) writeup describing your approach. This writeup should:

3.1. Collaboration

This lab can be completed alone or with a partner. Please indicate clearly who you worked with, if anyone, on your individual writeup.

3.2. Submitting your assignment

For all deadlines besides the final version, you only need to submit the cs133-lab4.tar.gz tarball (such that, untarred, it creates a cs133-lab4/src/java/simpledb directory with your code). You can use the ant handin target to generate the tarball.

For the final version of the lab, the files you need to submit are:

Submit all your files for Exercises 1-2 under "Lab 4: Part 1" and the final version of the lab under "Lab 4: Final" on Sakai.

3.3. Grading

Your grade for the lab will be based on the final version after all exercises are complete.

75% of your grade will be based on whether or not your code passes the system test suite we will run over it. These tests will be a superset of the tests we have provided. Before handing in your code, you should make sure it produces no errors (passes all of the tests) from both ant test and ant systemtest.

Important: before testing, we will replace your build.xml and the entire contents of the test directory with our version of these files. This means you cannot change the format of .dat files! You should also be careful changing our APIs. You should test that your code compiles with the unmodified tests. In other words, we will untar your tarball, replace the files mentioned above, compile it, and then grade it. It will look roughly like this:

$ tar xvzf cs133-lab4.tar.gz
$ cd ./cs133-lab4
[replace build.xml and test]
$ ant test
$ ant systemtest
[additional tests]

If any of these commands fail, we'll be unhappy, and, therefore, so will your grade.

An additional 25% of your grade will be based on the quality of your writeup, our subjective evaluation of your code, and on-time submission for the intermediate deadlines.

ENJOY!!

Acknowledgements

Thanks to our friends and colleagues at MIT and UWashington for doing all the heavy lifting on creating SimpleDB!