Friday, January 18, 2008

Optimizing Java DB in unit test environments

A number of folks use Java DB/Apache Derby for their unit testing. In this environment, your requirements are different:

  • You don't care about durability. If some data is lost, fine, run the test again
  • You do care about getting through your tests as quickly as possible
  • You want to regularly start with a fresh database, and you don't want "crud" left around at the end of the unit test. In particular, you don't want your test database left around.
Because you don't care about durability, you don't necessarily have to write to disk. You may not be aware of this, but some directories on your system may be backed by memory rather than by a physical disk. On Solaris, for example, /tmp is mounted as tmpfs by default, and some Linux distributions do the same. Check how /tmp is mounted on your own system before relying on this.

So if /tmp is memory-backed on your system and you point your database there, e.g. jdbc:derby:/tmp/mydb, you should get some significant performance improvements.

I'm not sure how you would do this on Windows. Here is one possible tip; note that I haven't tried it myself, it was just the first hit on Google.
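
One portable starting point, on Windows or anywhere else, is the JVM's own temp directory, exposed through the standard java.io.tmpdir system property. The sketch below builds a Derby URL from it. The class and method names are made up for illustration, and nothing guarantees that this directory is memory-backed on a given machine; the ;create=true attribute simply asks Derby to create the database if it does not exist yet.

```java
public class TempDbUrl {

    /**
     * Builds an embedded Derby URL for a database called dbName inside dir.
     * Backslashes are turned into forward slashes and trailing slashes are
     * dropped, so Windows-style paths such as C:\Temp\ also work.
     */
    public static String urlFor(String dir, String dbName) {
        String normalized = dir.replace('\\', '/');
        while (normalized.endsWith("/")) {
            normalized = normalized.substring(0, normalized.length() - 1);
        }
        return "jdbc:derby:" + normalized + "/" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        // java.io.tmpdir is a standard JVM property; whether the directory
        // it names lives in memory depends entirely on the host system.
        String tmp = System.getProperty("java.io.tmpdir");
        System.out.println(urlFor(tmp, "mydb"));
    }
}
```

Pass the resulting string to DriverManager.getConnection() as you would any other Derby URL.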

Another thing you can do to speed things up is to configure Java DB to not bother syncing to disk. You do this by setting derby.system.durability=test, either as a system property when you start your VM or in the derby.properties file. Specifically, when you set this property, the store system will not force I/O synchronization calls for:
  • the log file at each commit.
  • the log file before a data page is forced to disk.
  • page allocation when a file is grown.
  • data writes during checkpoints.
You can see this documentation page for more information.
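
As a rough sketch, the property can also be set from test code, as long as that happens before the Derby engine boots (that is, before the first connection is opened). The class and method names here are hypothetical; only the property name and value come from the text above.

```java
public class DurabilitySetup {

    static final String DURABILITY_KEY = "derby.system.durability";

    /**
     * Relaxes durability for test runs. Call this before the first
     * connection is made, otherwise the engine has already started
     * with its default settings.
     */
    public static void relaxDurabilityForTests() {
        System.setProperty(DURABILITY_KEY, "test");
    }

    public static void main(String[] args) {
        relaxDurabilityForTests();
        System.out.println(DURABILITY_KEY + "=" + System.getProperty(DURABILITY_KEY));
    }
}
```

The equivalent without code is -Dderby.system.durability=test on the java command line, or the line derby.system.durability=test in derby.properties.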

Now, what about removing that pesky database directory when you're done with a test? Java DB does not provide any mechanism for doing that, so you're going to have to do this yourself.

There are two approaches to doing this. First of all, you could drop all the tables after each test run and re-create them, re-using the same database each time.

Alternatively, you can have a "model" database set up, and after each test run you use Java file APIs to blow away the old database directory [1] and replace it with a fresh copy of the "model".
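
Here is one way the "model" approach might look using only the JDK's java.nio.file API (Java 8 or later assumed). The class and method names are invented for this sketch, and it assumes the database has already been shut down and that the parent of the target directory exists.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ModelDatabase {

    /** Deletes dir and everything under it; does nothing if dir is absent. */
    public static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order puts children ahead of their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Copies the model directory tree to target, which must not exist yet. */
    public static void copyRecursively(Path model, Path target) throws IOException {
        try (Stream<Path> paths = Files.walk(model)) {
            // Files.walk visits a directory before its contents, so each
            // directory is created before the files inside it are copied.
            paths.forEach(src -> {
                try {
                    Files.copy(src, target.resolve(model.relativize(src)));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Replaces target with a fresh copy of the model database. */
    public static void resetFromModel(Path model, Path target) throws IOException {
        deleteRecursively(target);
        copyRecursively(model, target);
    }
}
```

A test's setup step could then call ModelDatabase.resetFromModel(Paths.get("model/mydb"), Paths.get("/tmp/mydb")) before connecting; both paths are placeholders.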

Note: before you try to remove the database, you should probably shut it down. Do this by connecting to it with the ";shutdown=true" attribute added to your URL.
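
One thing to watch for: Derby reports a successful shutdown by throwing an SQLException, with SQLState 08006 for a single database. A minimal sketch of a shutdown helper that treats that case as success might look like this; the class and method names are hypothetical.

```java
import java.sql.DriverManager;
import java.sql.SQLException;

public class DerbyShutdown {

    /** True if e is the exception Derby raises after a clean single-database shutdown. */
    public static boolean isCleanShutdown(SQLException e) {
        return "08006".equals(e.getSQLState());
    }

    /** Shuts down the database at url, rethrowing anything that is a real failure. */
    public static void shutdown(String url) throws SQLException {
        try {
            DriverManager.getConnection(url + ";shutdown=true");
        } catch (SQLException e) {
            if (!isCleanShutdown(e)) {
                throw e;
            }
        }
    }
}
```

Calling DerbyShutdown.shutdown("jdbc:derby:/tmp/mydb") before deleting the directory should leave no open files behind.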

The trick is, your code needs to know where this database directory lives. Here are some possible approaches:
  • If you are fully qualifying paths to your databases, you know exactly where they live. But usually you wouldn't do this, because it makes your tests non-portable.
  • You can let test runners specify a property that indicates where the database directories should go. Then you prepend this to the JDBC URL, and also use it when you're blowing away the directory. For example:

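public class MyTest {

    // getTestProperty, copyModel, connectToDb and deleteDatabase are
    // helper methods you would write yourself; they are not shown here.
    private Connection conn;
    private String dbdir;
    private String dbname;
    private String url;

    public void setUp() {
        dbdir = getTestProperty("database.dir", "/tmp");
        dbname = getTestProperty("database.name", "mydb");
        copyModel(dbdir, dbname);
        url = "jdbc:derby:" + dbdir + "/" + dbname;
        conn = connectToDb(url);
    }

    public void tearDown() {
        // A successful shutdown is reported as an SQLException and never
        // returns a connection, so connectToDb is assumed to swallow it.
        connectToDb(url + ";shutdown=true");
        deleteDatabase(dbdir, dbname);
    }
}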
  • You can use some knowledge about how Java DB determines where to place a database. It follows a simple rule: if derby.system.home is set, place the database directory there. If that's not set, use the Java system property user.dir. So you can follow the same logic to figure out where your database is and then blow it away (and put a new one there in its place).
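
The lookup rule in the last bullet can be sketched in a few lines. The class and method names are invented for illustration, and the sketch assumes derby.system.home, when used, is set as a Java system property in the same VM.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DatabaseLocator {

    /**
     * Guesses the directory of a database named dbName: under
     * derby.system.home if that property is set, otherwise under
     * user.dir. An absolute dbName is returned unchanged.
     */
    public static Path locate(String dbName) {
        String home = System.getProperty("derby.system.home");
        if (home == null) {
            home = System.getProperty("user.dir");
        }
        return Paths.get(home).resolve(dbName).toAbsolutePath();
    }

    public static void main(String[] args) {
        System.out.println(locate("mydb"));
    }
}
```

The returned path is what you would hand to your directory-deleting code once the database has been shut down.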


[1] You can write the code yourself to blow away a directory, or use the handy FileUtils.deleteDirectory() from Apache Commons IO.
