Grails, Cassandra: Giving each test a clean DB to work with

If you are using Grails, you must be used to the ease with which your integration tests do not have to bother about what other tests are doing in the application database. Grails achieves it by starting a new transaction before each test and rolls it back when the test gets over – whether it passes or fails. So, the actual changes made to data are never written to the database and you are happy to have this isolation between tests, as it allows you to focus one each test independently instead of worrying about side-effects.

But what if the database you used did not support transactions? How do you achieve the same isolation then?

For such a database – Cassandra – it can be achieved by cleaning up after each integration test the data from all the Column Families (tables) that the Keyspace (schema) had (Ability to hook into Grails events made it quite easy). Not every test makes too many changes in the database, so it can work out quite OK – obviously, not as fast as Hibernate + RDBMS combo, but acceptable.

The steps I used:

  • Describe the keyspace and find out all the column families in it
  • For each column family you want to clean-up data from:
    • Do range queries on the column family, skipping any phantom records found (Use Pagination, if needed)
    • Collect all valid row keys
    • Do a batched mutation for all the row keys to clean-up the column family data

So, all worked well, but it seemed like too much clean-up work. I looked around and found that Cassandra supported a “truncate” operation, which was sadly missing in Hector API, so I promptly requested for the API gap to be filled. The “truncate” operation was soon made available in Hector API and it cut down our clean-up code from 20-lines of doing range-scans-and-picking-up-keys-to-delete to one-liner “truncate” call.

Over time, the schema grew (more column families, indexes) as well as the application (more integration tests needing more clean-up cycles) and the integration tests that used to finish in seconds started taking many minutes. For some time I doubted that maybe Cassandra upgrades have introduced some new configuration that should be tweaked, but then I measured. To my utmost surprise, it showed that each invocation of the one-liner “truncate” call had started spending 4-5 seconds kind of time in Hector + Cassandra, and in an overall integration tests cycle of 5.5 minutes, 5 minutes were in clean-up and 0.5 in actual tests πŸ™‚

We immediately switched back to olden ways of doing range-scans-and-batch-mutations and here is the difference in overall clean-up time (nearly 75 invocations in the whole integration test phase): 5 minutes vs 0.05 minutes! So, we are happily back to our tests taking 0.55 minute instead of 5.5 minutes!

Please, please do not use Hector + Cassandra’s “truncate” calls frequently to provide your tests a clean DB slate in your Grails / Java applications! Its performance is bad!

Also, sometimes 20 lines of code can be better than a one-liner. πŸ™‚