What is the best approach to writing unit tests for code that persists data to a NoSQL data store, in our case Cassandra?
=> We currently use an embedded server, started with the EmbeddedServerHelper utility from the Hector project on GitHub (https://github.com/hector-client/hector/blob/master/test/src/main/java/me/prettyprint/hector/testutils/EmbeddedServerHelper.java). However, I have run into two issues with it. 1) Data persists across test cases, which makes it hard to ensure that each test case in a test class works with its own data. I tried calling cleanUp in an @After method after each test case, but that doesn't seem to remove the data. 2) We run out of memory as we add more tests. This could be a consequence of 1), but I am not sure of that yet. I currently run the build with a 1 GB heap.
=> The other approach I have been considering is to mock the Cassandra storage. But that might let problems with the Cassandra schema slip through, since the embedded-server approach has often caught issues with the way data is stored in Cassandra.
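To make the mocking option concrete, here is a minimal sketch of an in-memory stand-in placed behind a storage interface. The names UserStore and InMemoryUserStore are hypothetical and are not part of Hector or Cassandra; the sketch assumes the production code talks to storage only through such an interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class InMemoryStoreSketch {

    /** Hypothetical storage abstraction; not a Hector or Cassandra type. */
    interface UserStore {
        void save(String rowKey, Map<String, String> columns);
        Optional<Map<String, String>> find(String rowKey);
        void clear();
    }

    /** In-memory stand-in for tests. It performs no schema validation at all. */
    static final class InMemoryUserStore implements UserStore {
        private final Map<String, Map<String, String>> rows = new ConcurrentHashMap<>();

        @Override
        public void save(String rowKey, Map<String, String> columns) {
            // Merge the columns into the existing row instead of replacing the row.
            rows.computeIfAbsent(rowKey, k -> new ConcurrentHashMap<>()).putAll(columns);
        }

        @Override
        public Optional<Map<String, String>> find(String rowKey) {
            Map<String, String> row = rows.get(rowKey);
            if (row == null) {
                return Optional.empty();
            }
            // Return a copy so callers cannot modify the stored row.
            Map<String, String> copy = new HashMap<>(row);
            return Optional.of(copy);
        }

        @Override
        public void clear() {
            rows.clear();
        }
    }

    public static void main(String[] args) {
        UserStore store = new InMemoryUserStore();
        Map<String, String> columns = new HashMap<>();
        columns.put("email", "alice@example.com");
        store.save("alice", columns);
        System.out.println(store.find("alice").isPresent());
        store.clear(); // what an @After method would call between test cases
        System.out.println(store.find("alice").isPresent());
    }
}
```

A fake like this makes isolation trivial, because clear() really does empty it, but it accepts any row key and column name. That is exactly the trade-off described above: mismatches with the real column family definitions would go unnoticed, so some tests against a real or embedded Cassandra would still be needed.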
Please let me know your thoughts on this, and whether anyone who has used EmbeddedServerHelper is familiar with the issues I have mentioned.
Update: I was able to resolve issue 2), running out of Java heap space during builds, by lowering the in_memory_compaction_limit_in_mb parameter to 32 in the cassandra.yaml used by the embedded test server. It was previously 64, and the build had started to fail consistently during compaction. This link helped me: http://www.datastax.com/docs/0.7/configuration/storage_configuration#in-memory-compaction-limit-in-mb.
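Issue 1) is not addressed by that change. One possible workaround, sketched below, is to stop relying on cleanup and instead give every test case its own row-key prefix, so leftover rows from other test cases cannot collide with it. TestKeys is a hypothetical helper, not part of Hector or EmbeddedServerHelper, and the sketch assumes the tests read rows back by key rather than scanning the whole column family.

```java
import java.util.UUID;

/** Hypothetical test helper: builds row keys that are unique to one test case. */
final class TestKeys {
    private final String prefix;

    TestKeys(String testName) {
        // A random suffix keeps keys distinct even if data from an earlier run survives.
        String random = UUID.randomUUID().toString().substring(0, 8);
        this.prefix = testName + "-" + random + ":";
    }

    /** Returns the logical key namespaced with this test case's prefix. */
    String key(String logicalKey) {
        return prefix + logicalKey;
    }

    String prefix() {
        return prefix;
    }

    public static void main(String[] args) {
        TestKeys savesUser = new TestKeys("savesUser");
        TestKeys updatesUser = new TestKeys("updatesUser");
        // The same logical key maps to different row keys in different test cases.
        System.out.println(savesUser.key("alice"));
        System.out.println(updatesUser.key("alice"));
    }
}
```

Creating a new TestKeys in a @Before method would give each test case a fresh prefix. This only hides stale data; it does not delete it, so it does nothing for memory or disk usage, and any test that performs range scans would still see rows written by other test cases.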