In one of my previous assignments we had hundreds of integration tests involving data sets, though not in DBUnit: the test environment was written from scratch, as it was A Very Big Company That Can Afford This Kind Of Stuff.
The data sets were organized hierarchically. The system under test consisted of a few (5-10) modules and the test data followed that pattern. A unit test script looked like this:
include(../../masterDataSet.txt)
include(../moduleDataSet.txt)
# unit-specific test data
someProperty=someData
The property names were mapped directly to DB records by some bizarre tool I can't remember.
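That tool is gone, but the include(...) mechanism itself is easy to reconstruct. Here is a minimal sketch of how such hierarchical property files could be flattened into a single set, with more specific files overriding inherited entries (the class name and file format are my own, not from the original tool):

```java
import java.io.*;
import java.util.*;

public class DataSetLoader {

    // Reads a data set file, recursively resolving include(...) lines.
    // Properties read later override earlier ones, so a test-level file
    // that includes its parents first can override inherited records.
    public static Properties load(File file) throws IOException {
        Properties result = new Properties();
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) {
                    continue; // skip blanks and comments
                }
                if (line.startsWith("include(") && line.endsWith(")")) {
                    String path = line.substring("include(".length(), line.length() - 1);
                    // includes are resolved relative to the current file
                    result.putAll(load(new File(file.getParentFile(), path)));
                } else {
                    int eq = line.indexOf('=');
                    if (eq > 0) {
                        result.setProperty(line.substring(0, eq), line.substring(eq + 1));
                    }
                }
            }
        }
        return result;
    }
}
```

The point of the recursion is that a unit-level file stays tiny: it names its parents once and then lists only its own overrides.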
The same pattern can be applied to DBUnit tests. In the master data set you'd place records that always need to be present: dictionaries, the initial load of the database, as if it were installed from scratch.
In the module data set you'd put records covering the test cases of a majority of tests in a module; I don't suppose an average test of yours involves all 70 of your database tables, does it? You surely have some functionality groups that could constitute modules, even if the application is monolithic. Try to organize module-level test data around them.
Finally, at the test level, you'd only amend the data set with the minimal number of records needed for that particular test.
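In DBUnit's FlatXML format, the three levels might look something like this (file names, tables, and columns are purely illustrative):

```xml
<!-- masterDataSet.xml: dictionaries and records every test needs -->
<dataset>
  <country id="1" code="PL" name="Poland"/>
  <currency id="1" code="EUR"/>
</dataset>

<!-- moduleDataSet.xml: records shared by most tests in this module -->
<dataset>
  <customer id="1" name="ACME" country_id="1"/>
</dataset>

<!-- someTest.xml: the minimal extra records for one particular test -->
<dataset>
  <order id="1" customer_id="1" currency_id="1" amount="12.50"/>
</dataset>
```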
This approach has an enormous learning benefit: because there are few data files, in time you actually begin to memorize them. Instead of seeing hundreds of big data sets that differ only in details you can barely notice (which you have to find out each time you come back to a test after a while), you can easily tell how any two data sets differ.
Finally, a word on performance. On my 2.4 GHz 2-core WinXP machine, a DBUnit test involving:
- dropping 14 tables,
- creating 14 tables,
- inserting ca. 100 records,
- performing the test logic,
takes 1-3 seconds. Logs show that the first three operations take less than a second; most of the test time is consumed by Spring. This setup is performed by each test to avoid test-order dependencies. Everything runs in one VM with embedded Derby, which is probably why it's so fast.
EDIT: I don't think DBUnit XML data sets support including other files, but this can be overcome with a base class for all integration tests, e.g.:
public abstract class AbstractITest {

    // dbUnitConnection (IDatabaseConnection) is assumed to be set up elsewhere
    @Before
    public void setUp() throws Exception {
        //
        // drop and recreate tables here if needed; we use
        // Spring's SimpleJdbcTemplate executing drop/create SQL
        //
        IDataSet masterDataSet = new FlatXmlDataSetBuilder()
                .build(new File("masterDataSet.xml"));
        DatabaseOperation.CLEAN_INSERT.execute(dbUnitConnection, masterDataSet);
    }
}
public abstract class AbstractModuleITest extends AbstractITest {

    @Before
    public void setUp() throws Exception {
        super.setUp();
        IDataSet moduleDataSet = new FlatXmlDataSetBuilder()
                .build(new File("moduleDataSet.xml"));
        DatabaseOperation.CLEAN_INSERT.execute(dbUnitConnection, moduleDataSet);
    }
}
public class SomeITest extends AbstractModuleITest {

    // Override setUp() here only if this test needs extra data;
    // remember to call super.setUp().

    @Test
    public void someTest() { ... }
}