JDBC Batch Insert OutOfMemoryError
I have written a method insert() in which I am trying to use JDBC Batch for inserting half a million records into a MySQL database:

public void insert(int nameListId, String[] names) {
    String sql = "INSERT INTO name_list_subscribers (name_list_id, name, date_added)" + 
        " VALUES (?, ?, NOW())";
    Connection conn = null;
    PreparedStatement ps = null;

    try {
        conn = getConnection();
        ps = conn.prepareStatement(sql);

        for (String s : names ) {
            ps.setInt(1, nameListId); 
            ps.setString(2, s);
            ps.addBatch();
        }

        ps.executeBatch();

    } catch (SQLException e) {
        throw new RuntimeException(e);
    } finally {
        closeDbResources(ps, null, conn);
    }
}

But whenever I try to run this method, I get the following error:

java.lang.OutOfMemoryError: Java heap space
    com.mysql.jdbc.ServerPreparedStatement$BatchedBindValues.<init>(ServerPreparedStatement.java:72)
    com.mysql.jdbc.ServerPreparedStatement.addBatch(ServerPreparedStatement.java:330)
    org.apache.commons.dbcp.DelegatingPreparedStatement.addBatch(DelegatingPreparedStatement.java:171)

If I replace ps.addBatch() with ps.executeUpdate() and remove ps.executeBatch(), it works fine, though it takes some time. Is using a batch appropriate in this situation, and if it is, why does it throw an OutOfMemoryError?

Thanks

Vanthe answered 9/2, 2010 at 8:5 Comment(0)
addBatch and executeBatch give you the mechanism to perform batch inserts, but you still need to do the batching algorithm yourself.

If you simply pile every statement into the same batch, as you are doing, then you'll run out of memory. You need to execute/clear the batch every n records. The value of n is up to you, JDBC can't make that decision for you. The larger the batch size, the faster things will go, but too large and you'll get memory starvation and things will slow down or fail. It depends how much memory you have.

Start off with a batch size of 1000, for example, and experiment with different values from there.

final int batchSize = 1000;
int count = 0;
for(String s : names ) {
   ps.setInt(1, nameListId); 
   ps.setString(2, s);
   ps.addBatch();

   if (++count % batchSize == 0) {
      ps.executeBatch();
      ps.clearBatch(); //not sure if this is necessary
   }
}
ps.executeBatch();   // flush the last few records.
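
The flush-every-n counting logic above can be checked without a database. The sketch below (the class and method names are made up for illustration) replays the same modulo condition over a record count and tallies how many executeBatch() calls would be issued, including the final flush for a trailing partial chunk:

```java
public class BatchingDemo {
    // Returns the number of executeBatch() calls the chunked loop performs
    // for the given record count and batch size, including the final flush
    // for any partial chunk left over at the end.
    static int countFlushes(int totalRecords, int batchSize) {
        int count = 0;
        int flushes = 0;
        for (int i = 0; i < totalRecords; i++) {
            // per record: ps.setInt(...); ps.setString(...); ps.addBatch();
            if (++count % batchSize == 0) {
                flushes++;            // stands in for ps.executeBatch()
            }
        }
        if (count % batchSize != 0) {
            flushes++;                // final ps.executeBatch() for the remainder
        }
        return flushes;
    }

    public static void main(String[] args) {
        System.out.println(countFlushes(500_000, 1000)); // 500 flushes, no remainder
        System.out.println(countFlushes(500_500, 1000)); // 500 full flushes + 1 partial
    }
}
```

As an aside, MySQL Connector/J also has a rewriteBatchedStatements connection property that can rewrite a batch into multi-row INSERT statements; it is worth experimenting with alongside the batch size.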
Kathe answered 9/2, 2010 at 8:34 Comment(3)
Will this work within a transaction as well, allowing for rollback of all executed batches if something goes wrong? – Illogic
I know this is a bit late, but I tried using clearBatch() and it doesn't seem to help; it still gives an OOM. Is there a better way to tackle this? – Micropyle
You don't need ps.clearBatch() with the MySQL JDBC driver. It is called internally by each ps.executeBatch() in StatementImpl.executeBatchInternal(). – Oyster
It runs out of memory because it holds the entire batch in memory and only sends it over to the database when you call executeBatch.

If you don't need the insert to be atomic and would like to get better performance, you can keep a counter and call executeBatch every n records.
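
One way to keep that counter implicit is to slice the input into fixed-size chunks up front, with one chunk per executeBatch() call. A minimal, database-free sketch (the chunks helper is hypothetical, not part of JDBC):

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkDemo {
    // Split the input into chunks of at most n elements; each chunk would
    // map to one addBatch()-loop followed by a single executeBatch() call.
    static <T> List<List<T>> chunks(List<T> items, int n) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += n) {
            out.add(items.subList(i, Math.min(i + n, items.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) data.add(i);
        System.out.println(chunks(data, 4).size()); // 3 chunks: sizes 4, 4, 2
    }
}
```

Note that subList returns a view backed by the original list, which is fine here since the chunks are only read while building each batch.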

Ribonuclease answered 9/2, 2010 at 8:18 Comment(3)
And what should be the value of n? – Vanthe
The value is up to you; you have to benchmark your application to find the best trade-off between memory and performance. – Ribonuclease
I think it will be atomic anyway (within the same transaction). – Oyster
