Do not update row in ResultSet if data has changed
We are extracting data from various database types (Oracle, MySQL, SQL Server, ...). Once the data has been successfully written to a file, we want to mark it as transmitted, so we update a specific column.

Our problem is that a user can change the data in the meantime but might forget to commit. The record is then blocked by the select-for-update statement, so it can happen that we mark something as transmitted that actually is not.

This is an excerpt from our code:

Statement stmt = conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
ResultSet extractedData = stmt.executeQuery(sql);
writeDataToFile(extractedData);
extractedData.beforeFirst();
while (extractedData.next()) {
    if (!extractedData.rowUpdated()) {
        extractedData.updateString("COLUMNNAME", "TRANSMITTED");
        // code will stop here if user has changed data but did not commit
        extractedData.updateRow();
        // once committed the changed data is marked as transmitted
    }
}

The method extractedData.rowUpdated() returns false, because technically the user hasn't changed anything yet. Is there any way to detect at this late stage whether the data was changed, and in that case not update the row?

Unfortunately I cannot change the program the user is using to change the data.

Rocker answered 11/2, 2016 at 12:2 Comment(9)
This feels like an issue with transaction isolation. I guess if you'd configure your connection for uncommitted-read you might see the user's change. Although this would then also depend on when the row you look at has been fetched, which might be anytime in your cursor operation.Shalne
Good idea, but I guess that would also show changes which are possibly discarded later on. Is there maybe a way that I could detect a lock from another user? For example if record locked then skip it?Rocker
As transaction is neither committed nor rolled back you wouldn't see the difference. How about you lock the rows for export?Shalne
@Shalne Can't do that unfortunately. It is possible that the export takes hours and I can't lock the table or record that long.Rocker
Use some form of optimistic locking instead of doing if (not updated) then {update} logic.Tripedal
I removed the if clause, because it could not detect committed changes from outside of my program. It seems to only detect changes made to the record through the result set object itself. I have looked at optimistic locking and currently don't see a way to implement it without adding an additional column. I'll see if I can use the current column.Rocker
update to the above comment: rowUpdated() returned false, because during testing I used an Oracle database: "The driver (Oracle JDBC) will not report any changes made by another committed transaction. Any conflicts are silently ignored and your changes will overwrite the previous changes." I guess the only solution would be a lock as @Shalne suggested.Rocker
What is your major pain in this? A) exporting values that are no longer valid, B) not exporting changed values as that row has been marked "okay" by your run, C) something else?Shalne
@Shalne that would be BRocker

So you want to

  • Run through all rows of the table that have not been exported
  • Export this data somewhere
  • Mark these rows exported so your next iteration will not export them again
  • As there might be pending changes on a row, you don't want to mess with that information

How about:

You iterate over all rows. 

for every row 
   generate a hash value for the contents of the row
   compare column "UPDATE_STATUS" with calculated hash
   if no match
     export row
     store hash into "UPDATE_STATUS" 
      if store fails (row locked) 
         -> no worries, will be exported again next time
      if store succeeds (on data already changed by user) 
         -> no worries, will be exported again as hash will not match

This might further slow your export, as you'll have to iterate over everything instead of only over everything WHERE UPDATE_STATUS IS NULL. But you might be able to run two jobs: one fast job iterating over WHERE UPDATE_STATUS IS NULL, and one slow, thorough job over WHERE UPDATE_STATUS IS NOT NULL (with the hash-rechecking in place).

If you want to avoid store failures/waits, you might want to store the hash/updated information in a second table that copies the primary key plus the hash field value. That way, user locks on the main table would not interfere with your updates at all, as those would be on another table.
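The hash step of the loop above could be sketched in Java roughly like this. Everything here is an assumption for illustration: the column order passed to rowHash, and the MYTABLE/UPDATE_STATUS names mentioned in the comments, are placeholders for your actual schema, not part of the question.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RowHashExport {

    // Hash all column values of one row into a hex string short enough
    // to be stored in the UPDATE_STATUS column.
    static String rowHash(String... columnValues) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String value : columnValues) {
                md.update((value == null ? "\u0000" : value)
                        .getBytes(StandardCharsets.UTF_8));
                // separator byte so ("ab","c") and ("a","bc") don't collide
                md.update((byte) 0x1F);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hash = rowHash("42", "Smith", "2016-02-19");
        // Same data -> same hash; any change -> different hash.
        System.out.println(hash.equals(rowHash("42", "Smith", "2016-02-19"))); // true
        System.out.println(hash.equals(rowHash("42", "Smith", "2016-02-20"))); // false
        // The export job would then run something like (hypothetical names):
        //   UPDATE MYTABLE SET UPDATE_STATUS = ?
        //   WHERE ID = ? AND (UPDATE_STATUS IS NULL OR UPDATE_STATUS <> ?)
        // binding the new hash twice, so a row whose data changed since the
        // last export never matches and gets exported again on the next run.
    }
}
```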

Shalne answered 19/2, 2016 at 12:56 Comment(3)
Thanks a lot for your time. An issue might be that the store never fails on a lock. Is there a way to detect a lock in JDBC, or even to specify a wait timeout? I'll give it a try though. I'll have to talk to my boss, as this involves changing hundreds of tables on hundreds of databases :)Rocker
If you put the calculated hash values into one separate table (sourcetable, key, hash), you might get away without changing what's there.Shalne
I will try this and post the final solution in my question, once done.Rocker

"a user [...] might forget to commit" > A user either commits or he doesn't. "Forgetting" to commit is tantamount to a bug in his software.

To work around that you need to either:

  • Start a transaction with isolation level SERIALIZABLE, and within that transaction:
    • Read the data and export it. Data read this way is blocked from being updated.
    • Update the data you processed. Note: don't do that with an updatable ResultSet; do it with an UPDATE statement. That way you don't need CONCUR_UPDATABLE + TYPE_SCROLL_SENSITIVE, which is much slower than CONCUR_READ_ONLY + TYPE_FORWARD_ONLY.
  • Commit the transaction.

That way the buggy software will be blocked from updating data you are processing.
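A minimal sketch of this first approach, assuming hypothetical table/column names (MYTABLE, STATUS, ID) and a plain forward-only, read-only statement; the actual file-writing logic is stubbed out:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SerializableExport {

    // Hypothetical SQL; table and column names are placeholders.
    static final String SELECT_SQL =
            "SELECT ID, PAYLOAD FROM MYTABLE WHERE STATUS IS NULL";
    static final String UPDATE_SQL =
            "UPDATE MYTABLE SET STATUS = 'TRANSMITTED' WHERE ID = ?";

    static void exportAndMark(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        try (PreparedStatement update = conn.prepareStatement(UPDATE_SQL);
             // forward-only, read-only statement is enough for the export
             Statement select = conn.createStatement();
             ResultSet rows = select.executeQuery(SELECT_SQL)) {
            while (rows.next()) {
                writeRowToFile(rows);                 // export first ...
                update.setLong(1, rows.getLong("ID"));
                update.addBatch();                    // ... then mark via plain UPDATE
            }
            update.executeBatch();
            conn.commit();                            // releases the serializable locks
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }

    static void writeRowToFile(ResultSet row) throws SQLException {
        // placeholder for the existing writeDataToFile logic
    }
}
```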

Another way

  • Start a TRANSACTION at a lower isolation level (default READ COMMITTED) and within that transaction
    • Select the data with proper table hints, e.g. for SQL Server: TABLOCKX + HOLDLOCK (large datasets), ROWLOCK + XLOCK + HOLDLOCK (small datasets), or PAGLOCK + XLOCK + HOLDLOCK. Having HOLDLOCK as a table hint is practically equivalent to a SERIALIZABLE transaction. Note that lock escalation may promote the latter two to table locks if the number of locks becomes too high.
    • Update the data you processed. Note: use an UPDATE statement; lose the updatable/scroll-sensitive result set.
  • Commit the TRANSACTION.

Same deal, the buggy software will be blocked from updating data you are processing.
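For illustration, the hinted selects might look like this; MYTABLE and STATUS are placeholder names, while the hints themselves are SQL Server syntax:

```java
public class HintedSelect {

    // XLOCK takes exclusive locks; HOLDLOCK keeps them until COMMIT,
    // giving serializable semantics for the touched rows only.
    static final String SMALL_DATASET_SELECT =
            "SELECT ID, PAYLOAD FROM MYTABLE WITH (ROWLOCK, XLOCK, HOLDLOCK)"
            + " WHERE STATUS IS NULL";

    // TABLOCKX locks the whole table exclusively; cheaper for large scans.
    static final String LARGE_DATASET_SELECT =
            "SELECT ID, PAYLOAD FROM MYTABLE WITH (TABLOCKX, HOLDLOCK)"
            + " WHERE STATUS IS NULL";

    public static void main(String[] args) {
        System.out.println(SMALL_DATASET_SELECT);
        System.out.println(LARGE_DATASET_SELECT);
    }
}
```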

Pugh answered 19/2, 2016 at 19:25 Comment(0)

In the end we had to implement optimistic locking. In some tables we already have a column that stores the version number. Some other tables have a timestamp column that holds the time of the last change (changed by trigger).

While a timestamp might not always be a reliable source for optimistic locking we went with it anyway. Several changes during a single second are not very realistic in our environment.
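The heart of this is an UPDATE whose WHERE clause includes the version (or timestamp) column, so the update affects zero rows when someone changed the data in between. A minimal sketch of the statement construction, with hypothetical table/column names (markDataAsExported below builds the same shape from the database metadata):

```java
import java.util.List;

public class VersionedUpdateSql {

    // Builds: update <table> set <col> = ? where pk1 = ? and ... and <version> = ?
    static String buildUpdateSql(String table, String updateColumn,
                                 List<String> pkColumns, String versionColumn) {
        StringBuilder sql = new StringBuilder("update ").append(table)
                .append(" set ").append(updateColumn).append(" = ? where ");
        for (String pk : pkColumns) {
            sql.append(pk).append(" = ? and ");
        }
        sql.append(versionColumn).append(" = ?");
        return sql.toString();
    }

    public static void main(String[] args) {
        // executeUpdate() returning 0 then means: the row was changed (or
        // deleted) since it was read -> log a warning, export it again next run
        System.out.println(buildUpdateSql("INVOICE", "EXPORT_STATUS",
                List.of("CLIENT_ID", "INVOICE_NO"), "LAST_CHANGED"));
    }
}
```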

Since we have to know the primary key without describing it beforehand, we had to access the database metadata. Some of our databases do not support this (legacy DB/2 tables, for example). For these we are still using the old system.

Note: The tableMetaData is an XML-config file where our description of the table is stored. This is not directly related to the metadata of the table in the database.

Statement stmt = conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
ResultSet extractedData = stmt.executeQuery(sql);
writeDataToFile(extractedData);
extractedData.beforeFirst();
// both helper methods iterate over the result set themselves, so the
// version check is done once here; an outer while (extractedData.next())
// would silently consume the first row before the helpers ever saw it
if (tableMetaData.getVersion() != null) {
    markDataAsExported(extractedData, tableMetaData);
} else {
    markResultSetAsExported(extractedData, tableMetaData);
}

// new way with building of an update statement including the version column in the where clause
private void markDataAsExported(ResultSet extractedData, TableMetaData tableMetaData) throws SQLException {
    ResultSet resultSetPrimaryKeys = null;
    PreparedStatement versionedUpdateStatement = null;
    try {
        ResultSetMetaData extractedMetaData = extractedData.getMetaData();
        resultSetPrimaryKeys = conn.getMetaData().getPrimaryKeys(null, null, tableMetaData.getTable());
        ArrayList<String> primaryKeyList = new ArrayList<String>();
        String sqlStatement = "update " + tableMetaData.getTable() + " set " + tableMetaData.getUpdateColumn()
                + " = ? where ";
        if (resultSetPrimaryKeys.isBeforeFirst()) {
            while (resultSetPrimaryKeys.next()) {
                // column 4 of DatabaseMetaData.getPrimaryKeys() is COLUMN_NAME
                primaryKeyList.add(resultSetPrimaryKeys.getString(4));
                sqlStatement += resultSetPrimaryKeys.getString(4) + " = ? and ";
            }
            sqlStatement += tableMetaData.getVersionColumn() + " = ?";
            versionedUpdateStatement = conn.prepareStatement(sqlStatement);
            while (extractedData.next()) {
                versionedUpdateStatement.setString(1, tableMetaData.getUpdateValue());
                for (int i = 0; i < primaryKeyList.size(); i++) {
                    versionedUpdateStatement.setObject(i + 2, extractedData.getObject(primaryKeyList.get(i)),
                            extractedMetaData.getColumnType(extractedData.findColumn(primaryKeyList.get(i))));
                }
                versionedUpdateStatement.setObject(primaryKeyList.size() + 2,
                        extractedData.getObject(tableMetaData.getVersionColumn()), tableMetaData.getVersionType());
                if (versionedUpdateStatement.executeUpdate() == 0) {
                    logger.warn(Message.COLLECTOR_DATA_CHANGED, tableMetaData.getTable());
                }
            }
        } else {
            logger.warn(Message.COLLECTOR_PK_ERROR, tableMetaData.getTable());
            markResultSetAsExported(extractedData, tableMetaData);
        }
    } finally {
        if (resultSetPrimaryKeys != null) {
            resultSetPrimaryKeys.close();
        }
        if (versionedUpdateStatement != null) {
            versionedUpdateStatement.close();
        }
    }
}

//the old way as fallback
private void markResultSetAsExported(ResultSet extractedData, TableMetaData tableMetaData) throws SQLException {
    while (extractedData.next()) {
        extractedData.updateString(tableMetaData.getUpdateColumn(), tableMetaData.getUpdateValue());
        extractedData.updateRow();
    }
}
Rocker answered 25/2, 2016 at 13:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.