Concurrent reading and updating in a database table

Asked 27/2, 2013 at 15:36 Answered 27/2, 2013 at 16:10

Solved c#oracle entity-framework transactions devart

I have an Oracle database that I access using Devart and Entity Framework.

There's a table called IMPORTJOBS with a column STATUS.

I also have multiple processes running at the same time. They each read the first row in IMPORTJOBS that has status 'REGISTERED', put it to status 'EXECUTING', and if done put it to status 'EXECUTED'.

Now because these processes are running in parallel, I believe the following could happen:

process A reads row 10 which has status REGISTERED,
process B also reads row 10 which has still status REGISTERED,
process A updates row 10 to status EXECUTING.

Process B should not be able to read row 10 as process A already read it and is going to update its status.

How should I solve this? Put read and update in a transaction? Or should I use some versioning approach or something else?

Thanks!

EDIT: thanks to the accepted answer I got it working and documented it here: http://ludwigstuyck.wordpress.com/2013/02/28/concurrent-reading-and-writing-in-an-oracle-database.

Farfamed answered 27/2, 2013 at 15:36 Comment(0)

You should use the built-in locking mechanisms of the database. Don't reinvent the wheel, especially since RDBMS are designed to deal with concurrency and consistency.

In Oracle 11g, I suggest you use the SKIP LOCKED feature. For example each process could call a function like this (assuming id are number):

CREATE OR REPLACE TYPE tab_number IS TABLE OF NUMBER;

CREATE OR REPLACE FUNCTION reserve_jobs RETURN tab_number IS
   CURSOR c IS 
      SELECT id FROM IMPORTJOBS WHERE STATUS = 'REGISTERED'
      FOR UPDATE SKIP LOCKED;
   l_result tab_number := tab_number();
   l_id number;
BEGIN
   OPEN c;
   FOR i IN 1..10 LOOP
      FETCH c INTO l_id;
      EXIT WHEN c%NOTFOUND;
      l_result.extend;
      l_result(l_result.size) := l_id;
   END LOOP;
   CLOSE c;
   RETURN l_result;
END;

This will return 10 rows (if possible) that are not locked. These rows will be locked and the sessions will not block each other.

In 10g and before since Oracle returns consistent results, use FOR UPDATE wisely and you should not have the problem that you describe. For instance consider the following SELECT:

SELECT *
  FROM IMPORTJOBS 
 WHERE STATUS = 'REGISTERED'
   AND rownum <= 10
FOR UPDATE;

What would happen if all processes reserve their rows with this SELECT? How will that affect your scenario:

Session A gets 10 rows that are not processed.
Session B would get the same 10 rows, is blocked and waits for session A.
Session A updates the selected rows' statuses and commits its transaction.
Oracle will now (automatically) rerun Session B's select from the beginning since the data has been modified and we have specified FOR UPDATE (this clause forces Oracle to get the last version of the block).
This means that session B will get 10 new rows.

So in this scenario, you have no consistency problem. Also, assuming that the transaction to request a row and change its status is fast, the concurrency impact will be light.

Josephus answered 27/2, 2013 at 16:10 Comment(3)

thank you, I'm trying to execute "SELECT * FROM IMPORTJOBS WHERE STATUSCODE = 'REGISTERED' AND ROWNUM <= 1 FOR UPDATE SKIP LOCKED", but it's still returning the same row from the different processes? – Farfamed 28/2, 2013 at 8:40

(1) make sure that you have turned autocommit off: you can't lock a row without a transaction. (2) FOR UPDATE SKIP LOCKED and rownum won't work as you expect -- this is because the SKIP LOCKED is evaluated after the WHERE clause. Use a select without rownum, fetch one (or more as needed) row and close the cursor, that's the best way to use SKIP LOCKED. – Josephus 28/2, 2013 at 8:49

indeed, I had to put the select and update in a transaction, now it works. Thanks!!! – Farfamed 28/2, 2013 at 9:12

Each process can issue a SELECT ... FOR UPDATE to lock the row when they read it. In this scenario, process A will read and lock the row, process B will attempt to read the row and block until process A releases the lock by committing (or rolling back) its transaction. Oracle will then determine whether the row still meets B's criteria and, in your example, won't return the row to B. This works but it means that your multi-threaded process may now be effectively single-threaded depending on how your transaction control needs to work.

Possible ways to improve scalability

A relatively common approach on the consumer to resolving this is to have a single coordinator thread that reads the data from the table, parcels out work to different threads, and updates the table appropriately (including knowing how to re-assign a job if the thread that was assigned it has died).
If you are using Oracle 11.1 or later, you can use the SKIP LOCKED clause on your FOR UPDATE so that each session gets back the first row that meets their criteria and is not locked (the clause existed in earlier versions but was not documented so it may not work correctly).
Rather than using a table for ImportJobs, you can use a queue with multiple consumers. This will allow Oracle to distribute messages to each process without you needing to build any additional locking (Oracle queues are doing it all behind the scenes).

Sleety answered 27/2, 2013 at 16:1 Comment(0)

Use versioning and optimistic concurrency.

The IMPORTJOBS table should have a timestamp column that you mark as ConcurrencyMode = Fixed in your model. Now when EF tries to do an update the timestamp column is incorporated in the update statement: WHERE timestamp = xxxxx.

For B, the timestamp changed in the mean time, so a concurrency exception is raised, which, in this case, you handle by skipping the update.

I'm from a SQL server background and I don't know the Oracle equivalent of timestamp (or rowversion), but the idea is that it's a field that auto-updates when an update is made to a record.

Calabro answered 27/2, 2013 at 16:2 Comment(0)

Recommended topics

Hot tags