Optimistic Locking by concrete (Java) example
Asked Answered
L

2

42

I have spent my morning reading all the top articles that Google churns up on optimistic locking, and for the life of me, I still don't really get it.

I understand that optimistic locking involves the addition of a column for tracking the record's "version", and that this column can be a timestamp, a counter, or any other version-tracking construct. But I'm still not understanding how that ensures WRITE integrity (meaning that if multiple process are updating the same entity at the same time, that afterwards, the entity correctly reflects the true state it should be in).

Can someone provide a concrete, easy-to-understand example of how optimistic locking could be used in Java (against, perhaps, a MySQL DB). Let's say we have a Person entity:

public class Person {
    private String firstName;
    private String lastName;
    private int age;
    private Color favoriteColor;
}

And that Person instances get persisted to a people MySQL table:

CREATE TABLE people (
    person_id PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(100) NOT NULL,
    last_name VARCHAR(100) NOT NULL,        # } I realize these column defs are not valid but this is just pseudo-code
    age INT NOT NULL,
    color_id FOREIGN KEY (colors) NOT NULL  # Say we also have a colors table and people has a 1:1 relationship with it
);

Now let's say there are 2 software systems, or 1 system with 2 threads on it, that are trying to update the same Person entity at the same time:

  • Software/Thread #1 is trying to persist a surname change (from "John Smith" to "John Doe")
  • Software/Thread #2 is trying to persist a change in the favorite color (from RED to GREEN)

My questions:

  1. How could optimistic locking be implemented on the people and/or colors tables? (Looking for specific DDL example)
  2. How could you then utilize this optimistic locking at the application/Java layer? (Looking for specific code example)
  3. Can someone run me through a scenario where the DDL/code changes (from #1 and #2 above) would come into play in my scenario (or any other scenario) and would "optimistically lock" the people/colors tables correctly? Basically, I'm looking to see optimistic locking in action, with an easy-to-follow explanation of why it works.
Limy answered 14/1, 2014 at 17:31 Comment(1)
What do you want to lock, the java objects in memory or the entries in your database?Beabeach
M
50

Normally when you look into optimistic locking you also use a library like Hibernate or an other JPA-Implementation with @Version support.

Example could read like this:

public class Person {
    private String firstName;
    private String lastName;
    private int age;
    private Color favoriteColor;
    @Version
    private Long version;
}

while obviously there is no point of adding a @Version annotation if you are not using a framework which supports this.

The DDL could then be

CREATE TABLE people (
    person_id PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(100) NOT NULL,
    last_name VARCHAR(100) NOT NULL,        # } I realize these column defs are not valid but this is just pseudo-code
    age INT NOT NULL,
    color_id FOREIGN KEY (colors) NOT NULL,  # Say we also have a colors table and people has a 1:1 relationship with it
    version BIGINT NOT NULL
);

What happens with the version?

  1. Every time before you store the entity, you check if the version stored in the database is still the version you know.
  2. If it is, store your data with version incremented by one

To get both steps done without risking an other process changing data between both steps it is normally handled through a statement like

UPDATE Person SET lastName = 'married', version=2 WHERE person_id = 42 AND version = 1;

After executing the statement you check if you updated a row or not. If you did, nobody else changed the data since you've read it, otherwise somebody else changed the data. If somebody else changed the data you will normally receive an OptimisticLockException by the library you are using.

This exception should cause all changes to be revoked and the process of changing the value to be restarted as the condition upon which the entity was to be updated may no longer be applicable.

So no collision:

  1. Process A reads Person
  2. Process A writes Person thereby incrementing version
  3. Process B reads Person
  4. Process B writes Person thereby incrementing version

Collision:

  1. Process A reads Person
  2. Process B reads Person
  3. Process A writes Person thereby incrementing version
  4. Process B receives an exception when trying to save as the version changed since Person was read

If Colour is another object you should put a version there by the same scheme.

What isn't Optimistic Locking?

  • Optimistic Locking is no magic to merge conflicting changes. Optimistic Locking will just prevent processes from accidentally overwriting changes by another process.
  • Optimistic Locking actually is no real DB-Lock. It just works by comparing the value of the version column. You don't prevent other processes from accessing any data, so expect that you get OptimisticLockExceptions

What column-type to use as version?

If many different applications access your data you may be best off using a column automatically updated by the database. e.g. for MySQL

version TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;

this way the applications implementing optimistic locking will notice changes by dumb applications.

If you update entities more often than the resolution of TIMESTAMP or the Java-interpretation of it, ths approach can fail to detect certain changes. Also if you let Java generate the new TIMESTAMP you need to ensure that all machines running your applications are in perfect time-sync.

If all of your applications can be altered an integer, long, ... version is normally a good solution as it will never suffer from differently set clocks ;-)

There are other scenarios. You could e.g. use a hash or even randomly generate a String every time a row is to be changed. Important is, that you don't repeat values while any process is holding data for local processing or inside a cache as that process will not be able to detect change by looking at the version-column.

As a last resort you may use the value of all fields as version. While this will be the most expensive approach in most cases it is a way to get similar results without changing the table structure. If you use Hibernate there is the @OptimisticLocking-annotation to enforce this behavior. Use @OptimisticLocking(type = OptimisticLockType.ALL) on the entity-class to fail if any row changed since you have read the entity or @OptimisticLocking(type = OptimisticLockType.DIRTY) to just fail when another process changed the fields you changed, too.

Mucilage answered 14/2, 2014 at 18:43 Comment(7)
Just a small question sir? Let say in highly concurrent systems, is optimistic Locking a good idea? Consider a scenario of 100 threads, all read the same version and one of them updating the version would leave other 99 threads with an exception. That would mean 99% unavailability! How should we dealing with this aspect? Please help.Cantoris
@ShubhamKumar that can depend on what these 100 Threads are doing with the data. Sometimes you can devide the data into smaller pieces, sometimes you can eliminate the need of explicit locking by formulating an sql for the whole update e.g. increasing a counter and sometimes you might need real locks so the updates are just performed one after another. Optimistic locking works best, when usually only one thread will update each versioned entity, as otherwise you can have a thread infinitely retrying.Mucilage
What if all the threads, are trying to update the same versioned entity? Is it a good idea to have optimistic locking for the same?Cantoris
@ShubhamKumar Optimistic Locking will ensure, that threads don't overwrite the changes of another thread. If a thread is prevented to update the entry, it has to read the current state and reaply the modification. But it could fail again as there are up to 98 other threads doing the same. So if it is peak 100, usually 1 this might be ok, otherwise I would advice against it.Mucilage
In case of multiple tables to be updated would it be enough to use optimistic locking for each table + starting a transaction on the dB?Excerpt
@Excerpt that depends on transaction isolation and how the objects and their updates are related. If one process updates only table A and the other only table B at the same time, both can succeed. This may or may not be a problem. You could define a common version to be updated on any change to any of those tables and update this version in a transaction. This of course is less optimistic (i.e. longer, possibly blocking transactions), but can help.Mucilage
Well, I'm talking about tables with a relationships between them. So they will be updated on the same process (given an id).Excerpt
C
4

@TheConstructor: Great explanation of what it is and is not. When you said "Optimistic Locking is no magic to merge conflicting changes" I wanted to comment. I used to manage a DataFlex application which allowed users to edit records on a form. When they pushed the "Save" button, the app would do what was called a "multi-user re-read" of the data - pulling the current values - and compare to what the user had modified. If the fields the user modified had not been changed in the meantime, then just those fields would be written back to the record (which was locked only during the re-read + write operation) and therefore, 2 users could transparently modify different fields on the same record with no issues. It did not require version stamps, just a knowledge of which fields were modified.

Granted, this is not a perfect solution, but it did the job in that case. It was optimistic and it did allow unrelated changes and it gave the user an error on conflicting changes. This was the best that could be done, pre-SQL, but is still a good design principle today, perhaps for more object- or web-related scenarios.

Courser answered 13/11, 2014 at 19:9 Comment(3)
I don't have rep to comment on the actual question so: The example with Person and Color: Person has the Entity ID, so it is "What is being modified", the Colors table is related, not primary (you marked it as Foreign Key), so additions to it would never collide. Apparently you cannot delete rows from it even if a Person changes their Color? You can meaningfully only update one Entity at a time, otherwise you are getting in to the area of Cascaded Updates, etc. I think your model is strained. Make Color a Unique attribute of Person instead?Courser
Sometimes "multi-user re-read" is a good idea, sometimes it is not. E.g. you have a co-worker who Flags users who's mail return, so they can't place orders. Simultaniously the address could be changed, which normally leads to removal of the no-delivery-flag. In this scenario it can be best to reject the second edit and ask the user whether or not the change is valid upon the changed data. This can easily be done through optimistic locking.Mucilage
This is actually a good idea and could be layered on top of optimistic locking with a version field (i.e. if version field unchanged, do the update, if it is changed then see if we're modifying fields which were not modified since we originally pulled the record). The biggest practical challenge I see is that you have to know the prior values, e.g. you're editing v4, you go to save and the record is already at v6 [done and committed in the db], how do you know which fields were changed? Solving is possible but could be tricky. The form would probably have to submit old_value -> new_value.Wite

© 2022 - 2024 — McMap. All rights reserved.