Why does MVCC require locking for DML statements

Asked 30/5, 2015 at 12:30 Answered 16/9, 2023 at 16:44

Solved postgresql transactions locking dml mvcc

In PostgreSQL, the MVCC concurrency control mechanism says that:

MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading

So, even for READ_COMMITTED, an UPDATE statement will lock the currently affected rows so no other transaction can modify them, until the current transaction commits or rolls back.

If a concurrent transaction issues an UPDATE on the locked rows, the second transaction will block until the first one releases it's locks.

Is this behavior trying to prevent the write-write conflicts?
Lost updates can still happen in READ_COMMITTED, as after the first transaction commits, the second one will overwrite the row (even if the database has changed between the UPDATE query start and the query end). So if lost updates are still possible, why does the second transaction have to wait? Couldn't the row-level snapshots be used to store the uncommitted transaction changes to avoid transactions having to wait for write-locks to be released?

Staunch answered 30/5, 2015 at 12:30 Comment(0)

The answer to the first question is Yes. No DBMS can support dirty writes; if two transactions T1 and T2 are concurrently executing and T2 overwrites an update from T1, then the system cannot handle the case where T1 subsequently issues a ROLLBACK since T2's update has already occurred.

To avoid dirty writes, the original definition for snapshot isolation was "first committer wins" - that is, conflicting writes would be allowed to happen, but only the first transaction to issue a COMMIT would be able to - all other conflicting transactions would have to ROLLBACK. But this programming model is somewhat problematic, if not wasteful, since a transaction might update a significant proportion of the database only to be refused the ability to COMMIT at the end. So, instead of "first committer wins" most DBMS systems that support MVCC implement "first updater wins" using fairly traditional two-phase locking.

Orsino answered 30/5, 2015 at 13:59 Comment(3)

But dirty reads can't happen, since we are on READ_COMMITTED. I wad wondering why not update isolated copies and just flush them at commit time. Right now, in PostgreSQL, the 2nd tx simply overwrites the 1st one, but there's also a blocking going on. Why not just use session-scoped snapshots and no blocking at this isolation level. – Staunch 30/5, 2015 at 14:24

I think what you mean by "update isolated copies" will correspond to "first committer wins". And a system could implement that - but the programming model is odd, because every transaction has to be prepared to handle the case where every modification "worked" (ie. it did complete and did not return an error) but the transaction cannot be committed. – Orsino 30/5, 2015 at 15:7

I suppose the locking model is easier to adopt, because every database already has a locking hierarchy. As you mentioned before, it's not just the row data but all indexes too. – Staunch 30/5, 2015 at 16:7

So if lost updates are still possible, why does the second transaction have to wait? Couldn't the row-level snapshots be used to store the uncommitted transaction changes to avoid transactions having to wait for write-locks to be released?

Yes, lost updates are still possible, but to reproduce lost update situation (from non-repeatable read perspective) we need to have interaction between readers and writers from concurrent transactions. But what if we have two concurrent transactions and they both execute queries like:

UPDATE register SET total = total + 100 WHERE id = 1;

So, we have atomic operations where we have no room for interaction between reader and writer and we always expect that last transaction will try to apply its statement to the up-to-date data (and if the data changed but not committed we should wait) even in MVCC and regardless of isolation level. And as a result we get +200 from both transactions in READ COMMITTED level.

But without locking mechanism, we will end up having +100 as a result of both transactions and it will be similar to the lost update of write-write conflict that you mentioned. So writers always block writers to avoid problems like that.

Chicanery answered 16/9, 2023 at 16:44 Comment(0)

Recommended topics

Hot tags