What determines the locking order for a multi-table query?
Asked Answered
D

2

13

Does the SQL standard specify the locking order for a multi-table query?

For example, given:

SELECT department.id FROM permissions, terminals, departments WHERE 
       department.id = ? AND 
       terminal.id = ? AND 
       permissions.parent = department.id AND 
       permissions.child = terminals.id;
  1. Does the SQL standard guarantee a locking order or is it determined by the (implementation-specific) execution plan?
  2. Is there a way to guarantee a locking order?
  3. If there is no way to guarantee locking order, how are we supposed to prevent deadlocks?

UPDATE: Please do not vote to close this issue without explaining your reasoning. As far as I'm concerned, this is a programming question, which makes it very much on-topic for Stackoverflow. If you believe the question needs to be further refined, please explain and I will be more than happy to answer you.

Dardan answered 14/5, 2013 at 20:9 Comment(11)
SELECT queries do not generate locks that result in deadlocks. Can you rephrase your question so it is relevant to a real situation?Realistic
@GordonLinoff, SELECT queries in READ_COMMITTED do generate locks (but for the duration of the statement). For other isolation levels (such as REPEATABLE_READ or SERIALIZABLE) they retain a lock until the end of the transaction. Caveat: some databases use MVCC which does not use any locks, but these are out of scope for this question.Dardan
Locks are an implementation detail. The isolation levels just specify phenomena that can/not occur. In SQL Server select queries at read committed mostly take S locks that are released as soon as the data is read (before the end the statement). Sometimes these locks can be kept until the statement ends however example and other times it doesn't take row level S locks at all. And given point 1 points 2 and 3 are unanswerable unless you specify a particular RDBMS.Valentin
Queries in READ_COMMITTED isolation level use a shared read lock. I expect that databases can acquire all the required locks in a single atomic operation. Commit may or may not prevent reads - I would expect most enterprise databases to be smart enough to avoid thisMinardi
@MartinSmith, please post an answer so we can elaborate on your points in more details.Dardan
So your question is about rdbms not using mvcc. But shouldn't the mere fact that some db use mvcc and other don't be a hint that it's not sql standard defined, and thus purely something to be aware and take care of with the relevant engines?Dissenter
Also, rdbms implementors must be very aware of the locking issue. Certainly engines are implemented to avoid the problem in trivial situations (like simple statements), whereas in transactions, the problem is a bit more complicated.Dissenter
@didierc, not necessarily. The standard could mandate that if locks are used, they must be established in a certain order. MVCC would avoid establishing locks altogether.Dardan
Yeah, I thought about that afterward. Regarding the title of your question, didn't you mean transaction rather than query?Dissenter
@didierc, no. We're talking about multiple transactions here. I'm talking about multiple threads querying the same tables, in a different order, at the same time, thereby causing a deadlock.Dardan
If you consider 2 queries both accessing table A and B, but such that both are composed of a main query on one table, and a subquery on the other table, and that the first uses A as its main table, and the second uses B. These two queries cannot be run together because of lock, and may not necessarily be rewritten to access tables in the same order. How can we ensure that both queries lock the tables in the same order? SQL does not provide the means to do that. It's up to the DB to take care of that.Dissenter
D
5

According to https://mcmap.net/q/282425/-zero-sql-deadlock-by-design-any-coding-patterns lock order is determined by the implementation-specific execution order. The answer further goes on to say that there isn't a deterministic way to prevent deadlocks. Whereas in imperative programming we can prevent deadlocks by acquiring locks in the same order, it seems that in declarative systems we have to work around them by retrying the operation when a deadlock is detected.

Furthermore, I argue that since database execution plans change over their lifetime it is technically impossible to prevent deadlocks.

Dardan answered 14/5, 2013 at 21:22 Comment(4)
Note that the phrase "when you access the tables in the same order" in the answer that you mentioned refers to the order of update statements. The order of tables in a join query, won't have much effect on the probability of a deadlock, because the query only creates S locks. Two transactions running the same query won't lock one another.Access
@nakosspy, two transaction querying the same set of tables but in a different order could cause a deadlock (at least, for a naive DB implementation). In other words, I agree with you that running the same query won't cause a deadlock, but the point is that even SELECT without an UPDATE can cause a deadlock.Dardan
What do you mean a naive database implementation? A DB that would acquire exclusive locks for a query? I don't know any database that does this. However in the case that you have to deal with such a "naive" database, then the major problem would be data integrity and performance, not deadlocks.Access
@nakosspy, my mistake. You are right. I didn't realize that "S locks" meant read locks. Yes, if you only ever SELECT then locking order does not matter (but realistically speaking, everyone needs to INSERT and UPDATE). The minute you introduce even a single statement that uses write-locks it has the potential of triggering a deadlock (meaning, a SELECT might be waiting on locks held by the UPDATE and vice-versa).Dardan
A
1

I can give you an answer for DB2, but I think that this should be similar for other databases as well. First of all, everything depends on the locksize parameter of your tables. This parameter defines what is being locked. You can have locksize = table, page or row. So, depending on locksize of each table, the database will lock the object (table, page or row) that is used to fetch data for the cursor. So the order of locks being created will be specified by the access path, which depends on the optimizer.

Access answered 14/5, 2013 at 20:44 Comment(2)
So you're saying it depends on the implementation-specific execution plan? If that's the case, what about question 3?Dardan
First of all these locks are S locks. This means that no other transaction is allowed to update the pages (or rows) used by the cursor. If you open a cursor and your isolation level is read stability, then nobody else is allowed to modify the data that your cursor reads. Your transaction is allowed to modify these data. If you want to minimize deadlocks then prefer row lock over page or table and set the isolation level to cursor stability.Access

© 2022 - 2024 — McMap. All rights reserved.