MySQL InnoDB SELECT...LIMIT 1 FOR UPDATE Vs UPDATE ... LIMIT 1
Asked Answered
D

1

15

I have a table v_ext in a MySQL with InnoDB engine:
- id: primary key
- code: pre-generated list of codes (say 1000 codes are generated randomly)
- user_id: initially NULL

When a user purchase an item, they receive a code. I need to update the table to populate the user_id column. I have two options:

START TRANSACTION;
SELECT id FROM v_ext WHERE user_id IS NULL LIMIT 1 FOR UPDATE; -- return id 54 for ex.
UPDATE v_ext SET user_id=xxx WHERE id=54;
COMMIT;

or

UPDATE v_ext SET user_id=xxx WHERE user_id IS NULL LIMIT 1;

Is the second option safe if I have thousands of users purchasing at the same time? If so, is it correct to assume this second option is better for performance since it needs only one query?

Downbeat answered 26/12, 2012 at 11:18 Comment(8)
Is there a UNIQUE constraint on user_id? (i.e. can a user only have one code)?Augustusaugy
EXISTS would be a better use in terms of concurrency and performance.Vulcan
UNIQUE is not a constraint, a user_id can get as many code as they purchase itemDownbeat
@Vulcan could you elaborate? Do mean something like UPDATE v_ext SET user_id=xxx WHERE EXISTS (SELECT * FROM v_ext WHERE user_id IS NULL LIMIT 1) ?Downbeat
@Downbeat UPDATE v_ext SET user_id=xxx WHERE EXISTS (SELECT * FROM v_ext WHERE ID = 54 AND user_id IS NULL) but I believe more seasoned experts here could provide you with a better answer to compare which would be best in your case :) +1 for the interesting question focusing on performance.Vulcan
@Vulcan thanks. But I need to query to get the 54 id so it's 2 queries this way, and concurrency issues would raise. I'll try this route anyway when i get back to my officeDownbeat
@Downbeat Send ID as a variable in that case. Eager to know the outcome of this question.Vulcan
@Vulcan Trying both your and my SQL using the EXISTS result in error 1093 You can't specify target table for update in FROM clause. I am going to benchmark my two options and post the resultsDownbeat
D
18

Since I didn't get an answer, I started doing benchmarking. My criteria are as follows:

  • 20,000 pre-generated codes
  • Use of Apache ab command with 20,000 requests, 100 concurrency: ab -n 20000 -c 100
  • Servlet -> EJB (JPA 2.0 EclipseLink, JTA) to perform the update in DB (as it will be through a JSF action in real situation)
  • 2 versions of the Servlet, one with option 1 (SELECT ... FOR UPDATE ), and one with option 2 (UPDATE ... LIMIT 1)
  • Stopped Glassfish, hit the tested Servlet manually 5 times to warm it up, reset all to NULL to user_id
  • Tests are run 3 times each and average is provided

Results:

SELECT ... FOR UPDATE; UPDATE ... :

Concurrency Level:      100
Time taken for tests:   758.116 seconds
Complete requests:      20000
Failed requests:        0
Write errors:           0
Row updated:            20000

UPDATE.... LIMIT 1:

Concurrency Level:      100
Time taken for tests:   773.659 seconds
Complete requests:      20000
Failed requests:        0
Write errors:           0
Row updated:            20000

So at least on my system, the option with 2 queries seems more efficient than the one query. I didn't expect that :)

Downbeat answered 27/12, 2012 at 6:0 Comment(3)
+1 Interesting findings. Did it cause any concurrency issues?Ovarian
Not that I can see. I updated 20,000 rows through 20,000 successful requests.Downbeat
Follow up question with deadlock occurring only when I index user_id actually a varchar column) #14052538Downbeat

© 2022 - 2024 — McMap. All rights reserved.