Race conditions in Django
P

6

51

Here is a simple example of a Django view with a potential race condition:

# myapp/views.py
from django.contrib.auth.models import User
from my_libs import calculate_points

def add_points(request):
    user = request.user
    user.points += calculate_points(user)
    user.save()

The race condition should be fairly obvious: A user can make this request twice, and the application could potentially execute user = request.user simultaneously, causing one of the requests to override the other.
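
To make the interleaving concrete, here is a minimal sketch that replays it deterministically with the standard-library sqlite3 module rather than the Django ORM. The users table, the helper names, and the fixed 10-point award are assumptions made for illustration; they are not part of the view above.

```python
import sqlite3


def make_db():
    """Create an in-memory table with one user who starts at 100 points."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER)")
    conn.execute("INSERT INTO users (id, points) VALUES (1, 100)")
    conn.commit()
    return conn


def read_points(conn, user_id):
    """Stand-in for loading the user object at the start of a request."""
    row = conn.execute("SELECT points FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0]


def write_points(conn, user_id, points):
    """Stand-in for user.save(): writes back an absolute value."""
    conn.execute("UPDATE users SET points = ? WHERE id = ?", (points, user_id))
    conn.commit()


def simulate_lost_update():
    conn = make_db()
    seen_by_a = read_points(conn, 1)  # request A loads the user
    seen_by_b = read_points(conn, 1)  # request B loads it before A has saved
    write_points(conn, 1, seen_by_a + 10)  # A saves 110
    write_points(conn, 1, seen_by_b + 10)  # B also saves 110, overwriting A
    return read_points(conn, 1)
```

Two awards of 10 points leave the user at 110 rather than 120, because the second write is based on a value read before the first write happened.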

Suppose the function calculate_points is relatively complicated, and makes calculations based on all kinds of weird stuff that cannot be placed in a single update and would be difficult to put in a stored procedure.

So here is my question: What kind of locking mechanisms are available in Django to deal with situations like this?

Pilocarpine answered 23/6, 2009 at 1:53 Comment(3)
On first pass, it looks like you need database-level locking on the row in question at that point. I would consult the SQL documentation for your database and send a custom query to do it.Possum
I would prefer a "database-agnostic" solution if it is at all possible.Pilocarpine
@transaction.commit_on_success + QuerySet.select_for_update()Kevenkeverian
H
55

Django 1.4+ supports select_for_update(). In earlier versions you can execute a raw SQL query such as SELECT ... FOR UPDATE, which, depending on the underlying database, locks the row against updates by other transactions. You can then do whatever you want with that row until the end of the transaction, e.g.:

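from django.contrib.auth.models import User
from django.db import transaction
from my_libs import calculate_points

@transaction.commit_manually()
def add_points(request):
    # SELECT ... FOR UPDATE: the row stays locked until the transaction ends
    user = User.objects.select_for_update().get(id=request.user.id)
    # you can go back at this point if something is not right
    if user.points > 1000:
        # too many points: end the transaction (releasing the lock) before returning
        transaction.rollback()
        return
    user.points += calculate_points(user)
    user.save()
    transaction.commit()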
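
The same lock-first, then read-modify-write pattern can be tried outside Django. The following is only a sketch using the standard-library sqlite3 module. SQLite has no SELECT ... FOR UPDATE, so BEGIN IMMEDIATE (which takes the database write lock up front) stands in for the row lock here. The users table, the function names, and the fixed delta are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile
import threading


def add_points_locked(db_path, user_id, delta):
    """Take the write lock first, then read, compute, and write."""
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves;
    # timeout makes a blocked BEGIN IMMEDIATE wait instead of failing at once.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # other writers now wait here
        try:
            (points,) = conn.execute(
                "SELECT points FROM users WHERE id = ?", (user_id,)
            ).fetchone()
            conn.execute(
                "UPDATE users SET points = ? WHERE id = ?", (points + delta, user_id)
            )
            conn.execute("COMMIT")  # releases the lock
        except Exception:
            conn.execute("ROLLBACK")
            raise
    finally:
        conn.close()


def run_demo(workers=8, delta=5):
    """Run several concurrent 'requests' and return the final point total."""
    fd, db_path = tempfile.mkstemp(suffix=".sqlite3")
    os.close(fd)
    try:
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER)")
        setup.execute("INSERT INTO users (id, points) VALUES (1, 0)")
        setup.commit()
        setup.close()

        threads = [
            threading.Thread(target=add_points_locked, args=(db_path, 1, delta))
            for _ in range(workers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        check = sqlite3.connect(db_path)
        total = check.execute("SELECT points FROM users WHERE id = 1").fetchone()[0]
        check.close()
        return total
    finally:
        os.remove(db_path)
```

Because every worker holds the lock from before its read until its commit, no increment is lost: eight workers adding 5 each end at 40.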
Hy answered 11/6, 2012 at 20:41 Comment(4)
Looks like there was a patch for a long time for this feature code.djangoproject.com/ticket/2705 - I recently applied it to Django 1.3.5 (for a large project, which is hard to migrate to 1.4)Snobbery
I'm wondering how this is best implemented as a method of the User class (to be reusable in other places, not just in that view). The problem for me is that the calling code must still make the select_for_update() call, but I'd like it to be encapsulated in the user's method.Acroter
@IvanVirabyan Either add a specific method to the User class, e.g. get_user, or, if you want to be more generic and override all objects queries, write a custom ModelManager.Hy
Note that Django 1.4's select_for_update() will lock rows from all tables in the query (SQL lets you specify a subset of tables) - see groups.google.com/forum/#!topic/django-users/p1qnpz-S9xA. Good article on this approach, written before select_for_update() made it into Django 1.4 - coderanger.net/2011/01/select-for-updateNichy
T
23

As of Django 1.1 you can use the ORM's F() expressions to solve this specific problem.

from django.db.models import F

user = request.user
user.points = F('points') + calculate_points(user)
user.save()

For more details see the documentation:

https://docs.djangoproject.com/en/1.8/ref/models/instances/#updating-attributes-based-on-existing-fields

https://docs.djangoproject.com/en/1.8/ref/models/expressions/#django.db.models.F
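
What F() buys you is that the addition happens inside the UPDATE statement, so the database adds to whatever value the row holds at that moment. A minimal sketch of that idea with the standard-library sqlite3 module (the users table, the helper names, and the 10-point delta are assumptions made for illustration):

```python
import sqlite3


def make_db():
    """Create an in-memory table with one user who starts at 100 points."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER)")
    conn.execute("INSERT INTO users (id, points) VALUES (1, 100)")
    conn.commit()
    return conn


def add_points_atomic(conn, user_id, delta):
    """Send only the delta; the database does the addition itself."""
    conn.execute(
        "UPDATE users SET points = points + ? WHERE id = ?", (delta, user_id)
    )
    conn.commit()


def simulate_two_requests():
    conn = make_db()
    # Neither "request" reads the old value, so neither can write a stale one.
    add_points_atomic(conn, 1, 10)
    add_points_atomic(conn, 1, 10)
    return conn.execute("SELECT points FROM users WHERE id = 1").fetchone()[0]
```

Both awards are kept and the user ends at 120. Note that this only removes the race in the write itself; if the delta is computed from the value being updated, that read can still be stale.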

Topee answered 23/12, 2009 at 22:40 Comment(3)
The F() expressions still don't allow you to add a conditional on the update, so you couldn't, say, increase the user's points only if they are still active.Intrusive
nope...this would fail if you have update inside a for loop!Incommensurable
You also can use F() with update: User.objects.filter(id=user.id).update(points=F('points') + points)Ferdelance
A
8

You have many ways to single-thread this kind of thing.

One standard approach is Update First. You do an update which will seize an exclusive lock on the row; then do your work; and finally commit the change. For this to work, you need to bypass the ORM's caching.

Another standard approach is to have a separate, single-threaded application server that isolates the Web transactions from the complex calculation.

  • Your web application can create a queue of scoring requests, spawn a separate process, and then write the scoring requests to this queue. The spawn can be put in Django's urls.py so it happens on web-app startup. Or it can be put into a separate manage.py admin script. Or it can be done "as needed" when the first scoring request is attempted.

  • You can also create a separate WSGI-flavored web server using Werkzeug which accepts WS requests via urllib2. If you have a single port number for this server, requests are queued by TCP/IP. If your WSGI handler has one thread, then you've achieved serialized single-threading. This is slightly more scalable, since the scoring engine is a WS request and can be run anywhere.
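
The queue idea can be sketched in a few lines. This is an in-process stand-in only: a single worker thread plays the role of the separate scoring process, and the ScoringWorker class, its method names, and the in-memory points dictionary are all assumptions made for illustration.

```python
import queue
import threading


class ScoringWorker:
    """Applies scoring requests one at a time on a single worker thread."""

    def __init__(self):
        self.points = {}  # only the worker thread ever writes to this
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, user_id, delta):
        """Called from any request thread; just enqueues the work."""
        self._requests.put((user_id, delta))

    def close(self):
        """Ask the worker to stop after draining everything queued so far."""
        self._requests.put(None)
        self._thread.join()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:  # sentinel from close()
                return
            user_id, delta = item
            # Read-modify-write is safe here: requests are handled serially.
            self.points[user_id] = self.points.get(user_id, 0) + delta


def run_demo(producers=4, requests_each=100):
    worker = ScoringWorker()

    def produce():
        for _ in range(requests_each):
            worker.submit("alice", 1)

    threads = [threading.Thread(target=produce) for _ in range(producers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    worker.close()
    return worker.points["alice"]
```

The web threads never touch the score directly; they only enqueue. All 400 increments from the four producers are applied, because exactly one thread performs the updates.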

Yet another approach is to have some other resource that has to be acquired and held to do the calculation.

  • A Singleton object in the database. A single row in a unique table can be updated with a session ID to seize control; update with a session ID of NULL to release control. The essential update has to include a WHERE SESSION_ID IS NULL filter to ensure that the update fails when the lock is held by someone else. This is interesting because it's inherently race-free -- it's a single update -- not a SELECT-UPDATE sequence.

  • A garden-variety semaphore can be used outside the database. Queues (generally) are easier to work with than a low-level semaphore.
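
A minimal sketch of the singleton lock row, using the standard-library sqlite3 module. The app_lock table, the column names, and the helper functions are assumptions made for illustration; in SQL the "nobody holds it" filter is spelled IS NULL.

```python
import sqlite3


def make_lock_table():
    """Create the one-row lock table; NULL session_id means 'unlocked'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE app_lock ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), session_id TEXT)"
    )
    conn.execute("INSERT INTO app_lock (id, session_id) VALUES (1, NULL)")
    conn.commit()
    return conn


def try_acquire(conn, session_id):
    """Seize the lock with a single UPDATE; True only if we got it."""
    cur = conn.execute(
        "UPDATE app_lock SET session_id = ? WHERE id = 1 AND session_id IS NULL",
        (session_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed means someone else holds it


def release(conn, session_id):
    """Release the lock, but only if this session is the current holder."""
    cur = conn.execute(
        "UPDATE app_lock SET session_id = NULL WHERE id = 1 AND session_id = ?",
        (session_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```

A caller that gets False from try_acquire() has to retry or give up; the check and the claim happen in one statement, so there is no window between them.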

Allies answered 23/6, 2009 at 2:28 Comment(1)
Great answer. Somehow access to the database row has to be serialized and I think queues are more scalable than locks. @Fragsworth: see this project for a simple to use implementation of queues in Django that uses RabbitMQ: ask.github.com/celery/introduction.htmlSalzburg
K
8

Database locking is the way to go here. There are plans to add "select for update" support to Django (here), but for now the simplest would be to use raw SQL to UPDATE the user object before you start to calculate the score.


Pessimistic locking is now supported by Django 1.4's ORM when the underlying DB (such as Postgres) supports it. See the Django 1.4a1 release notes.

Komarek answered 23/6, 2009 at 3:9 Comment(0)
R
1

This may be oversimplifying your situation, but what about just a JavaScript link replacement? In other words when the user clicks the link or button wrap the request in a JavaScript function which immediately disables / "greys out" the link and replaces the text with "Loading..." or "Submitting request..." info or something similar. Would that work for you?

Raucous answered 23/6, 2009 at 3:15 Comment(3)
-1 It still does not protect the site. From time to time users use HTTP clients other than browsers, e.g. a user might use wget to fetch a given URL, and then disabling the URL with JavaScript won't save you. JavaScript should be used just to make the page user-friendly if you want to, but you should not use it to fix problems in the server-side application.Unpen
@SashaN: The poster didn't say that this wouldn't only be accessed through a web browser. We can't immediately assume all other exception cases like wget. I also prefixed the answer with "This may be oversimplifying your situation..." to cover the exception cases, as this suggestion may well be a suitable solution for many. Think also of future viewers of this question who may have a slightly different scenario in which this answer might be just the ticket. I certainly don't accept that it deserves a "not helpful" vote, but I do appreciate you at least providing a reason.Raucous
"Thou Shall Not Trust The Client Side"Bashuk
I
0

Now, you must use:

Model.objects.select_for_update().get(foo=bar)
Iloilo answered 24/6, 2014 at 8:0 Comment(1)
An explanation of your intention would improve your answer.Headmaster
