Is Django post_save signal asynchronous?

I have a like function, much like the like / thumbs-up feature on social networks: the user clicks the star / heart / whatever to mark the content as liked. It is done with Ajax and must be fast.

The only problem is that I have to run a few extra tasks for each like, and I found out they were coded straight into the like view, which makes it slow.

I am thinking of using signals to make the execution of these tasks asynchronous, so the view can send the JSON back to the JavaScript right away without waiting for the tasks to finish.

I started creating a signal for the like but then realized that Django's signals are not asynchronous, so it would end up the same: the view would have to wait for the signal handlers to finish before sending back its response.

So I could try to make that signal asynchronous, as explained here and there, but I might as well use the post_save signal of the like model. Now I wonder: can the view finish before the signal handlers get executed?
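
For illustration, moving the work into a plain post_save receiver would not change anything by itself, since receivers run synchronously in the same request and save() only returns once every receiver has finished. A minimal sketch of what I mean (the Like model and the notify_followers helper are hypothetical names):

from django.db.models.signals import post_save
from django.dispatch import receiver

from myapp.models import Like                      # hypothetical model
from myapp.notifications import notify_followers   # hypothetical slow task

@receiver(post_save, sender=Like)
def on_like_saved(sender, instance, created, **kwargs):
    if created:
        # This still runs inside the request: the view's save() call
        # does not return until this receiver has finished.
        notify_followers(instance)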

Hickey answered 10/8, 2012 at 9:43 Comment(1)
To answer directly: No. It's sync.Psychotherapy
32

What you want is a thread. They're very easy to use. You just subclass threading.Thread and write a run method:

import threading

class LikeThread(threading.Thread):
    def __init__(self, user, liked, **kwargs):
        # Store everything the background work will need before starting.
        self.user = user
        self.liked = liked
        super(LikeThread, self).__init__(**kwargs)

    def run(self):
        # Long-running work (emails, notifications, etc.) goes here;
        # it executes in this thread, outside the request/response cycle.
        pass

Then, when you're ready to do the task, you fire it off with:

LikeThread(request.user, something).start()

The rest of your view code or whatever will resume and return the response, and the thread will happily do its work until it's done and then end itself.

See full documentation: http://docs.python.org/library/threading.html
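
To tie this back to the question, the thread can also be started from a post_save receiver instead of from the view. A minimal sketch, assuming a Like model with a user field as in the question (both names are assumptions):

from django.db.models.signals import post_save
from django.dispatch import receiver

from myapp.models import Like          # hypothetical model from the question
from myapp.threads import LikeThread   # the Thread subclass shown above

@receiver(post_save, sender=Like)
def start_like_thread(sender, instance, created, **kwargs):
    if created:
        # The receiver itself still runs synchronously, but start()
        # returns immediately and the work continues in the background.
        LikeThread(instance.user, instance).start()

The receiver returns quickly, so the view can send its JSON response without waiting for the long-running tasks.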

Chippewa answered 10/8, 2012 at 14:53 Comment(7)
This is very interesting, but I am under the impression that threading in Python is limited by the GIL?Tejada
Well, of course. It's not an issue if you're not trying to mutate anything, e.g. if you're just sending an email with the data. If you need to actually modify something and there could be potential thread-safety issues, then you need to make judicious use of locks. Still, if you need to off-load a long-running process, you really have no choice.Chippewa
What about using the more expensive multiprocessing then?Tejada
Multiprocessing is just threads on steroids. You can use it, but unless the code you're running merits multiple cores working on it (Hint: you'd need to be doing some serious scientific processing with large datasets and such or extremely graphics intensive work), using multiprocessing is a waste of time.Chippewa
Cool! That clears up a lot about my understanding of the python multiprocessing and python threading modules. Thanks Chris!Tejada
I understand this is an older question/answer, but I believe that currently celery should be the accepted answer (as I've stated below in the celery answer). Introducing multithreading in your codebase can introduce a whole host of problems. celery has been specifically designed to fix the problem the OP has.Control
@RemcoWendt Just wondering what kind of problems one could face? I have a very similar use case where, after a model is saved, I publish a message to Kafka. Since I don't want the HTTP request to wait until the message is published, this seems like an elegant solution.Berretta
54

Also look into celery (or, more specifically, django-celery). It is an asynchronous task scheduler / handler: your post_save signal handler creates a task, which is picked up and executed through celery. That way you still have your speedy application, while the heavy lifting is performed asynchronously, even on a different machine or batch of machines.
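
A minimal sketch of that setup, assuming a Celery app is already configured and using hypothetical model and task names:

# tasks.py
from celery import shared_task

@shared_task
def process_like(like_id):
    # The heavy lifting runs in a Celery worker, possibly on another machine.
    ...

# signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver

from myapp.models import Like          # hypothetical model
from myapp.tasks import process_like

@receiver(post_save, sender=Like)
def queue_like_tasks(sender, instance, created, **kwargs):
    if created:
        # .delay() only enqueues the task; the request returns immediately.
        process_like.delay(instance.pk)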

Artur answered 11/8, 2012 at 13:57 Comment(3)
I believe this should now be the accepted answer! I would always advise against introducing multithreading in a codebase if other options are available; the code posted by Chris shows only the easy tip of the iceberg and doesn't mention the whole host of possible issues that come with introducing multithreading.Control
@Symon I stopped recommending celery a long time ago. The last time I suggested looking into celery was about 9 years ago. If you're so opposed to this solution, at least share with the readers why you think they should not use celery. Also, please, STOP telling people what (not) to do. Your comment reads as an attack on me personally, instead of offering any insights. Just offer your opinion and leave it to the reader to determine what best fits their need. Even better, offer a solution and explain why it is better than the other solutions presented.Artur
I'm interested to hear more if/why celery is a notably worse option in 2021 than it was when this answer was originally written.Prudential
8

Hm, first of all, signals in Django are not asynchronous. For your particular case I think post_save is the wrong way to go. The most straightforward way is simply to fire an Ajax request to the view that performs your like action and not wait for the response; instead, update your view/HTML directly after you have fired the request.

That would of course require that you know beforehand that your user is allowed to like this item and that your request will not fail.

Chiropractor answered 10/8, 2012 at 9:59 Comment(3)
I'm not a big fan of presuming what's going to happen; that's a dirty trick to me ;) Chris's answer convinces me more.Hickey
Using a Thread is the same in my opinion. You will spawn the thread, continue your processing in your frontend, and not know whether the thread is going to fail or succeed. I have had my fair share of using threads in Django, and in the end it was never a good idea (resource-wise, from a debugging point of view, etc.). If things go bad, you end up with a lot of zombie threads in your system. If you want async behaviour, use task handlers like celery. Another option is green threads with gevent or the like, which are less expensive.Chiropractor
Good point indeed. In my case, if the thread fails that's not as bad. What I want here is to dissociate the tasks that need to be fast and confirmed from the tasks that can take more time and fail silently. And with the threads solution I don't have to compromise on checking whether the user is allowed to like; I can still send back a confirmation to the JavaScript instead of assuming it worked. The other tasks, like sending emails and other notifications, should have no reason to fail, but I'll follow your advice and check out celery and the like.Hickey
A
4

The async-signals package (https://github.com/nyergler/async-signals) abstracts this issue. You call an async signal function; if Celery is present the package uses it to issue the signal asynchronously from a worker; and if Celery is not available the package sends the signal in the traditional synchronous way.

Apocryphal answered 13/5, 2014 at 22:13 Comment(0)
