I think there are two different approaches here:
- A task queue (like Celery)
- An async implementation (like gevent)
What you achieve with each of them is different. With Celery, you run synchronously only the code needed to compute the response, and hand any other operation (like saving to logs) to a background worker. The response comes back sooner because it no longer waits for that extra work.
With gevent, different instances of your handler run concurrently inside the same process. So if you have a single request, you won't see any difference in the response time, but if you have thousands of concurrent requests that spend most of their time waiting on IO, the throughput will be much better. The reason is that without gevent, when your code executes an IO operation, it blocks that process until the operation finishes, while with gevent, the process can go on handling other requests while the IO operation waits.
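To make the effect concrete, here is a small sketch using gevent directly. The `handle_request` and `serve_concurrently` functions are invented for the example, and `gevent.sleep` stands in for a real IO call such as a database query.

```python
import time

import gevent


def handle_request(request_id, io_seconds=0.2):
    # gevent.sleep stands in for a blocking IO call; it yields to
    # other greenlets instead of blocking the whole process.
    gevent.sleep(io_seconds)
    return request_id


def serve_concurrently(n_requests):
    start = time.perf_counter()
    # One greenlet per simulated request.
    jobs = [gevent.spawn(handle_request, i) for i in range(n_requests)]
    gevent.joinall(jobs)
    elapsed = time.perf_counter() - start
    return [job.value for job in jobs], elapsed
```

Calling `serve_concurrently(10)` should take roughly as long as one simulated IO wait rather than ten of them, because the waits overlap. A single request still takes the full wait, which matches the point above.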
Setting up gevent is much easier than setting up Celery. If you're using gunicorn, you simply install gevent and change the worker class to gevent. Another advantage is that it also helps with IO operations that are required in the response (like fetching data from a database), since other requests can be served while those operations wait. With Celery, you can't practically use the output of a task in your response: the task runs in a separate worker, and waiting for its result would block the request again.
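For the gunicorn case, the change can be as small as the config sketch below. The worker count and bind address are assumed values, and `myapp:app` in the usage note is a hypothetical module and application name.

```python
# gunicorn.conf.py (gunicorn config files are plain Python)

# Use gevent workers instead of the default sync workers.
worker_class = "gevent"

# Assumed value; tune to the number of CPU cores you have.
workers = 2

# Maximum simultaneous connections per gevent worker (gunicorn's default).
worker_connections = 1000

# Assumed address.
bind = "127.0.0.1:8000"
```

After `pip install gevent`, you would start the server with something like `gunicorn -c gunicorn.conf.py myapp:app`. Passing `--worker-class gevent` on the command line should have the same effect as the `worker_class` line.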
What I would recommend is to start with gevent, and consider adding Celery later (and running both of them) if:
- The output of the task you will process with Celery is not required in the response
- You have a different machine for your Celery tasks, or the usage of your server has some peaks and some idle time (if your server is at 100% the whole time, you won't gain anything from using Celery)
- The amount of work that your Celery tasks will do is worth the overhead of using Celery