We are trying to use python-rq to support the back end of our Web application, but pushing new jobs takes very long - up to 12 seconds. The performance hit happens when executing the `enqueue_call` function, particularly when the number of worker processes connected to the system increases (over 200).
The system works as follows:

- The front end pushes jobs to the task queue server. This uses the `enqueue_call` function to pass in arguments to the job (such as timeout and ttl), in addition to the actual arguments for the function to be executed (see the sketch after this list).
- Multiple processes (spread out over several machines) are running workers, each under a UNIX `screen`. The workers follow the pattern provided in the documentation, executing the `Worker.work()` infinite-loop function to listen on the queues.
- During processing, some of the tasks spawn new ones, usually on the same queue on which they are running.
About the infrastructure:

- The Redis server that runs this task queue is dedicated to it, and persistence is disabled. It is running on a 4 GB Rackspace server.
- When running `redis-benchmark` on the server that hosts the task queue, we get results over 20,000 requests per second on average for most benchmarks (see the timing sketch after this list).
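To see how much of the 12 seconds is rq bookkeeping rather than Redis itself, a rough timing sketch like the following can be run from the front-end host; `tasks.noop` stands in for a hypothetical do-nothing task:

```python
import time
from redis import Redis
from rq import Queue

redis_conn = Redis(host='redis-host', port=6379)  # placeholder
q = Queue('default', connection=redis_conn)

# Baseline: one raw Redis round trip, comparable to what redis-benchmark measures.
start = time.time()
redis_conn.ping()
print('redis PING:   %.4f s' % (time.time() - start))

# A single enqueue_call of a no-op task, given as an import path so nothing
# heavy is pickled. If this takes seconds while PING takes milliseconds,
# the cost is in rq's job bookkeeping rather than in raw Redis throughput.
start = time.time()
q.enqueue_call(func='tasks.noop', timeout=60)
print('enqueue_call: %.4f s' % (time.time() - start))
```

Repeating the enqueue timing with different numbers of connected workers would also show whether the slowdown really tracks the worker count.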
How can we improve the push performance for new jobs in a situation like this? Is there a better pattern that we should use?
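One pattern we could test, assuming a newer rq release (roughly 1.5+) that provides `Queue.enqueue_many` and `Queue.prepare_data` (the exact keyword names may differ by version, so check the docs), is to enqueue jobs in bulk so they are written through a single Redis pipeline instead of one round trip per job:

```python
from redis import Redis
from rq import Queue

redis_conn = Redis(host='redis-host', port=6379)  # placeholder
q = Queue('default', connection=redis_conn)

# Prepare a batch of jobs up front, then enqueue them in one call; enqueue_many
# pushes them through a single pipeline rather than one round trip per job.
batch = [
    Queue.prepare_data('tasks.count_words',      # hypothetical task path
                       ('text %d' % i,),
                       timeout=3600,
                       result_ttl=86400)
    for i in range(100)
]
jobs = q.enqueue_many(batch)
```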
We used `rq` as a back end, and performance problems eventually got ironed out as we cleaned up the infrastructure. I can't recommend either switching over to `celery` or staying with `rq` for your specific use case; I would, however, suggest running tests. It might not be as hard as you'd think initially, and I guarantee any time spent on creating good tests will pay back eventually, as you better understand the nature of your system. Best of luck! – Empyreal