We are trying to use python-rq to support the back end of our Web application, but pushing new jobs takes very long - up to 12 seconds. The performance hit happens when executing the `enqueue_call` function, particularly when the number of worker processes connected to the system increases (over 200).
The system works as follows:

- The front end pushes jobs to the task queue server. This uses the `enqueue_call` function to pass in arguments to the job (such as timeout and ttl), in addition to the actual arguments for the function to be executed (see the sketch after this list).
- Multiple processes (spread out over several machines) are running workers, each under a UNIX `screen`. The workers follow the pattern provided in the documentation, executing the `Worker.work()` infinite-loop function to listen on the queues.
- During processing, some of the tasks spawn new ones, usually on the same queue on which they are running.
About the infrastructure:

- The Redis server that runs this task queue is dedicated to it, and persistence is disabled. It is running on a 4 GB Rackspace server.
- When running `redis-benchmark` on the server that hosts the task queue, we get results over 20,000 requests per second on average for most benchmarks (see the timing sketch after this list).
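To see how much of the 12 seconds is rq bookkeeping rather than Redis itself, a rough timing sketch like the following can be run from the front-end host; `tasks.noop` stands in for a hypothetical do-nothing task:

```python
import time
from redis import Redis
from rq import Queue

redis_conn = Redis(host='redis-host', port=6379)  # placeholder
q = Queue('default', connection=redis_conn)

# Baseline: one raw Redis round trip, comparable to what redis-benchmark measures.
start = time.time()
redis_conn.ping()
print('redis PING:   %.4f s' % (time.time() - start))

# A single enqueue_call of a no-op task, given as an import path so nothing
# heavy is pickled. If this takes seconds while PING takes milliseconds,
# the cost is in rq's job bookkeeping rather than in raw Redis throughput.
start = time.time()
q.enqueue_call(func='tasks.noop', timeout=60)
print('enqueue_call: %.4f s' % (time.time() - start))
```

Repeating the enqueue timing with different numbers of connected workers would also show whether the slowdown really tracks the worker count.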
How can we improve the push performance for new jobs in a situation like this? Is there a better pattern that we should use?
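One pattern we could test, assuming a newer rq release (roughly 1.5+) that provides `Queue.enqueue_many` and `Queue.prepare_data` (the exact keyword names may differ by version, so check the docs), is to enqueue jobs in bulk so they are written through a single Redis pipeline instead of one round trip per job:

```python
from redis import Redis
from rq import Queue

redis_conn = Redis(host='redis-host', port=6379)  # placeholder
q = Queue('default', connection=redis_conn)

# Prepare a batch of jobs up front, then enqueue them in one call; enqueue_many
# pushes them through a single pipeline rather than one round trip per job.
batch = [
    Queue.prepare_data('tasks.count_words',      # hypothetical task path
                       ('text %d' % i,),
                       timeout=3600,
                       result_ttl=86400)
    for i in range(100)
]
jobs = q.enqueue_many(batch)
```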
We used `rq` as a back end, and performance problems eventually got ironed out as we cleaned up the infrastructure. I can't recommend either switching over to `celery` or staying with `rq` for your specific use case; I would, however, suggest running tests. It might not be as hard as you'd think initially, and I guarantee any time spent on creating good tests will pay back eventually, as you better understand the nature of your system. Best of luck! – Empyreal