Unsatisfactory job push performance with Python RQ

We are trying to use python-rq to support the back end of our web application, but pushing new jobs takes a very long time, up to 12 seconds.

The performance hit occurs when executing the enqueue_call function, and it grows worse as the number of worker processes connected to the system increases (over 200).

The system works as follows:

  1. The front end pushes jobs to the task queue server. This uses the enqueue_call function to pass in job options (such as timeout and ttl) in addition to the actual arguments to the function to be executed (see the sketch after this list).
  2. Multiple processes (spread across several machines) run workers, each under a UNIX screen. The workers follow the pattern provided in the documentation, executing the Worker.work() infinite loop to listen on the queues.
  3. During processing, some of the tasks spawn new ones, usually on the same queue on which they are running.
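
For concreteness, here is a minimal sketch of steps 1 and 2 against a local Redis instance. The module path my_module.process_item and its arguments are hypothetical placeholders, and the exact keyword names accepted by enqueue_call can vary across RQ versions:

```python
from redis import Redis
from rq import Queue

redis_conn = Redis(host="localhost", port=6379)
q = Queue("default", connection=redis_conn)

# Step 1: enqueue_call separates job options (timeout, ttl) from the
# arguments destined for the task function itself.
job = q.enqueue_call(
    func="my_module.process_item",  # hypothetical task function
    args=(42,),
    kwargs={"verbose": True},
    timeout=300,  # max seconds the job may run once picked up
    ttl=3600,     # max seconds the job may wait in the queue
)
print(job.id)
```

The worker side follows the pattern from the RQ documentation:

```python
from redis import Redis
from rq import Queue, Worker

redis_conn = Redis(host="localhost", port=6379)

# Step 2: each worker process runs this blocking loop (under a UNIX
# screen in our setup) until jobs arrive on the listed queues.
worker = Worker([Queue("default", connection=redis_conn)],
                connection=redis_conn)
worker.work()
```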

About the infrastructure:

  • The Redis server that runs this task queue is dedicated to it, and persistence is disabled. It runs on a 4 GB Rackspace server.
  • When running redis-benchmark on the server hosting the task queue (an invocation like the one shown below), we get results above 20,000 requests per second on average for most benchmarks.
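
For reference, a typical redis-benchmark invocation would look something like this; the host, client count, and request count are illustrative, not the exact parameters we used:

```
redis-benchmark -h localhost -p 6379 -n 100000 -c 50 -q
```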

How can we improve the push performance for new jobs in a situation like this? Is there a better pattern that we should use?

Empyreal answered 11/3, 2013 at 21:57
Did switching (to Celery) improve performance significantly? I'm experiencing the same issue. – Obelize
I'm no longer working on that project. However, we did continue to use rq as a back end, and the performance problems were eventually ironed out as we cleaned up the infrastructure. I can't recommend either switching over to Celery or staying with rq for your specific use case; I would, however, suggest running tests. It might not be as hard as you'd think at first, and I guarantee that any time spent creating good tests will pay off eventually as you come to better understand the nature of your system. Best of luck! – Empyreal

12 seconds? This is insane.

Have you considered using Celery?
I've never used python-rq, but from what I can see in the docs, it is not really suited to large numbers of workers.
A Redis queue is usually based on the BLPOP command, which can work with multiple clients, but it's unclear how many blocking clients it can really handle on a single key.
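
To illustrate the mechanism, here is a minimal sketch of the list-plus-BLPOP pattern using redis-py; the key name and payload are assumptions for illustration, not RQ's actual internal schema:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Producer: push a job identifier onto the queue's list.
r.rpush("rq:queue:default", "job-id-123")  # key name is an assumption

# Worker: block until an item appears (timeout in seconds). With
# hundreds of workers, all of them block on this same key, and Redis
# must hand each pushed item to exactly one of them.
item = r.blpop("rq:queue:default", timeout=5)
if item:
    key, job_id = item
    print(f"popped {job_id!r} from {key!r}")
```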

So I suggest you either switch to Celery or write your own task distributor for python-rq, which won't be easier than switching.
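
For comparison, a minimal Celery equivalent of the enqueue side could look like this; the broker URL and the task itself are assumptions:

```python
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def process_item(item_id, verbose=False):
    ...  # hypothetical task body

# Front-end side: enqueue a call asynchronously via the broker.
process_item.delay(42, verbose=True)
```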

Ailssa answered 24/3, 2013 at 1:37
