How does Celery work?

Asked 10/1, 2016 at 14:34 Answered 12/12, 2022 at 16:26

multiprocessing celery worker celery-task broker

I have recently started working on distributed computing to increase computation speed, and I opted for Celery. However, I am not very familiar with some of the terms, so I have several related questions.

From the Celery docs:

What's a Task Queue?

...

Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task the client adds a message to the queue, the broker then delivers that message to a worker.

What are clients (here)? What is a broker? Why are messages delivered through a broker? Why would Celery use a backend and queues for interprocess communication?

When I execute the Celery console by issuing the command

celery worker -A tasks --loglevel=info --concurrency 5

Does this mean that the Celery console is a worker process which is in charge of 5 different processes and keeps track of the task queue? When a new task is pushed into the task queue, does this worker assign the task/job to any of the 5 processes?

Allure answered 10/1, 2016 at 14:34 Comment(0)

Last question first:

celery worker -A tasks --loglevel=info --concurrency 5

You are correct - the worker controls 5 processes. The worker distributes tasks among the 5 processes.
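To picture what "one worker, 5 child processes" means, here is a rough stand-in using only the standard library. It is not how Celery is implemented: as far as I know Celery's default pool uses child processes, while this sketch uses threads to stay short. The names run_worker, square and CONCURRENCY are made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

CONCURRENCY = 5  # plays the role of --concurrency 5 (hypothetical constant)


def square(n):
    """A stand-in task; also reports which pool member executed it."""
    return n * n, threading.current_thread().name


def run_worker(jobs):
    """The 'worker': owns a pool of CONCURRENCY children and hands jobs to them."""
    with ThreadPoolExecutor(max_workers=CONCURRENCY,
                            thread_name_prefix="child") as pool:
        # map() distributes the jobs over the pool and keeps the input order
        outcomes = list(pool.map(square, jobs))
    results = [value for value, _ in outcomes]
    children_used = {name for _, name in outcomes}
    return results, children_used


if __name__ == "__main__":
    results, children_used = run_worker(range(10))
    print(results)
    print(sorted(children_used))  # never more than CONCURRENCY distinct names
```

However many jobs are submitted, at most CONCURRENCY of them run at the same time, and the caller never chooses which child gets which job.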

A "client" is any code that runs celery tasks asynchronously.

There are two different types of communication. When you run apply_async, you send a task request to a broker (most commonly RabbitMQ), which is basically a set of message queues.

When the workers finish they put their results into the result backend.

The broker and the result backend are quite separate and require different kinds of software to function optimally.

You can use RabbitMQ for both, but once you reach a certain rate of messages it will not work properly. The most common combination is RabbitMQ as the broker and Redis as the result backend.
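The two channels can be modelled with the standard library alone. This is only a toy model under stated assumptions, not Celery's code: a queue.Queue stands in for the broker, a dict stands in for the result backend, and send_task, worker_loop and get_result are hypothetical names chosen for the sketch.

```python
import queue
import threading
import uuid

broker = queue.Queue()   # stand-in for the message broker (e.g. RabbitMQ)
result_backend = {}      # stand-in for the result store (e.g. Redis)
result_ready = threading.Condition()

TASKS = {"add": lambda x, y: x + y}  # registry of known task names


def send_task(name, args):
    """Client side: publish a task message and return its id right away."""
    task_id = str(uuid.uuid4())
    broker.put({"id": task_id, "task": name, "args": args})
    return task_id


def worker_loop():
    """Worker side: consume messages, run the task, store the result."""
    while True:
        message = broker.get()
        if message is None:  # sentinel used by this sketch to stop the worker
            break
        value = TASKS[message["task"]](*message["args"])
        with result_ready:
            result_backend[message["id"]] = value
            result_ready.notify_all()


def get_result(task_id, timeout=5.0):
    """Client side: wait until the result backend holds this task's result."""
    with result_ready:
        if not result_ready.wait_for(lambda: task_id in result_backend,
                                     timeout=timeout):
            raise TimeoutError(task_id)
        return result_backend[task_id]


def demo():
    worker = threading.Thread(target=worker_loop)
    worker.start()
    task_id = send_task("add", (2, 3))  # request travels through the broker
    value = get_result(task_id)         # result comes back via the backend
    broker.put(None)
    worker.join()
    return value


if __name__ == "__main__":
    print(demo())
```

Note that the client never talks to the worker directly: the request goes one way through the queue, and the result comes back through a different store, looked up by task id.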

Margheritamargi answered 11/1, 2016 at 11:37 Comment(3)

Thanks for the brief info. So you are saying RabbitMQ as the broker and Redis as the backend, right? 1) Where does memcached come into play? I have seen many forums using it as a message queue. 2) What if I run the above celery worker command in two different consoles and submit a task from an interactive Python session? I mean, how do I specify which worker console should be used? – Allure 12/1, 2016 at 10:50

memcached can be used instead of Redis. Redis is probably the better choice. Your second question makes no sense. Please read the Celery documentation again and then ask a new question on SO – Margheritamargi 12/1, 2016 at 10:53

It's not the clients that run Celery tasks; the workers do. The client here is the producer that puts jobs in the queue to be processed. – Harlanharland 20/4 at 8:18

We can use the analogy of an assembly line in a packaging factory to understand how Celery works.

Each product is placed on a conveyor belt.
The products are processed by machines.
At the end, all the processed products are stored in one place, one by one.

How Celery works:

Note: Instead of each product being taken for processing as soon as it is placed on the conveyor belt, Celery maintains a queue whose output is fed to the workers for execution, one task at a time (sometimes more than one queue is maintained).

Each request (which is a task) is sent to a queue (Redis/RabbitMQ) and an acknowledgment is sent back.
Each task is delivered to one worker, which executes it.
Once the worker has finished the task, its output is stored in the result backend (e.g. Redis).

Anachronism answered 12/12, 2022 at 16:26 Comment(0)
