Why does Celery need a message broker?

Celery is a job queue/task queue, and the name suggests that it can hold its own tasks and process them. Then why does it need a message broker like RabbitMQ or Redis?

Subsidy answered 17/6, 2018 at 5:33 Comment(2)
Distributed task queue. - Rafferty
Can you explain "distributed task queue"? - Subsidy

Celery is a distributed task queue, which means the system can be spread across multiple computers (or containers) in multiple locations, all connected through a single centralised bus.

The basic architecture is as follows:

Workers - processes that take jobs (data) from the bus (task queue) and process them.

* A worker can put its result back on the bus for further processing by a different worker (creating a processing flow).

Bus - the task queue. This is basically a database that stores the jobs as messages so the workers can retrieve them.

It's important that this store is concurrent and non-blocking, so that when one process takes a job from the bus or puts one on it, it doesn't block other workers from getting or putting their jobs.

RabbitMQ, Redis, ActiveMQ, Kafka and the like are good candidates for this sort of behaviour.

The bus has an API that lets you submit jobs for workers and retrieve them (among more complex features).

Most buses implement an ack/fail feature: a worker acks a job once it is done, and if the job is not acked (or the worker reports a failure) the message can be served again to another worker, where it might be processed successfully this time. That way no data is lost (this depends heavily on the failover logic and on the nature of the data used as input to the task).

Celery includes a scheduler (beat) that periodically puts specific jobs on the bus, which gives you periodic tasks.
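
To make the worker/bus/ack flow above a bit more concrete, here is a rough sketch using only the Python standard library. It is not how Celery or any real broker is implemented: an in-memory queue.Queue stands in for the bus, threads stand in for workers, and the names run_workers, flaky_square and max_attempts are made up for illustration. A job whose handler raises is treated as "not acked" and is put back on the bus for another attempt.

```python
import queue
import threading


def run_workers(jobs, handler, num_workers=3, max_attempts=3):
    """Process jobs from a shared in-memory 'bus', re-queuing jobs that fail."""
    bus = queue.Queue()      # stands in for the broker (RabbitMQ, Redis, ...)
    results = {}             # job -> result, filled in by whichever worker succeeds
    failed = []              # jobs that ran out of attempts
    lock = threading.Lock()  # protects results and failed

    for job in jobs:
        bus.put((job, 1))    # each message carries (payload, attempt number)

    def worker():
        while True:
            item = bus.get()
            if item is None:             # sentinel: no more work, shut down
                bus.task_done()
                return
            job, attempt = item
            try:
                value = handler(job, attempt)
            except Exception:
                # "not acked": hand the job back so any worker can retry it
                if attempt < max_attempts:
                    bus.put((job, attempt + 1))
                else:
                    with lock:
                        failed.append(job)
            else:
                # "acked": record the result
                with lock:
                    results[job] = value
            finally:
                bus.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    bus.join()               # wait until every job, including retries, is handled
    for _ in threads:
        bus.put(None)        # one shutdown sentinel per worker
    for t in threads:
        t.join()
    return results, failed


def flaky_square(job, attempt):
    """Hypothetical task: even numbers fail on their first attempt."""
    if job % 2 == 0 and attempt == 1:
        raise RuntimeError("simulated worker failure")
    return job * job


results, failed = run_workers([1, 2, 3, 4], flaky_square)
```

With a real broker the retry bookkeeping lives in the broker and in Celery's own settings rather than in your code; the sketch only shows why a shared, concurrent queue with some form of ack is what lets a second worker pick up a job that the first one dropped.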

Let's work through a scraping example. You want to scrape the whole world, but China only allows traffic from within its own region, and so do Europe and the USA, so you build workers and place them all over the world.

You can use a single bus, say one located in the USA. All the workers know this bus and can connect to it, so when a specific job (scrape China) is placed on the bus in the US, a process in China can work on it. Hence distributed.

Of course, adding workers also increases the throughput of the system simply through parallelism, regardless of their geographic location. This is the common case for using an event-driven architecture (i.e. a central bus, consumers and producers).

I suggest reading the official docs; they're pretty straightforward.
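
The scraping setup can be sketched the same way, again with the standard library only and with made-up names (Bus, submit, fetch, drain and the example URLs are all hypothetical). One central bus keeps a separate queue per region, and each regional worker only pulls jobs from its own queue. In Celery the closest equivalent would be routing tasks to named queues and starting each worker so that it consumes only its region's queue.

```python
import queue
from collections import defaultdict


class Bus:
    """A toy central bus holding one FIFO queue per region name."""

    def __init__(self):
        self._queues = defaultdict(queue.Queue)

    def submit(self, region, url):
        """Producer side: place a scrape job on the queue for a region."""
        self._queues[region].put(url)

    def fetch(self, region):
        """Consumer side: return the next job for a region, or None if empty."""
        try:
            return self._queues[region].get_nowait()
        except queue.Empty:
            return None


def drain(bus, region):
    """Simulate a worker located in `region` processing all of its jobs."""
    done = []
    while (url := bus.fetch(region)) is not None:
        done.append(f"{region} worker scraped {url}")
    return done


bus = Bus()  # imagine this single bus is hosted in the USA
bus.submit("china", "https://example.cn/page")
bus.submit("europe", "https://example.eu/page")
bus.submit("china", "https://example.cn/other")

china_log = drain(bus, "china")    # runs on a machine in China
europe_log = drain(bus, "europe")  # runs on a machine in Europe
```

The producer never needs to know where the workers are; it only names the queue, and whichever worker is listening on that queue picks the job up.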

Rafferty answered 17/6, 2018 at 8:32 Comment(6)
I have a question: I don't understand the difference between a RabbitMQ queue and a Celery queue. Are they the same thing or are they different? Thanks, Tom - Georgianngeorgianna
@TomislavMikulin Yes and no. Celery manages a queue of jobs, and the implementation of this queue depends on the broker type: it might be a RabbitMQ queue, a Redis list, or some other data structure, depending on the broker and its API. - Rafferty
When we use Celery with RabbitMQ, is the "bus - task queue" you mention in your answer a RabbitMQ queue or a Celery queue? - Georgianngeorgianna
The bus is RabbitMQ's (a queue). Celery exposes a transparent API on top of it, i.e. add to queue, take from queue and such. - Rafferty
According to your comment, can we say that Celery makes the message broker generic, so that you can choose any other broker? Right? - Submersed
@Rafferty The question was about Celery and RabbitMQ. From the answer it is not clear when we use RabbitMQ. I've added RabbitMQ to the answer. Please let me know whether it is correct. - Chrischrism
