How does a Celery worker run the code defined elsewhere in a task?

I tried reading the official documentation as well as other SO threads, but it is still not clear to me how Celery works.

From what I understand:

  1. Django app: Celery is installed in the Django app (or any app), where a function decorated with @shared_task defines the work to be performed.
  2. Message broker: A message broker gets this task from 1. and queues it.
  3. Celery Worker: A completely separate Celery worker picks up the task and runs it. This worker can even be on a completely different machine, as long as it has access to the message broker.

So, then the burning question is:

How does the Celery worker get the code defined in @shared_task to run the task?

Basically, how does 3. get what's defined in 1. if they are only connected through a message broker? Is the Python code stored in the message broker as a string? What is the data structure of a message broker item/record?

Karakul answered 12/5, 2023 at 19:33 Comment(0)

After spending some time actually implementing Celery, I was able to figure out what's going on. Perhaps this is a shortcoming of my own way of learning, or perhaps the documentation overlooks it. Either way, I want to leave it here in case others are wondering the same thing.

The following is not at all obvious, IMO:

  1. Django app: Celery tasks are defined here. It has Celery installed as a dependency.

  2. Message broker: A message broker receives a task message from 1. (the task's name and its arguments, not its code) and queues it.

  3. Celery Worker: This is actually not a separate Python app that only contains Celery. It is a full-blown instance of 1. So you launch Django twice: one instance produces tasks and the other works on them. Both contain the same @shared_task function.
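
To make the three steps above concrete, here is a minimal toy model in plain Python. It is not Celery's implementation: TASK_REGISTRY, toy_task, producer_send and worker_run_next are hypothetical names invented for this sketch, and a list stands in for the broker. It only illustrates the idea that the message carries a task name plus arguments, and that the worker finds the function in its own copy of the code.

```python
import json

# Hypothetical stand-in for a task registry: task name -> function.
# In the real setup, producer and worker each build their own registry
# because each one imports the same source code.
TASK_REGISTRY = {}


def toy_task(func):
    """Register func under a 'module.function' style name."""
    name = f"{func.__module__}.{func.__name__}"
    TASK_REGISTRY[name] = func
    func.task_name = name
    return func


@toy_task
def add(x, y):
    return x + y


def producer_send(queue, func, *args, **kwargs):
    """Producer side: enqueue only the task name and arguments as JSON."""
    message = {"task": func.task_name, "args": args, "kwargs": kwargs}
    queue.append(json.dumps(message))


def worker_run_next(queue):
    """Worker side: look the name up in the local registry and call it."""
    message = json.loads(queue.pop(0))
    func = TASK_REGISTRY[message["task"]]
    return func(*message["args"], **message["kwargs"])


broker_queue = []  # a plain list standing in for the message broker
producer_send(broker_queue, add, 2, 3)
raw_message = broker_queue[0]  # the JSON string that sits in the "broker"
result = worker_run_next(broker_queue)
print(raw_message)
print(result)
```

Note that the body of add never appears in raw_message. If the worker process had not imported the module that defines add, the lookup in TASK_REGISTRY would fail, which is why the worker needs the same source code as the producer.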

Caveats that the documentation should explain:

  • If you update your main Django app and do not restart the Celery worker, your workers keep running a stale version of the @shared_task function.
  • The app in 1. is launched using Gunicorn (or whatever you want), but the app in 3. is launched using celery -A "your_app" worker -l INFO. It's not running Gunicorn, but it still contains all the code in 1.
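
The stale-code caveat is really just how Python imports behave in any long-running process, so it can be sketched without Celery at all. In the example below, toy_tasks and demo_stale_worker are hypothetical names made up for illustration; the "worker" is simply this process, and reloading the module stands in for restarting the worker.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_stale_worker():
    """Show that code already imported does not change when the file changes."""
    old_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # keep cached bytecode out of the picture
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "toy_tasks.py"
        source.write_text("def add(x, y):\n    return x + y\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            # "Worker startup": the module is imported once.
            toy_tasks = importlib.import_module("toy_tasks")
            before = toy_tasks.add(2, 3)

            # "Deploy": the file on disk changes while the process keeps running.
            source.write_text("def add(x, y):\n    return (x + y) * 100\n")
            stale = toy_tasks.add(2, 3)  # still the old function in memory

            # "Restart": only now is the new source actually loaded.
            importlib.invalidate_caches()
            toy_tasks = importlib.reload(toy_tasks)
            fresh = toy_tasks.add(2, 3)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("toy_tasks", None)
            sys.dont_write_bytecode = old_flag
    return before, stale, fresh


print(demo_stale_worker())  # (5, 5, 500)
```

The middle value is the interesting one: the file already says "multiply by 100", but the running process still returns 5 until the module is loaded again.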

I think right off the bat in the introduction of Celery, they should explain that both the producer and worker instances contain your source code.

Karakul answered 13/5, 2023 at 1:18 Comment(2)
Thank you @Neil. I was wondering the exact same thing. I came to the same conclusion as you but wasn't sure if I was right. This has given me some confidence in it. - Savoy
@ChristianCiach I think what happens is that when a developer works on a library for an extended period of time, they get blinded by their own knowledge. The main developer shouldn't write the documentation. Bring in someone new, have them learn the library and internalize it, and then have them write the docs. Most of the time this is unrealistic/expensive/impossible. - Karakul
