Celery tasks received but not executing
I have Celery tasks that are received but will not execute. I am using Python 2.7 and Celery 4.0.2. My message broker is Amazon SQS.

This is the output of celery worker:

$ celery worker -A myapp.celeryapp --loglevel=INFO
[tasks]
  . myapp.tasks.trigger_build

[2017-01-12 23:34:25,206: INFO/MainProcess] Connected to sqs://13245:**@localhost//
[2017-01-12 23:34:25,391: INFO/MainProcess] celery@ip-111-11-11-11 ready.
[2017-01-12 23:34:27,700: INFO/MainProcess] Received task: myapp.tasks.trigger_build[b248771c-6dd5-469d-bc53-eaf63c4f6b60]

I have tried adding -Ofair when running celery worker but that did not help. Some other info that might be helpful:

  • Celery always receives 8 tasks, although there are about 100 messages waiting to be picked up.
  • About once in every 4 or 5 times a task actually will run and complete, but then it gets stuck again.
  • This is the result of ps aux. Notice that Celery is running as 3 different processes (not sure why), and one of them is at 99.6% CPU utilization even though it isn't completing any tasks.

Processes:

$ ps aux | grep celery
nobody    7034 99.6  1.8 382688 74048 ?        R    05:22  18:19 python2.7 celery worker -A myapp.celeryapp --loglevel=INFO
nobody    7039  0.0  1.3 246672 55664 ?        S    05:22   0:00 python2.7 celery worker -A myapp.celeryapp --loglevel=INFO
nobody    7040  0.0  1.3 246672 55632 ?        S    05:22   0:00 python2.7 celery worker -A myapp.celeryapp --loglevel=INFO

Settings:

CELERY_BROKER_URL = 'sqs://%s:%s@' % (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY.replace('/', '%2F'))
CELERY_BROKER_TRANSPORT = 'sqs'
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'us-east-1',
    'visibility_timeout': 60 * 30,
    'polling_interval': 0.3,
    'queue_name_prefix': 'myapp-',
}
CELERY_BROKER_HEARTBEAT = 0
CELERY_BROKER_POOL_LIMIT = 1
CELERY_BROKER_CONNECTION_TIMEOUT = 10

CELERY_DEFAULT_QUEUE = 'myapp'
CELERY_QUEUES = (
    Queue('myapp', Exchange('default'), routing_key='default'),
)

CELERY_ALWAYS_EAGER = False
CELERY_ACKS_LATE = True
CELERY_TASK_PUBLISH_RETRY = True
CELERY_DISABLE_RATE_LIMITS = False

CELERY_IGNORE_RESULT = True
CELERY_SEND_TASK_ERROR_EMAILS = False
CELERY_TASK_RESULT_EXPIRES = 600

CELERY_RESULT_BACKEND = 'django-db'
CELERY_TIMEZONE = TIME_ZONE

CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['application/json']

CELERYD_PID_FILE = "/var/celery_%N.pid"
CELERYD_HIJACK_ROOT_LOGGER = False
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_MAX_TASKS_PER_CHILD = 1000

Report:

$ celery report -A myapp.celeryapp

software -> celery:4.0.2 (latentcall) kombu:4.0.2 py:2.7.12
            billiard:3.5.0.2 sqs:N/A
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:sqs results:django-db
Disgraceful answered 13/1, 2017 at 13:59

Comments:
  • Do module names match? – Frigid

First install eventlet:

pip install eventlet

and then run:

celery -A myapp.celeryapp worker --loglevel=info -P eventlet
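
For what it's worth (and as the comments below suggest, this mainly matters on Windows): -P eventlet swaps Celery's default prefork pool for a greenlet-based one, which avoids the prefork pool that tends to hang on Windows. Because green threads are cheap, an I/O-bound workload can usually also be given a higher concurrency; a hedged example, where the concurrency value is purely illustrative:

celery -A myapp.celeryapp worker --loglevel=info -P eventlet -c 100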

Comply answered 26/3, 2021 at 21:48

Comments:
  • Please explain what this does? – Taunt
  • IDK why this worked, but it did. Thanks. The broker sent messages but the worker did nothing. A side-effect of using this solution was that I also started seeing worker logs in my console. IDK if I'll face any issues during production deployment but at least I can work on my app now. – Dreamadreamer
  • Worked for me. Not sure why, but thanks! – Have
  • @alias51, @Hussain: It worked also in my case. After some research I found this article. Long story short: the default concurrency pool prefork doesn't work on Windows. – Ricketts
  • This also worked for me. Specifically, it showed me that the worker could not find redis and that I had to change the redis broker URL to redis://127.0.0.1:6379 (instead of redis://localhost:6379) in my settings.py. – Abyssinia
  • Awesome, it worked for me. @myszon's point "default concurrency pool prefork doesn't work on Windows" helped me. – Evincive
  • Took me two days solving this trivial thing. Turns out to be a Windows issue. – Tiernan
  • Currently (early 2024), as per celery.school/celery-on-windows, the options for pool implementations for running Celery on Windows are "solo", "threads", and "gevent". The eventlet pool did not work for me (using celery 5.3.65 on python 3.12). – Bilbao

I think you are running Celery on Windows. Try adding the following parameter to your command:

-P solo

So the new command will be:

celery -A main worker --loglevel=info --queues=your_queue_name -P solo
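
If you would rather not pass the flag on every invocation, the pool can also be set in configuration. A minimal sketch, assuming Celery 4+ lowercase settings; the app name 'main' mirrors the command above and the placement is illustrative:

# Hypothetical snippet in the module that defines the Celery app.
from celery import Celery

app = Celery('main')
app.conf.worker_pool = 'solo'  # equivalent to passing -P solo on the command line
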
Halverson answered 16/8, 2022 at 6:05

Comments:
  • This works, but what is different about Windows that requires it? – Taunt
  • Sometimes it's not able to handle concurrency control for Celery. – Halverson
  • Thank you. My friend ran into this problem, and happened to be using W11. This additional flag worked for us. – Patagonia

I was also getting the same issue. After a bit of searching I found the solution: add --without-gossip --without-mingle --without-heartbeat -Ofair to the Celery worker command line. So in your case the worker command should be:

celery worker -A myapp.celeryapp --loglevel=INFO --without-gossip --without-mingle --without-heartbeat -Ofair

Nickolenicks answered 12/8, 2018 at 10:35

Comments:
  • Can you explain why? – Anthropomorphic
  • @Anthropomorphic I explained what's (probably) happening in my answer. – Mosesmosey
  • This didn't work for me. According to this post, I added -P solo to the command like: celery -A proj worker --loglevel=INFO --concurrency 1 -P solo – Willodeanwilloughby
  • Worked like a charm! Thank you so much, man! – Decision
  • solo as an execution mode will work. However, it doesn't execute jobs in parallel and it is single-threaded. Please be aware of this in production mode. It simply ignores the concurrency option. – Quickman
  • Now that didn't work for me. I am using Windows 11, and I got the error: subprocess.CalledProcessError: Command 'ver' returned non-zero exit status 1. – Vatican

Disabling worker gossip (--without-gossip) was enough to solve this for me on Celery 3.1. It looks like a bug causes this inter-worker communication to hang when CELERY_ACKS_LATE is enabled. Tasks are indeed received, but never acknowledged or executed. Stopping the worker returns them to the queue.

From the docs on gossip:

This means that a worker knows what other workers are doing and can detect if they go offline. Currently this is only used for clock synchronization, but there are many possibilities for future additions and you can write extensions that take advantage of this already.

So chances are you aren't using this feature anyway, and what's more, it increases the load on your broker.

No time to investigate, but would be good to test this with the latest Celery and open an issue if it still occurs. Even if this behaviour is expected/unavoidable, that should be documented.

Mosesmosey answered 4/2, 2020 at 14:43

I had the same issue, and Vishnu's answer works for me. There may be another solution that doesn't require adding these extra parameters to the worker command.

My issue was caused by importing other modules in the middle of the task code. It seems Celery collects all used modules when you launch the worker and only looks at the top of the .py file. At run time it doesn't raise any error and just quits. After I moved all "import" and "from ... import ..." statements to the top of the file, it worked.
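
As a rough sketch of what that looks like (the third-party dependency and task body below are made up for illustration; only the module path and task name come from the question):

# myapp/tasks.py -- illustrative layout, not the asker's actual code.
# Keep all imports at module level so the worker resolves them at startup
# instead of importing inside the task body at run time.
import requests  # hypothetical dependency, for illustration only

from myapp.celeryapp import app  # assumes the Celery app instance is named `app`


@app.task
def trigger_build(build_url):
    # The task body only uses names that were imported above.
    response = requests.post(build_url)
    return response.status_code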

Hexastyle answered 19/4, 2019 at 1:05

I have tried "-P solo" and it work for me on Windows machine. But if you want to run other options like "prefork", "processes",etc..then you can run it only on other system like WSL or Ubuntu.

Here is a sample I'm running prefork on Windows, task received but not be executed: enter image description here

and It's success on Wsl: enter image description here

Hysterectomize answered 23/12, 2023 at 12:41

Combining the suggestions above (the solo pool plus disabling gossip, mingle, and heartbeat):

celery -A core worker --loglevel=INFO --without-gossip --without-mingle --without-heartbeat -Ofair --pool=solo

Rollback answered 2/11, 2022 at 13:18
