What you need is to set
CELERY_ACKS_LATE = True
Late ack means that the task message will be acknowledged only after the task has been executed, not just before it starts, which is the default behavior.
This way, if the worker crashes, RabbitMQ will still have the message.
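For reference, a minimal sketch of how the setting might be applied, assuming a hypothetical app called myapp and a local RabbitMQ broker (in newer Celery versions the setting is spelled task_acks_late, and it can also be enabled per task):

from celery import Celery

# Hypothetical app name and broker URL, used only for illustration.
app = Celery('myapp', broker='amqp://guest@localhost//')

# Acknowledge the message only after the task has finished executing.
app.conf.update(CELERY_ACKS_LATE=True)

# The same behaviour can be enabled for a single task:
@app.task(acks_late=True)
def process(item):
    ...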
Obviously, in the case of a total crash (RabbitMQ + workers at the same time) there is no way of recovering the task, unless you implement logging of task start and task end.
Personally, I write a line to MongoDB every time a task starts and another one when it finishes (independently of the result); this way I can tell which tasks were interrupted by analyzing the Mongo logs.
You can do it easily by overriding the __call__ and after_return methods of the Celery base task class.
Below is a piece of my code that uses a TaskLogger class as a context manager (with entry and exit points).
The TaskLogger class simply writes a line containing the task info to a MongoDB instance.
def __call__(self, *args, **kwargs):
    """In a Celery task this method calls run(); here you can set up
    the environment (e.g. environment variables) before the task runs."""
    # Initialize the context manager (entry point)
    self.taskLogger = TaskLogger(args, kwargs)
    self.taskLogger.__enter__()
    return self.run(*args, **kwargs)

def after_return(self, status, retval, task_id, args, kwargs, einfo):
    # Exit point for the context manager
    self.taskLogger.__exit__(status, retval, task_id, args, kwargs, einfo)
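For completeness, here is a minimal sketch of what such a TaskLogger could look like, assuming pymongo and a local MongoDB instance; the database/collection names and stored fields are illustrative, not the author's actual implementation. The __exit__ signature matches the call made in after_return above.

import datetime
from pymongo import MongoClient

class TaskLogger(object):
    """Writes one document when a task starts and another when it ends."""

    def __init__(self, args, kwargs):
        self.args = args
        self.kwargs = kwargs
        # Illustrative database/collection names.
        self.collection = MongoClient()['task_logs']['events']

    def __enter__(self):
        # Record the task start together with its arguments.
        self.collection.insert_one({
            'event': 'start',
            'args': repr(self.args),
            'kwargs': repr(self.kwargs),
            'time': datetime.datetime.utcnow(),
        })
        return self

    def __exit__(self, status, retval, task_id, args, kwargs, einfo):
        # Record the task end with its id and final status.
        self.collection.insert_one({
            'event': 'end',
            'task_id': task_id,
            'status': status,
            'time': datetime.datetime.utcnow(),
        })

A start event without a matching end event then identifies a task that was interrupted. Tasks can be bound to the overridden base class with the base= argument of app.task.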
I hope this helps.
CELERY_ACKS_LATE=True: [1] how does (if at all) Celery ensure that the same task is not picked up by multiple workers? [2] if Celery tasks should ideally be idempotent, then what's the problem with them running multiple times? (for the 2nd question, actually here they say that it's okay, but I'm looking for an explicit affirmative) – Bravissimo