Large celery task memory leak

I have a huge celery task that works basically like this:

 @task
 def my_task(id):
   if settings.DEBUG:
     print "Don't run this with debug on."
     return False

   related_ids = get_related_ids(id)

   chunk_size = 500

   for i in xrange(0, len(related_ids), chunk_size):
     ids = related_ids[i:i+chunk_size]
     MyModel.objects.filter(pk__in=ids).delete()
     print_memory_usage()

I also have a manage.py command that just runs my_task(int(args[0])), so this can either be queued or run on the command line.

When run on the command line, print_memory_usage() reveals a relatively constant amount of memory used.

When run inside celery, print_memory_usage() reveals an ever-increasing amount of memory, continuing until the process is killed (I'm using Heroku with a 1 GB memory limit, but other hosts would have a similar problem). The memory growth appears to correspond with the chunk_size: if I increase the chunk_size, the memory consumption increases per print. This seems to suggest that either celery is logging queries itself, or something else in my stack is.

Does celery log queries somewhere else?

Other notes:

  • DEBUG is off.
  • This happens both with RabbitMQ and Amazon's SQS as the queue.
  • This happens both locally and on Heroku (though it doesn't get killed locally, since that machine has 16 GB of RAM).
  • The task actually goes on to do more things than just deleting objects. Later it creates new objects via MyModel.objects.get_or_create(). This also exhibits the same behavior (memory grows under celery, doesn't grow under manage.py).
Endarch answered 20/9, 2013 at 22:34 Comment(3)
Try using itertools.islice(related_ids, i, i + chunk_size) instead of related_ids[i:i+chunk_size]. It's probably not the only factor, but this might reduce some copying. – Extragalactic
Which Django version? Django 1.4's QuerySet.delete always loads instances into memory before deleting them. I'd try replacing that with a raw SQL DELETE statement and see what happens. – Hebe
@VasiliyFaronov: Even after the objects go out of scope? Also, that doesn't explain why the memory usage is constant inside the manage.py command but not within celery. – Endarch

This turned out not to have anything to do with celery. Instead, it was New Relic's logger that consumed all of that memory. Despite DEBUG being set to False, it was storing every SQL statement in memory in preparation for sending it to their logging server. I do not know if it still behaves this way, but it wouldn't flush that memory until the task fully completed.

The workaround was to use a subtask for each chunk of ids, so that each delete operates on a bounded number of items.
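
A minimal sketch of that approach (delete_chunk is a hypothetical name, not the original code; it assumes the same @task decorator, model, and helpers as the question):

 @task
 def delete_chunk(ids):
   # Each chunk is deleted in its own short-lived task, so anything
   # buffered per task (such as New Relic's SQL capture) is released
   # when the subtask finishes instead of accumulating.
   MyModel.objects.filter(pk__in=ids).delete()

 @task
 def my_task(id):
   related_ids = get_related_ids(id)
   chunk_size = 500
   for i in xrange(0, len(related_ids), chunk_size):
     delete_chunk.delay(related_ids[i:i + chunk_size])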

The reason this wasn't a problem when running the task as a management command is that New Relic's logger wasn't integrated into the command framework.

The other solutions presented either attempted to reduce the overhead of the chunking operation, which doesn't help with an O(N) scaling concern, or forced the celery tasks to fail if a memory limit was exceeded (a feature that didn't exist at the time, but might eventually have worked with infinite retries).

Endarch answered 5/11, 2019 at 15:51 Comment(0)

A bit of necroposting, but this may help people in the future. Although the best solution is to track down the source of the problem, sometimes that is not possible, for example because the source of the problem is outside of our control. In that case you can use the --max-memory-per-child option when spawning the Celery worker process.
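
For example (a sketch; the app name proj and the limit are illustrative, and the option, which takes a value in kilobytes, requires Celery 4.0 or newer):

  celery -A proj worker --max-memory-per-child=300000  # replace a child once it uses ~300 MB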

Pharr answered 4/11, 2019 at 14:28 Comment(1)
This is indeed helpful for memory leaks. Just as a side note, this will not kill a running task if it goes over the limit; it will instead kill the worker after the task finishes. – Tranquil

Try using the @shared_task decorator.
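
A minimal sketch of that suggestion (the task body is illustrative; @shared_task registers the task with whichever Celery app is current rather than with a specific app instance):

 from celery import shared_task

 @shared_task
 def my_task(id):
   ...  # same body as in the question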

Nidus answered 13/2, 2014 at 9:54 Comment(0)

Alternatively, you can run the worker with the --autoscale n,0 option. If the minimum pool size is 0, celery will kill idle workers and their memory will be released.
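
For example (a sketch; proj and the maximum of 10 are illustrative):

  celery -A proj worker --autoscale=10,0  # scale the pool between 10 and 0 processes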

But this is not a good solution.

A lot of memory is used by Django's Collector: before deleting, it gathers all related objects and deletes them first. You can set on_delete to SET_NULL on model fields to avoid that cascade.

Another possible solution is deleting objects with limits, for example a certain number of objects per hour. That will lower memory usage.

Django does not have a raw_delete. You can use raw SQL for this.
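
A minimal sketch of that idea (the table name myapp_mymodel is illustrative; note that a raw DELETE bypasses Django's cascades and signals):

 from django.db import connection

 def raw_delete(ids):
   # One placeholder per id keeps this portable across backends
   # and avoids loading any model instances into memory.
   placeholders = ", ".join(["%s"] * len(ids))
   cursor = connection.cursor()
   cursor.execute(
     "DELETE FROM myapp_mymodel WHERE id IN (%s)" % placeholders,
     list(ids),
   )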

Kashakashden answered 3/4, 2015 at 11:4 Comment(0)
