Size limit on Celery task arguments?

We have a Celery task that requires a Pandas dataframe as an input. The dataframe is first serialized to JSON and then passed as an argument into the task. The dataframes can have around 35 thousand entries, which results in a JSON dictionary occupying about 700kB. We are using Redis as a broker.
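
For reference, the current pattern is roughly the following (the task and helper names are illustrative, not our actual code):

import pandas as pd
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def process_frame(frame_json):
    # the worker rebuilds the dataframe from the JSON payload
    df = pd.read_json(frame_json)
    ...

# in the web request: df is the ~35k-row dataframe (~700 kB as JSON)
process_frame.delay(df.to_json())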

Unfortunately the call to delay() on this task often takes too long (in excess of thirty seconds), and our web requests time out.

Is this the kind of scale that Redis and Celery should be able to handle? I presumed it was well within limits and that the problem lay elsewhere, but I can't find any guidance or experience on this online.

Supplant answered 2/8, 2018 at 14:49 Comment(0)

I would suggest saving the JSON in your database and passing just the id to the Celery task instead of the whole payload:

class TodoTasks(models.Model):
    serialized_json = models.TextField()
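
With this in place the web request only stores the payload and enqueues the primary key, so the ~700 kB of JSON never travels through Redis. A minimal sketch, assuming an illustrative task name:

# tasks.py
import pandas as pd
from celery import shared_task

from .models import TodoTasks

@shared_task
def process_todo(todo_id):
    todo = TodoTasks.objects.get(pk=todo_id)
    df = pd.read_json(todo.serialized_json)
    ...  # work on the dataframe

# in the view: save first (df is the dataframe from the request), then pass only the id
todo = TodoTasks.objects.create(serialized_json=df.to_json())
process_todo.delay(todo.id)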

Moreover, you can keep track of the task's status with a few extra fields and even store the error traceback (which I find very useful for debugging):

import traceback
from django.db import models

class TodoTasks(models.Model):
    class StatusChoices(models.TextChoices):
        PENDING = "PENDING", "Awaiting celery to process the task"
        SUCCESS = "SUCCESS", "Task done with success"
        FAILED = "FAILED", "Task failed to be processed"

    serialized_json = models.TextField()

    status = models.CharField(
        max_length=10, choices=StatusChoices.choices, default=StatusChoices.PENDING
    )
    created_date = models.DateTimeField(auto_now_add=True)
    processed_date = models.DateTimeField(null=True, blank=True)
    error = models.TextField(null=True, blank=True)

    def handle_exception(self):
        # Record the traceback of the exception currently being handled
        # and mark the task as failed; the caller should save() afterwards
        self.error = traceback.format_exc()
        self.status = self.StatusChoices.FAILED
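
The task from the sketch above can then be extended to maintain these fields as it runs; again, the names are only illustrative:

import pandas as pd
from celery import shared_task
from django.utils import timezone

from .models import TodoTasks

@shared_task
def process_todo(todo_id):
    todo = TodoTasks.objects.get(pk=todo_id)
    try:
        df = pd.read_json(todo.serialized_json)
        ...  # do the actual work on the dataframe here
        todo.status = TodoTasks.StatusChoices.SUCCESS
    except Exception:
        # stores the traceback and flips the status to FAILED
        todo.handle_exception()
        raise
    finally:
        todo.processed_date = timezone.now()
        todo.save()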
Vivl answered 8/8, 2022 at 11:8 Comment(1)
Interesting read about catching exceptions in Celery: hacksoft.io/blog/… – Vivl
