Django Celery Logging Best Practice
Asked Answered
C

4

82

I'm trying to get Celery logging working with Django. I have logging set-up in settings.py to go to console (that works fine as I'm hosting on Heroku). At the top of each module, I have:

import logging
logger = logging.getLogger(__name__)

And in my tasks.py, I have:

from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)

That works fine for logging calls from a task and I get output like this:

2012-11-13T18:05:38+00:00 app[worker.1]: [2012-11-13 18:05:38,527: INFO/PoolWorker-2] Syc feed is starting

But if that task then calls a method in another module, e.g. a queryset method, I get duplicate log entries, e.g.

2012-11-13T18:00:51+00:00 app[worker.1]: [INFO] utils.generic_importers.ftp_processor process(): File xxx.csv already imported. Not downloaded
2012-11-13T18:00:51+00:00 app[worker.1]: [2012-11-13 18:00:51,736: INFO/PoolWorker-6] File xxx.csv already imported. Not downloaded

I think I could use

CELERY_HIJACK_ROOT_LOGGER = False

to just use the Django logging but this didn't work when I tried it and even if I did get it to work, I would lose the "PoolWorker-6" bit which I do want. (Incidentally, I can't figure out how to get the task name to display in the log entry from Celery, as the docs seems to indicate that it should).

I suspect I'm missing something simple here.

Coneflower answered 13/11, 2012 at 18:14 Comment(4)
Related, perhaps: Celery and Python's logging inside tasksNestornestorian
I did look at that. The get_task_logger() bit is apparently the newer way of doing the add.get_logger() bitConeflower
Ah, indeed, it appears my answer is outdated for Celery 3.0; since I don't use Celery at the moment I wasn't aware of that. :-)Nestornestorian
Celery 3.0 uses two logger hierarchies, there's the celery logger which all other loggers inherits from (you can create a new one with celery.utils.get_logger, and there's the celery.task logger, which also inherits from the celery logger but does not propagate to its handlers, this is because it has a custom logging format (it includes the task id and so on). If you set up logging manually you should configure them both, with a custom logger format for celery.taskWassyngton
G
85

When your logger initialized in the beginning of "another module" it links to another logger. Which handle your messages. It can be root logger, or usually I see in Django projects - logger with name ''.

Best way here, is overriding your logging config:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        'simple': {
            'format': '%(levelname)s %(message)s',
             'datefmt': '%y %b %d, %H:%M:%S',
            },
        },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'simple'
        },
        'celery': {
            'level': 'DEBUG',
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'celery.log',
            'formatter': 'simple',
            'maxBytes': 1024 * 1024 * 100,  # 100 mb
        },
    },
    'loggers': {
        'celery': {
            'handlers': ['celery', 'console'],
            'level': 'DEBUG',
        },
    }
}

from logging.config import dictConfig
dictConfig(LOGGING)

In this case I suppose it should work as you assume.

P.S. dictConfig added in Python2.7+.

Greenebaum answered 17/11, 2012 at 10:53 Comment(5)
I need the celery logging to go to the console rather than log files. Will that work with this?Coneflower
Sure, streamHandler will write to console.Greenebaum
you could even set a max nr of logfiles, add after maxBytes the property in the dict: 'backupCount': 10,Keheley
what is the reason for adding dictConfig ? wouldn't the LOGGING settings of django suffice ?Begone
Here it says that django uses dictconfig by default docs.djangoproject.com/en/4.1/topics/logging/…Begone
B
12

To fix duplicate logging issue, what worked for me is to set the propagate setting to false when declaring my settings.LOGGING dict

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'verbose'
        },
    },
    'formatters': {
        'verbose': {
            'format': '%(asctime)s %(levelname)s module=%(module)s, '
            'process_id=%(process)d, %(message)s'
        }
    },
    'loggers': {
        'my_app1': {
            'handlers': ['console'],
            'level': 'DEBUG',
            'propagate': False #this will do the trick
        },
        'celery': {
            'handlers': ['console'],
            'level': 'DEBUG',
            'propagate': True
        },
    }
}

lets say your django project layout looks like:
my_project/
- tasks.py
- email.py

and lets say one of your tasks makes a call to some function in email.py; the logging will happen in email.py and then that logging will get propagated to the 'parent' which in this case happens to be your celery task. Thus double logging. But setting propagate to False for a particular logger means that for that logger/app, its logs wont get propagated to the parent, hence their will be no 'double' logging. By default 'propagate' is set to True

Here's a link to the django docs section about that parent/children loggers stuff

Bonnes answered 6/7, 2015 at 17:8 Comment(0)
L
11

Maybe it will help someone, my problem was to send all celery logs to graylog. Here is solution.

celery.py:

app.config_from_object('django.conf:settings', namespace='CELERY')


# ====== Magic starts
from celery.signals import setup_logging

@setup_logging.connect
def config_loggers(*args, **kwargs):
    from logging.config import dictConfig
    from django.conf import settings
    dictConfig(settings.LOGGING)
# ===== Magic ends


# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

settings.py:

LOGGING = {
    'version': 1,
    'handlers': {
        'graypy': {
            'class': 'graypy.GELFTCPHandler',
            'host': GRAYLOG_HOST,
            'port': GRAYLOG_PORT,
        }
    },
    'loggers': {
        'my_project': {
            'handlers': ['graypy'],
            'level': 'INFO',
        },
        # ====== Magic starts
        'celery': {
            'handlers': ['graypy'],
            'level': 'INFO',
        }
        # ===== Magic ends
    }
}
Longstanding answered 2/6, 2020 at 19:34 Comment(0)
U
9

It is troubling that Celery interferes with the root logger (which is not best practice and can't be controlled completely), but it does not disable your app's custom loggers in any way, so use your own handler names and define your own behavior rather than trying to fix this issue with Celery. [I like to keep my application logging separate anyway). You could use separate handlers or the same for Django code and Celery tasks, you just need to define them in your Django LOGGING config. Add formatting args for module, filename, and processName to your formatter for sanity, to help you distinguish where messages originate.

[this assumes you have setup a handler for 'yourapp' in the LOGGING settings value that points to an Appender - sounds like you are aware of this though].

views.py

log = logging.getLogger('yourapp')
def view_fun():
    log.info('about to call a task')
    yourtask.delay()

tasks.py

log = logging.getLogger('yourapp')
@task
def yourtask():
    log.info('doing task')

For the logging that Celery generates - use the celeryd flags --logfile to send Celery output (eg, worker init, started task, task failed) to a separate place if desired. Or, use the other answer here that sends the 'celery' logger to a file of your choosing.

Note: I would not use RotatingFileHandlers - they are not supported for multi-process apps. Log rotation from another tool like logrotate is safer, same goes with logging from Django assuming you have multiple processes there, or the same log files are shared with the celery workers. If your using a multi-server solution you probably want to be logging somewhere centralized anyway.

Used answered 22/11, 2012 at 1:0 Comment(3)
The Celery logging is supposed to log the task name with the messages. Is there a way to get this working? So a log message from another module would include "yourtask" in the log message if it was called from "yourtask"Coneflower
Try log.info("Executing task id %r, args: %r kwargs: %r" % ( yourtask.request.id, yourtask.request.args, yourtask.request.kwargs)) inside your task function.Used
The reason Celery interferes with the root logger is because there are several 3rd party apps/libs out there that configures logging. In many cases this has resulted in no output for users. Celery users still have the option to configure logging manually.Wassyngton

© 2022 - 2024 — McMap. All rights reserved.