Logging to both thread-specific and shared log files in Python
Asked Answered
M

5

8

I want to implement per-thread logging in a multithreaded Python application using Python's logging module. I have appended a unique ID to the logger name in the main module (which creates threads):

mylogger = logging.getLogger(str(someInt) + __name__)

This module uses multiple modules which also support logging, but have their loggers initialized like:

mylogger = logging.getLogger(__name__)

Since the callee class doesn't see the logger of caller, the logs of caller are thread-specific but that of callee go in one global file, according to its path.

What can we do to avoid changing passing str(someInt) to each of the other modules and use them unchanged, but still log to thread-specific files?

Malacca answered 25/3, 2014 at 18:31 Comment(0)
T
17

It can be implemented by logging.Filter, here's an example:

import threading
import logging
import logging.config


class ThreadLogFilter(logging.Filter):
    """
    This filter only show log entries for specified thread name
    """

    def __init__(self, thread_name, *args, **kwargs):
        logging.Filter.__init__(self, *args, **kwargs)
        self.thread_name = thread_name

    def filter(self, record):
        return record.threadName == self.thread_name


def start_thread_logging():
    """
    Add a log handler to separate file for current thread
    """
    thread_name = threading.Thread.getName(threading.current_thread())
    log_file = '/tmp/perThreadLogging-{}.log'.format(thread_name)
    log_handler = logging.FileHandler(log_file)

    log_handler.setLevel(logging.DEBUG)

    formatter = logging.Formatter(
        "%(asctime)-15s"
        "| %(threadName)-11s"
        "| %(levelname)-5s"
        "| %(message)s")
    log_handler.setFormatter(formatter)

    log_filter = ThreadLogFilter(thread_name)
    log_handler.addFilter(log_filter)

    logger = logging.getLogger()
    logger.addHandler(log_handler)

    return log_handler


def stop_thread_logging(log_handler):
    # Remove thread log handler from root logger
    logging.getLogger().removeHandler(log_handler)

    # Close the thread log handler so that the lock on log file can be released
    log_handler.close()


def worker():
    thread_log_handler = start_thread_logging()
    logging.info('Info log entry in sub thread.')
    logging.debug('Debug log entry in sub thread.')
    stop_thread_logging(thread_log_handler)


def config_root_logger():
    log_file = '/tmp/perThreadLogging.log'

    formatter = "%(asctime)-15s" \
                "| %(threadName)-11s" \
                "| %(levelname)-5s" \
                "| %(message)s"

    logging.config.dictConfig({
        'version': 1,
        'formatters': {
            'root_formatter': {
                'format': formatter
            }
        },
        'handlers': {
            'console': {
                'level': 'INFO',
                'class': 'logging.StreamHandler',
                'formatter': 'root_formatter'
            },
            'log_file': {
                'class': 'logging.FileHandler',
                'level': 'DEBUG',
                'filename': log_file,
                'formatter': 'root_formatter',
            }
        },
        'loggers': {
            '': {
                'handlers': [
                    'console',
                    'log_file',
                ],
                'level': 'DEBUG',
                'propagate': True
            }
        }
    })


if __name__ == '__main__':
    config_root_logger()

    logging.info('Info log entry in main thread.')
    logging.debug('Debug log entry in main thread.')

    for i in xrange(3):
        t = threading.Thread(target=worker,
                             name='Thread-{}'.format(i),
                             args=[])
        t.start()

Console output:

$ python perThreadLoggingSample.py
2019-03-07 10:29:34,318| MainThread | INFO | Info log entry in main thread.
2019-03-07 10:29:34,320| Thread-0   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,320| Thread-1   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,320| Thread-2   | INFO | Info log entry in sub thread.
$

Check log files:

$ cat /tmp/perThreadLogging.log
2019-03-07 10:29:34,318| MainThread | INFO | Info log entry in main thread.
2019-03-07 10:29:34,319| MainThread | DEBUG| Debug log entry in main thread.
2019-03-07 10:29:34,320| Thread-0   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,320| Thread-1   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,321| Thread-0   | DEBUG| Debug log entry in sub thread.
2019-03-07 10:29:34,320| Thread-2   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,321| Thread-1   | DEBUG| Debug log entry in sub thread.
2019-03-07 10:29:34,321| Thread-2   | DEBUG| Debug log entry in sub thread.
$
$ cat /tmp/perThreadLogging-Thread-0.log
2019-03-07 10:29:34,320| Thread-0   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,321| Thread-0   | DEBUG| Debug log entry in sub thread.
$
$ cat /tmp/perThreadLogging-Thread-1.log
2019-03-07 10:29:34,320| Thread-1   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,321| Thread-1   | DEBUG| Debug log entry in sub thread.
$
$ cat /tmp/perThreadLogging-Thread-2.log
2019-03-07 10:29:34,320| Thread-2   | INFO | Info log entry in sub thread.
2019-03-07 10:29:34,321| Thread-2   | DEBUG| Debug log entry in sub thread.
$
Tiptop answered 7/3, 2019 at 2:38 Comment(3)
Can you go into a bit more detail on this solution? I see that log_file = '/tmp/perThreadLogging-{}.log'.format(thread_name) creates separate log files - how do these end up acquiescing into a single log file?Lacework
It does not end up acquiescing thread logs into a single log file. It write "/tmp/perThreadLogging.log" file and "/tmp/perThreadLogging-Thread-N.log" files at the same time. In config_root_logger() function, there are 2 root log handlers: console writes console screen, log_file writes main log file. When a worker() thread starts, a new log handler FileHandler is created by start_thread_logging(), it only writes current thread's logs to thread log file, at the same time, the 2 root log handlers are still working, they also write thread logs to console screen and the main log file.Tiptop
Is there any way to stop the 'current' thread from writing into the main log file?Orly
E
3

Although, I am not really experienced in what I am suggesting, I would try to use Thread-local storage.

Thread-local storage

A class that represents thread-local data. Thread-local data are data whose values are thread specific. To manage thread-local data, just create an instance of local (or a subclass) and store attributes on it. The instance’s values will be different for separate threads.

You can then save and access the variables in the following way:

import threading
mydata = threading.local()
mydata.x = 1

I suggest to save the thread-specific logger to the thread-local variable when you create it and then access the logger again when necessary through the thread-local storage.

Have a look, it helped me to understand thread-local:

Efik answered 25/3, 2014 at 18:48 Comment(2)
Thanks for the suggestion. That solves one part of the problem, i.e. passing information across the modules. But that would still require changes to the underlying modules to access the thread local variable and then use it as logger object.Malacca
Sorry, I didn't notice the "unchanged" part. Can you specify, what part of the code you can change? Or give us some simplified example, clearly stating which part can be modified.Efik
U
0

I wouldn't recommend a solution which has separate files for separate threads: you can always separate out the information from a common log file if you add thread / threadName to the format string. You can do what you're asking for, using a Filter subclass, as described in this post for a similar (but not identical) use case.

Unilobed answered 25/3, 2014 at 20:47 Comment(2)
Thanks for the suggestion Vinay. But in our current infrastructure, since there are 100's of threads which keep spawning and exiting, it becomes almost impossible to search across log files for 1 thread ID/Name.Malacca
You can separate out the information from a common log file if you add thread / threadName to the format string. How do you do that on a per-thread basis?Lacework
C
-1

A potential solution would be to replace getLogger:

import logging
old_getlogger= logging.getLogger
def my_getlogger(name):
    return old_getlogger( (str(someInt) if name==__name__ else "") + name)
logging.getLogger= my_getlogger

You'll need to make sure someInt has the correct value though, which may be hard, depending on the structure of your code. @Marek 's suggestion of using thread local storage may be useful for that.

Centralize answered 25/3, 2014 at 19:52 Comment(2)
@goncaloppp: The main concern is to make it work without changing the underlining modules.Malacca
This does not change the underlying modules, it simply modifies logging's behaviour at runtime. It's a quick-and-dirty solution, though - if you want to do things properly, the link @VinaySajip provided looks more promisingCentralize
S
-2

As far as I can tell, this is actually an intractable problem in python's logging system. You can kind of hack a solution if you have access to every place a new thread is created and can fiddle with logging controls via logging handlers and filters. But as soon as your code calls some lib/code somewhere that wants to do threading itself, this solution fails, at least in terms of where that lib's threads will log messages to.

I asked a related question with more detail at

How do I log to different files from different threads in python?

and I also posted a comment exposing this issue to the blog by @Vinay Sajip mentioned above. I.e., as far as I can tell (and I could be wrong (!)), his solution does not solve the underlying issue here. Nothing really does unless some high level wizardry involving call stack tracing or somesuch is done to hack away at this basic shortcoming in threads + logging. The multiprocessing package does not solve this issue, either. Each of the Processes is still bound to the same root logger (which in my naivete, seems a bit odd/worrisome, btw.)

There would be a real fix if a thread could be queried for its parent thread, but it cannot.

Shoelace answered 8/1, 2019 at 19:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.