Multiline log records in syslog
Asked Answered
F

3

39

So I've configured my Python application to log to syslog with Python's SysLogHandler, and everything works fine. Except for multi-line handling. Not that I need to emit multiline log records so badly (I do a little), but I need to be able to read Python's exceptions. I'm using Ubuntu with rsyslog 4.2.0. This is what I'm getting:

Mar 28 20:11:59 telemachos root: ERROR 'EXCEPTION'#012Traceback (most recent call last):#012  File "./test.py", line 22, in <module>#012    foo()#012  File "./test.py", line 13, in foo#012    bar()#012  File "./test.py", line 16, in bar#012    bla()#012  File "./test.py", line 19, in bla#012    raise Exception("EXCEPTION!")#012Exception: EXCEPTION!

Test code in case you need it:

import logging
from logging.handlers import SysLogHandler

logger = logging.getLogger()
logger.setLevel(logging.INFO)
syslog = SysLogHandler(address='/dev/log', facility='local0')
formatter = logging.Formatter('%(name)s: %(levelname)s %(message)r')
syslog.setFormatter(formatter)
logger.addHandler(syslog)

def foo():
    bar()

def bar():
    bla()

def bla():
    raise Exception("EXCEPTION!")

try:
    foo()
except:
    logger.exception("EXCEPTION")
Faggoting answered 28/3, 2011 at 19:26 Comment(0)
F
36

OK, figured it out finally...

rsyslog by default escapes all weird characters (ASCII < 32), and this include newlines (as well as tabs and others).

$EscapeControlCharactersOnReceive:

This directive instructs rsyslogd to replace control characters during reception of the message. The intent is to provide a way to stop non-printable messages from entering the syslog system as whole. If this option is turned on, all control-characters are converted to a 3-digit octal number and be prefixed with the $ControlCharacterEscapePrefix character (being ‘\’ by default). For example, if the BEL character (ctrl-g) is included in the message, it would be converted to “\007”.

You can simply add this to your rsyslog config to turn it off:

$EscapeControlCharactersOnReceive off

or, with the "new" advanced syntax:

global(parser.escapeControlCharactersOnReceive="off")
Faggoting answered 4/4, 2011 at 20:46 Comment(6)
This will probably work for your case, but if you're writing a high-volume concurrent application, @Nick's answer will make it much easier to find all lines that go with your stacktrace. Your solution will lose the application name and level name for all lines after the first, making searching difficult.Delius
@Delius You won't lose any information it just won't be on the same line.Allusion
By "lost" I meant, in this case, "unable to reconstruct". In a high volume application, splitting your log's meaning across multiple lines often makes it impossible to reassemble as the lines from multiple requests become interleaved.Delius
@Delius How could they become interleaved? I was toying with the idea of indenting every continuation line with 4 spaces. That way all lines starting with a non-whitespace character would be a log line, and every line starting with whitespace would be a continuation of the most recent log line. Additionally it would be very readable. But that wouldn't work if there's a chance records could become interleaved (?)Pismire
In general you are sending logs from multiple threads/processes/nodes to a smaller number of collectors. In such a case the collectors must interleave the logs somehow (readline, round robin? read until no more data, round robin?).Delius
Don't EVER do this if anyone else ever uses the system. This completely breaks the expectation that log entries in syslog are contained in a single line. Do this any any system I work on and I'll gather up every other admin for that system, we'll go get torches and pitchforks, and come by to tell you to revert the change. Every 10 minutes with an additional layer of management each time for all eternity until you do. You do not do something like this to critical infrastructure on systems other people use.Debauch
C
39

Alternatively, if you want to keep your syslog intact on one line for parsing, you can just replace the characters when viewing the log.

tail -f /var/log/syslog | sed 's/#012/\n\t/g'
Coyne answered 26/4, 2012 at 17:50 Comment(1)
sed does not work, get "blah blah line1 nt blah blah line2". However perl does work: tail -f /var/log/syslog | perl -pe 's/#012/\n\t/g;'Bibi
F
36

OK, figured it out finally...

rsyslog by default escapes all weird characters (ASCII < 32), and this include newlines (as well as tabs and others).

$EscapeControlCharactersOnReceive:

This directive instructs rsyslogd to replace control characters during reception of the message. The intent is to provide a way to stop non-printable messages from entering the syslog system as whole. If this option is turned on, all control-characters are converted to a 3-digit octal number and be prefixed with the $ControlCharacterEscapePrefix character (being ‘\’ by default). For example, if the BEL character (ctrl-g) is included in the message, it would be converted to “\007”.

You can simply add this to your rsyslog config to turn it off:

$EscapeControlCharactersOnReceive off

or, with the "new" advanced syntax:

global(parser.escapeControlCharactersOnReceive="off")
Faggoting answered 4/4, 2011 at 20:46 Comment(6)
This will probably work for your case, but if you're writing a high-volume concurrent application, @Nick's answer will make it much easier to find all lines that go with your stacktrace. Your solution will lose the application name and level name for all lines after the first, making searching difficult.Delius
@Delius You won't lose any information it just won't be on the same line.Allusion
By "lost" I meant, in this case, "unable to reconstruct". In a high volume application, splitting your log's meaning across multiple lines often makes it impossible to reassemble as the lines from multiple requests become interleaved.Delius
@Delius How could they become interleaved? I was toying with the idea of indenting every continuation line with 4 spaces. That way all lines starting with a non-whitespace character would be a log line, and every line starting with whitespace would be a continuation of the most recent log line. Additionally it would be very readable. But that wouldn't work if there's a chance records could become interleaved (?)Pismire
In general you are sending logs from multiple threads/processes/nodes to a smaller number of collectors. In such a case the collectors must interleave the logs somehow (readline, round robin? read until no more data, round robin?).Delius
Don't EVER do this if anyone else ever uses the system. This completely breaks the expectation that log entries in syslog are contained in a single line. Do this any any system I work on and I'll gather up every other admin for that system, we'll go get torches and pitchforks, and come by to tell you to revert the change. Every 10 minutes with an additional layer of management each time for all eternity until you do. You do not do something like this to critical infrastructure on systems other people use.Debauch
F
4

Another option would be to subclass the SysLogHandler and override emit() - you could then call the superclass emit() for each line in the text you're sent. Something like:

from logging import LogRecord
from logging.handlers import SysLogHandler

class MultilineSysLogHandler(SysLogHandler):
    def emit(self, record):
        if '\n' in record.msg:
            record_args = [record.args] if isinstance(record.args, dict) else record.args
            for single_line in record.msg.split('\n'):
                single_line_record = LogRecord(
                    name=record.name,
                    level=record.levelno,
                    pathname=record.pathname,
                    msg=single_line,
                    args=record_args,
                    exc_info=record.exc_info,
                    func=record.funcName
                )
                super(MultilineSysLogHandler, self).emit(single_line_record)
        else:
            super(MultilineSysLogHandler, self).emit(record)
Fionafionna answered 2/6, 2016 at 8:54 Comment(1)
level= should be levelno= and lineno=record.lineno is missing. However, if the message contains %s etc. after the \n, parameter substitution won't work correctly. This code would need to fully parameterize the message prior to splitting, but only if the record.levelno passes the filter.Troostite

© 2022 - 2024 — McMap. All rights reserved.