Docker Fluentd Logging Driver For multiline
I am trying to create a centralized logging system for a Docker environment using Fluentd. Currently, I am able to send container logs to Fluentd using the Fluentd Docker logging driver, which is a much cleaner solution compared to reading the Docker log files with the in_tail plugin. However, I am currently facing an issue with multi-line logs.

[Screenshot: lines of multi-line log entries appearing interleaved and out of order in the aggregated output]

As you can see from the picture above, the lines of a multi-line log entry appear out of order, which is very confusing for the user. Is there any way this can be solved?

Thanks.


Prim answered 21/9, 2015 at 2:30 Comment(3)
Just to add some comments on this topic after I did some further research: the out-of-order issue is due to Fluentd's time resolution (no sub-second support at the moment). Thanks to this answer link, I was able to get the records displayed in order, so at least users will not be as confused when reading the log. – Prim
For another solution to the millisecond issue, check this blog post: work.haufegroup.io/log-aggregation/#timestamp-fix – Pavo
Do you have a solution yet? I found this link fluentd.org/guides/recipes/docker-logging about merging multi-line logs in Docker before they are sent to Fluentd, but the implementation is very specific to the log format. – Ames

Using the fluent-plugin-concat plugin helped me fix the above problem.

Add these lines to fluent.conf:

 <filter **>
  @type concat
  key log
  stream_identity_key container_id
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}/
  multiline_end_regexp /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}/
 </filter>

My regular expression checks for a DateTimeStamp in the logs, where each line starts with a date and timestamp (pay attention to "log":"2017-09-21 15:03:27.289" below):

2017-09-21T15:03:27Z    tag     {"container_id":"11b0d89723b9c812be65233adbc51a71507bee04e494134258b7af13f089087f","container_name":"/bel_osc.1.bc1k2z6lke1d7djeq5s28xjyl","source":"stdout","log":"2017-09-21 15:03:27.289  INFO 1 --- [           main] org.apache.catalina.core.StandardEngine  : Starting Servlet Engine: Apache Tomcat/8.5.6"}
2017-09-21T15:03:28Z    tag     {"container_id":"11b0d89723b9c812be65233adbc51a71507bee04e494134258b7af13f089087f","container_name":"/bel_osc.1.bc1k2z6lke1d7djeq5s28xjyl","source":"stdout","log":"2017-09-21 15:03:28.191  INFO 1 --- [ost-startStop-1] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring embedded WebApplicationContext"}
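For context, here is a minimal sketch of how that filter might fit into a complete fluent.conf (this full pipeline is my illustration, not part of the original answer; the forward source on port 24224 is the Docker logging driver's default target, and the stdout match is just for demonstration):

```
# Receive events from the Docker fluentd logging driver (default port)
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

# Join multi-line records per container before they are routed
<filter **>
  @type concat
  key log
  stream_identity_key container_id
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}/
</filter>

# Print the merged events (replace with your real output)
<match **>
  @type stdout
</match>
```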

Also, I had to add the lines below to my Dockerfile to install the plugin:

RUN ["gem", "install", "fluent-plugin-concat", "--version", "2.1.0"] 
#Works with Fluentd v0.14-debian

This regular expression doesn't work well when an exception occurs, but it is still much better than before. See the Fluentd link for reference.
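One possible workaround for the exception case (a sketch of my own, not part of the original answer) is to drop multiline_end_regexp and keep only the start pattern, so continuation lines such as stack-trace frames are appended to the preceding event; fluent-plugin-concat's flush_interval parameter then emits a buffered event when no new first line arrives:

```
<filter **>
  @type concat
  key log
  stream_identity_key container_id
  # Only a start pattern: any non-matching line (e.g. a stack frame)
  # is appended to the previous event instead of starting a new one.
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}/
  flush_interval 5
</filter>
```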

Misquotation answered 19/9, 2017 at 10:44 Comment(0)

Take a look at multiline parsing in their documentation: http://docs.fluentd.org/articles/parser-plugin-overview#

You basically have to specify a regex that matches the beginning of a new log message; that enables fluentd to aggregate multi-line log events into a single message.

Example for a usual java stacktrace from their docs:

format multiline
format_firstline /\d{4}-\d{1,2}-\d{1,2}/
format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
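To make the mechanism concrete, here is a small plain-Python sketch (my illustration, not Fluentd's actual implementation) of what a first-line regex does: a new event starts whenever a line matches the pattern, and every other line, such as a stack-trace frame, is appended to the previous event.

```python
import re

# First-line pattern from the example above: a line beginning with a date.
FIRSTLINE = re.compile(r"\d{4}-\d{1,2}-\d{1,2}")

def aggregate(lines):
    """Group raw lines into events: a line matching FIRSTLINE starts a
    new event; any other line is appended to the current event."""
    events = []
    for line in lines:
        if FIRSTLINE.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

raw = [
    "2015-11-10 12:43:00 [main] ERROR boom",
    "java.lang.RuntimeException: boom",
    "    at com.example.Main.main(Main.java:10)",
    "2015-11-10 12:43:01 [main] INFO recovered",
]
print(aggregate(raw))  # two events: the stack trace stays with line 1
```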

Pavo answered 10/11, 2015 at 12:43 Comment(5)
According to fluentd's docs, "multiline works with only in_tail plugin," which means that when you are using @type forward for input from Docker, this won't work. – Cigarillo
@AshBerlin – You can also use it with multiple plugins; the core plugins that support parsing are described here: docs.fluentd.org/articles/… – Pavo
@AshBerlin – Also, perhaps it is possible to replace the in_forward plugin with in_tcp; they are basically the same thing, except in_forward also listens on UDP. And in_tcp is one of the plugins that supports format parsers out of the box. – Pavo
Ah, I'll give that a go. Knowing that might also help us deal with the case where our containers produce JSON, which Docker puts as a string in the "log" field. – Cigarillo
@AshBerlin I haven't gone that route, but I would suggest you also look into fluent-plugin-parser. You can just pass along all the events coming from the Docker instance and then try to multiline-parse them in a filter before pushing them out. – Pavo

I know this is not an "answer" to the fluentd question, but this guide solves the problem with Logstash: http://www.labouisse.com/how-to/2015/09/14/elk-and-docker-1-8

The guide adds JSON support by adding

    json {
        source => "log_message"
        target => "json"
    }

to the filter after parsing a log line.
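As an illustration of what that json filter accomplishes (a plain-Python sketch of the concept, not Logstash itself): the serialized JSON string held in the log_message field is parsed and stored as a nested structure under a json key.

```python
import json

# Hypothetical event whose "log_message" field holds serialized JSON,
# as produced by a container writing JSON lines to stdout.
event = {"log_message": '{"level": "INFO", "msg": "server started"}'}

# Conceptually what the Logstash json filter does: parse the source
# field and attach the result under the target key.
event["json"] = json.loads(event["log_message"])
print(event["json"]["level"])  # prints INFO
```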

I never found a solution for fluentd, so I went with this solution instead.

Updated link

Penultimate answered 9/2, 2016 at 9:28 Comment(1)
The link is dead now. Could you explain in more detail? – Des
