Logstash grok filter help - fixed position file

I have a fixed-position (column) file with no delimiter between the fields. Each field has its own start position and length. Here is an example of the data:

520140914191193386---------7661705508623855646---1595852965---133437--the lazy fox jumping over-----------------------212.75.12.85---

I used dashes (-) in the sample above to make the padding visible. In the actual file, any value shorter than the length its field allows in the schema is padded with spaces.

The schema in this case is:

UsedID (start position 1, length 27)
SystemID (start position 28, length 22)
SampleID (start position 50, length 13)
LineID (start position 63, length 8)
Text (start position 71, length 48)
IP (start position 119, length 15)

Ideally, I would get the following field values in Logstash (without the trailing spaces):

UsedID:520140914191193386
SystemID:7661705508623855646
SampleID:1595852965
LineID:133437
Text:the lazy fox jumping over
IP:212.75.12.85

How do I parse this kind of file with grok?

Northcutt answered 14/9, 2014 at 20:37 Comment(0)

I'd go for a two-step process:

  • Split data into fields
  • Strip empty data from end of each field

Since each field has a known length, you can use a regex pattern like .{27} to match them.

In grok, you can name a field like so: (?<user_id>.{27})

You can test a full pattern in the grok debugger, but something like this should achieve a length-based split:

(?<user_id>.{27})(?<system_id>.{22})(?<sample_id>.{13})(?<line_id>.{8})(?<text>.{48})(?<ip>.{15})
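
The widths in that pattern come straight from the schema in the question, so the pattern can also be generated rather than typed by hand. Here is a minimal Python sketch, assuming the schema is kept as a list of (name, start, length) tuples. `SCHEMA` and `build_pattern` are hypothetical names used for illustration and are not part of Logstash:

```python
# Schema from the question: (field name, 1-based start position, length).
SCHEMA = [
    ("user_id", 1, 27),
    ("system_id", 28, 22),
    ("sample_id", 50, 13),
    ("line_id", 63, 8),
    ("text", 71, 48),
    ("ip", 119, 15),
]


def build_pattern(schema):
    """Return a grok-style pattern made of fixed-width named captures."""
    expected_start = 1
    parts = []
    for name, start, length in schema:
        # Each field must begin where the previous one ended; otherwise a
        # plain concatenation of .{n} groups would be misaligned.
        if start != expected_start:
            raise ValueError(f"{name}: expected start {expected_start}, got {start}")
        parts.append(f"(?<{name}>.{{{length}}})")
        expected_start = start + length
    return "".join(parts)


print(build_pattern(SCHEMA))
```

Run as is, this should print the same pattern as the one above. The start-position check only confirms that the fields are contiguous, which is what lets the pattern get away with widths alone.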

You mentioned that your extra characters are all whitespace, so you can clean that up using the mutate filter with a strip option.

All together, that might look something like this:

filter {
    # Step 1: split the line into fixed-width fields.
    grok {
        match => { "message" => "(?<user_id>.{27})(?<system_id>.{22})(?<sample_id>.{13})(?<line_id>.{8})(?<text>.{48})(?<ip>.{15})" }
    }

    # Step 2: strip the space padding from each field.
    mutate {
        strip => [
            "user_id",
            "system_id",
            "sample_id",
            "line_id",
            "text",
            "ip"
        ]
    }
}
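
To sanity-check the widths before deploying the filter, the same split-then-strip logic can be tried outside Logstash. Below is a rough Python sketch, not a Logstash feature. Python spells named groups `(?P<name>...)` rather than the `(?<name>...)` form grok accepts. The sample record is the one from the question, rebuilt with spaces in place of the dashes. `parse_line` is a hypothetical helper name.

```python
import re

# Same widths as the grok pattern, written in Python's named-group syntax.
FIXED_WIDTH = re.compile(
    r"(?P<user_id>.{27})(?P<system_id>.{22})(?P<sample_id>.{13})"
    r"(?P<line_id>.{8})(?P<text>.{48})(?P<ip>.{15})"
)


def parse_line(line):
    """Split a fixed-width record and strip the padding from each field."""
    match = FIXED_WIDTH.match(line)
    if match is None:
        # Lines shorter than 133 characters do not match, for example when
        # trailing spaces were trimmed before the line reached the parser.
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


# Sample record from the question, padded with spaces instead of dashes.
sample = (
    "520140914191193386".ljust(27)
    + "7661705508623855646".ljust(22)
    + "1595852965".ljust(13)
    + "133437".ljust(8)
    + "the lazy fox jumping over".ljust(48)
    + "212.75.12.85".ljust(15)
)

print(parse_line(sample))
```

One thing the sketch makes visible: because every group has a fixed width, a line whose trailing padding has been trimmed no longer matches at all. If the real file might contain such lines, the last group's width would need to be relaxed; whether that applies here depends on how the file is produced.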
Hendley answered 14/9, 2014 at 22:57 Comment(1)
Hi, the provided link to the Grok debugger is broken. Kibana also offers a Grok debugger for Logstash: elastic.co/guide/en/kibana/8.10/xpack-grokdebugger.html (Jammie)
