Selective parsing of csv file using logstash

I am trying to feed data from CSV files into Elasticsearch through Logstash. The first row of each file contains the column names. Is there a way to skip that row while parsing the file? Are there any conditionals or filters I could use so that, if a row raises an exception, Logstash skips to the next row?

My config file looks like this:

input {
    file {
        path => "/home/sagnik/work/logstash-1.4.2/bin/promosms_dec15.csv"
        type => "promosms_dec15"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}
filter {
    csv {
        columns => ["Comm_Plan","Queue_Booking","Order_Reference","Generation_Date"]
        separator => ","
    }
    ruby {
        code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
    }
}
output {
    elasticsearch {
        action => "index"
        host => "localhost"
        index => "promosms-%{+dd.MM.YYYY}"
        workers => 1
    }
}

The first few rows of my CSV file look like this:

"Comm_Plan","Queue_Booking","Order_Reference","Generation_Date"
"","No","FMN1191MVHV","31/03/2014"
"","No","FMN1191N64G","31/03/2014"
"","No","FMN1192OPMY","31/03/2014"

Is there any way I could skip the first line? Also, if my CSV file ends with an empty line, I get an error. How do I skip such empty lines, whether they come at the end of the file or between two rows?

Gereron answered 17/12, 2014 at 7:3 Comment(0)
12

A simple way to do it would be to add the following to your filter (after csv, before ruby):

if [Comm_Plan] == "Comm_Plan" {
  drop { }
}

Assuming the field never normally has the same value as the column heading, this should work as expected. However, you could be more specific by using:

if [Comm_Plan] == "Comm_Plan" and [Queue_Booking] == "Queue_Booking" and [Order_Reference] == "Order_Reference" and [Generation_Date] == "Generation_Date" {
  drop { }
}

This simply checks whether the field value matches the column heading and, if it does, drops the event.
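
For placement, here is a sketch of the question's filter block with the header check added between csv and ruby. The column names and the ruby code are copied from the question unchanged; only the conditional and the comment are new.

```
filter {
    csv {
        columns => ["Comm_Plan","Queue_Booking","Order_Reference","Generation_Date"]
        separator => ","
    }
    # Drop the header row: after csv parsing, its first field holds the column name itself.
    if [Comm_Plan] == "Comm_Plan" {
        drop { }
    }
    ruby {
        code => "event['Generation_Date'] = Date.parse(event['Generation_Date']);"
    }
}
```

Dropping the header before the ruby filter matters here, because otherwise Date.parse would be handed the literal string "Generation_Date".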

Pediculosis answered 17/12, 2014 at 11:4 Comment(2)
Thanks for that. Could you also tell me how to skip empty lines? For instance, if my CSV file ends with a newline or if there are blank rows between two rows, how do I skip them? - Gereron
Hi, that's not something I have looked into previously. I recommend you open a new question about it. - Pediculosis
0

Try stripping any line-break characters from the message and then dropping the event if no message remains:

  mutate {
      gsub => ["message","\r\n",""]
  }
  mutate {
      gsub => ["message","\r",""]
  }
  mutate {
      gsub => ["message","\n",""]
  }
  if ![message] {
      drop { }
  }
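
As an alternative to the gsub steps above, a more compact variant is sketched below. It is untested against the asker's Logstash version. A regex conditional placed before the csv filter drops events whose message is empty or whitespace-only (this also covers a stray carriage return), and a second conditional after csv drops rows that failed to parse. The second check assumes the csv filter tags unparsable events with `_csvparsefailure`, which is its usual behaviour.

```
filter {
    # Skip blank lines (empty, spaces only, or a lone \r) before they reach the csv filter.
    if [message] =~ /^\s*$/ {
        drop { }
    }
    csv {
        columns => ["Comm_Plan","Queue_Booking","Order_Reference","Generation_Date"]
        separator => ","
    }
    # Assumption: rows the csv filter cannot parse carry the _csvparsefailure tag.
    if "_csvparsefailure" in [tags] {
        drop { }
    }
}
```

Checking the raw message before parsing avoids depending on how an empty field is evaluated in a conditional.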
Junction answered 15/10, 2020 at 9:21 Comment(0)
