Using Logstash, my goal is to index a document if it hasn't been indexed before. Otherwise, if the document already exists and the timestamp is not yet in its timestamp array, I want to append the timestamp to that array. My problem is that an array is appended to the array instead of a plain value.
That is, my input log line is always the same except for the timestamp, which I want to append to the same document in Elasticsearch.
Here is my input data.
- Notice that timestamp is a string.
- The "hash" field will become the document id (for this example only).
{"timestamp":"1534023333", "hash":"1"}
{"timestamp":"1534022222", "hash":"1"}
{"timestamp":"1534011111", "hash":"1"}
Here is my Logstash config:
- The timestamp field is split, which turns it into an array.
- The first time the document is seen, it is indexed. The next time it is seen, the script runs.
- The script checks whether the timestamp value is already present and, if not, appends it.
- params.event.get is used rather than interpolating the value into the script, because it prevents a dynamic script compilation for each event.
input {
  file {
    path => "timestamp.json"
    start_position => "beginning"
    codec => "json"
  }
}
filter {
  mutate {
    split => { "timestamp" => "," }
  }
}
output {
  elasticsearch {
    hosts => ["http://127.0.0.1:9200"]
    index => "test1"
    document_id => "%{[hash]}"
    doc_as_upsert => true
    script => 'if(ctx._source.timestamp.contains(params.event.get("timestamp"))) return true; else (ctx._source.timestamp.add(params.event.get("timestamp")))'
    action => "update"
    retry_on_conflict => 3
  }
  #stdout { codec => rubydebug }
}
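To make the filter step concrete, here is a rough Python sketch of what I understand the mutate split to do to each event. It is a simulation under my own assumptions, not Logstash code, and the helper name `split_field` is hypothetical. The point is that a string with no comma in it becomes a one-element list.

```python
import json


def split_field(event, field, separator):
    """Mimic the split step: replace a string field with a list of substrings.

    Hypothetical helper for illustration only; it is not part of Logstash.
    """
    value = event.get(field)
    if isinstance(value, str):
        event[field] = value.split(separator)
    return event


# The three input lines from the question.
lines = [
    '{"timestamp":"1534023333", "hash":"1"}',
    '{"timestamp":"1534022222", "hash":"1"}',
    '{"timestamp":"1534011111", "hash":"1"}',
]

events = [split_field(json.loads(line), "timestamp", ",") for line in lines]
for event in events:
    print(event)
```

So by the time an event reaches the output, I expect its timestamp to be a list holding a single string rather than the string itself.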
Here is the output.
Notice that timestamp is an array, but each later value is appended to it as its own array.
"timestamp": [
  "1534011111",
  [
    "1534022222"
  ],
  [
    "1534023333"
  ]
],
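I can reproduce the same nested shape with a small Python simulation of the upsert logic. This is only a sketch of my mental model: `apply_update`, the in-memory `store`, and the `unwrap` flag are hypothetical names, and Python's `append` stands in for the list `add` call in the script. The events are fed in the order that matches the output above.

```python
def apply_update(store, doc_id, event, unwrap=False):
    """Simulate an update with doc_as_upsert plus the contains/add script.

    Hypothetical model for illustration; it is not Elasticsearch or Painless.
    """
    if doc_id not in store:
        # First sighting: the event itself is indexed and the script does not run.
        store[doc_id] = {"timestamp": list(event["timestamp"]), "hash": event["hash"]}
        return
    current = store[doc_id]["timestamp"]
    incoming = event["timestamp"]  # a one-element list after the split
    # Assumption: unwrap=True models reading the first element instead of the list.
    value = incoming[0] if unwrap else incoming
    if value not in current:
        current.append(value)  # adds `value` as ONE element, even if it is a list


def run(unwrap):
    store = {}
    for ts in ["1534011111", "1534022222", "1534023333"]:
        apply_update(store, "1", {"timestamp": [ts], "hash": "1"}, unwrap=unwrap)
    return store["1"]["timestamp"]


nested = run(unwrap=False)
flat = run(unwrap=True)
print(nested)  # ['1534011111', ['1534022222'], ['1534023333']]
print(flat)    # ['1534011111', '1534022222', '1534023333']
```

In this model the nesting comes from adding the one-element list itself, and adding its first element gives the flat list. I don't know whether the script in my config behaves the same way, so treat this as a guess at the cause rather than a confirmed diagnosis.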
The output I want is:
"timestamp": [
"1534011111",
"1534022222",
"1534023333"
],
How do I get the desired output? I'm running Elasticsearch 6.4.2 and Logstash 6.4.2.