Why is an array adding to an array with painless script?

Using Logstash, my goal is to index the document if it hasn't been indexed before. Otherwise, if the document does exist and the timestamp is not in its timestamp array, the timestamp should be appended to that array. My problem is that an array, rather than a string, is appended to the array.

That is, my input log line is always the same except for the timestamp, which I want to append to the same document in Elasticsearch.

Here is my input data.

  • Notice that timestamp is a string.
  • The "hash" field will become the document id (for example only)

    {"timestamp":"1534023333", "hash":"1"}
    {"timestamp":"1534022222", "hash":"1"}
    {"timestamp":"1534011111", "hash":"1"}
    

Here is my Logstash config:

  • The timestamp field is split which turns it into an array.
  • The first time the document is seen, it is indexed. The next time it is seen, the script runs.
  • The script looks to see if the timestamp value is present and if not, append.
  • params.event.get is used because it prevents a dynamic script compilation

    input {
      file {
        path => "timestamp.json"
        start_position => "beginning"
        codec => "json"
      }
    }
    
    filter {
        mutate {
            split => { "timestamp" => "," }
        }
    }
    
    output {
      elasticsearch {
        hosts => ["http://127.0.0.1:9200"]
        index => "test1"
        document_id => "%{[hash]}"
        doc_as_upsert => true
        script =>     'if(ctx._source.timestamp.contains(params.event.get("timestamp"))) return true; else (ctx._source.timestamp.add(params.event.get("timestamp")))'
        action => "update"
        retry_on_conflict=>3
    
      }
      #stdout { codec => rubydebug }
    }
    

Here is the output.

  • Notice that timestamp is an array, but each new value is appended to it as a nested array.

     "timestamp": [
          "1534011111",
          [
            "1534022222"
          ],
          [
            "1534023333"
          ]
        ],
    

What I desire is the output to be:

     "timestamp": [
          "1534011111",
          "1534022222",
          "1534023333"
        ],

How do I get the desired output? I'm running Elasticsearch 6.4.2 and Logstash 6.4.2.

Barnwell answered 4/11, 2018 at 13:25 Comment(0)

The problem is that split => { "timestamp" => "," } turns the timestamp field into an array, and the add method appends whatever object it is given to the original array as a single element (it does not concatenate two arrays). So each update adds a one-element array instead of a string.
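
The nesting can be reproduced outside Elasticsearch. The sketch below is a Python analogy, not Painless: `list.append` stands in for the list's `add` method and `in` stands in for `contains`. The helper name `buggy_update` is hypothetical, and the sketch assumes that each event arrives with `timestamp` already split into a one-element list and that the first event is stored unchanged by `doc_as_upsert`.

```python
def buggy_update(source, event):
    """Mimic the original script: test and append the whole event list."""
    if event["timestamp"] in source["timestamp"]:
        return source
    # The argument is itself a list, so it becomes a nested element.
    source["timestamp"].append(event["timestamp"])
    return source


# Assumption: the first event is upserted as the document itself.
doc = {"timestamp": ["1534011111"], "hash": "1"}

# Later events for the same document id run through the script logic.
for ts in ("1534022222", "1534023333"):
    doc = buggy_update(doc, {"timestamp": [ts], "hash": "1"})

print(doc["timestamp"])
```

The printed list has the same shape as the output in the question: one plain string followed by one-element lists.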

In Painless, access the first element of the event's timestamp array instead of passing the whole array, like this:

    if(ctx._source.timestamp.contains(params.event.get("timestamp")[0])) return true; else (ctx._source.timestamp.add(params.event.get("timestamp")[0]))
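
The effect of indexing with `[0]` can be checked with the same kind of Python analogy (again not Painless; `fixed_update` is a hypothetical helper, and the one-element `timestamp` list per event is an assumption based on the split filter in the question). Because a plain string is now tested and appended, the list stays flat and a repeated timestamp is skipped.

```python
def fixed_update(source, event):
    """Mimic the corrected script: use the first element, not the list."""
    value = event["timestamp"][0]
    if value not in source["timestamp"]:
        source["timestamp"].append(value)  # appends a string, no nesting
    return source


# Assumption: the first event is upserted as the document itself.
doc = {"timestamp": ["1534011111"], "hash": "1"}

# The last event repeats a timestamp to exercise the membership check.
for ts in ("1534022222", "1534023333", "1534022222"):
    doc = fixed_update(doc, {"timestamp": [ts], "hash": "1"})

print(doc["timestamp"])
```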

Veii answered 5/11, 2018 at 14:0 Comment(1)
So that is how you access the item in the array! I updated my Logstash config and verified that your answer does provide the desired output. Thanks! - Barnwell
