Insert aggregation results into an index
The goal is to build an Elasticsearch index that holds only the most recent document from each group of related documents, in order to track the current state of various monitoring counters and statuses.

I have crafted a simple Elasticsearch aggregation query:

{
  "size": 0,
  "aggs": {
    "group_by_monitor": {
      "terms": {
        "field": "monitor_name"
      },
      "aggs": {
        "get_latest": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "timestamp": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}

It groups related documents into buckets and selects the most recent document for each bucket.
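
For reference, the interesting part of the response has roughly the following shape (all index names, ids and field values below are purely illustrative); the documents to re-index sit in each bucket's get_latest.hits.hits array:

{
  "took": 3,
  "hits": { "total": 120, "hits": [] },
  "aggregations": {
    "group_by_monitor": {
      "buckets": [
        {
          "key": "cpu_monitor",
          "doc_count": 60,
          "get_latest": {
            "hits": {
              "hits": [
                {
                  "_index": "monitoring",
                  "_id": "Xyz123",
                  "_source": {
                    "monitor_name": "cpu_monitor",
                    "timestamp": "2016-04-08T17:00:00Z",
                    "state": "OK"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}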

Here are the different ideas I had to get the job done:

  1. directly use the aggregation query to push the results into the index, but it does not seem possible: Is it possible to put the results of an ElasticSearch aggregation back into the index?
  2. use the Logstash Elasticsearch input plugin to execute the aggregation query and the Elasticsearch output plugin to push into the index, but it seems the input plugin only looks at the hits field and cannot handle aggregation results: Aggregation Query possible input ES plugin!
  3. use the Logstash http_poller plugin to get a JSON document, but it does not seem to allow specifying a body for the HTTP request!
  4. use the Logstash exec plugin to run cURL commands that fetch the JSON, but this seems quite cumbersome and is my last resort (a sketch follows this list);
  5. use the NEST API to build a small application that polls, extracts the results, cleans them up, and injects the resulting documents into the target index, but I'd like to avoid adding a new tool to maintain.
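
To illustrate option 4, here is a minimal shell sketch. It assumes Elasticsearch listens on localhost:9200, the aggregation query above is saved in a file named agg_query.json (a hypothetical name), jq is installed, and source_index_name / target_index_name are placeholders for the real index names. It extracts the latest document of each bucket and re-indexes it through the _bulk API:

curl -s 'localhost:9200/source_index_name/_search' \
     -H 'Content-Type: application/json' -d @agg_query.json |
# keep only the latest hit of each bucket, emitted as bulk action + source line pairs
jq -c '.aggregations.group_by_monitor.buckets[].get_latest.hits.hits[0]
       | {"index": {"_index": "target_index_name", "_id": ._id}}, ._source' |
curl -s 'localhost:9200/_bulk' \
     -H 'Content-Type: application/x-ndjson' --data-binary @-

Run from cron (or wrapped in the Logstash exec input), this keeps the target index in sync without a dedicated application.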

Is there a reasonably simple way of accomplishing this?

Dare answered 8/4, 2016 at 17:10. Comments (6):
Watcher? – Drizzle
@AndreiStefan Thanks, but AFAIK Watcher won't help for this use case. Moreover, we don't have it deployed on our infrastructure (yet?). For alerting we use ElastAlert, which does the job perfectly too. – Dare
I'm not suggesting Watcher for alerting, but for being able to query the indices at a regular interval, do some basic transformations on the resulting data, and index the results back into Elasticsearch. – Drizzle
@AndreiStefan Thanks for these details. Indeed, Watcher seems a good alternative, but as said before we don't have it yet. :'( – Dare
Hey, I have exactly the same issue. Did you find a way to directly use the aggregation query to push the results into the index, or a workaround? Thanks – Ghost
@OvidiuRudi Not a direct way; I had to build a dedicated C# program to do the plumbing. – Dare

Edit the logstash.conf file as follows:

input {
  elasticsearch {
    hosts   => "localhost"           # Elasticsearch instance to read from
    index   => "source_index_name"   # index to query
    type    => "index_type"          # type tag added to each event
    query   => '{Query}'             # the search query to execute, as a JSON string
    size    => 500                   # documents fetched per scroll page
    scroll  => "5m"                  # how long each scroll context is kept alive
    docinfo => true                  # copy _index, _type and _id into @metadata
  }
}

output {
  elasticsearch {
    index       => "target_index_name"     # index to write to
    document_id => "%{[@metadata][_id]}"   # keep the original document id
  }
}
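
Replace the '{Query}' placeholder with the actual query body, then start the pipeline the usual way, for example:

bin/logstash -f logstash.conf
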
Squeeze answered 6/2, 2017 at 10:45. Comments (6):
Is it working now thanks to a fix in Logstash? Because at the time of the question, Logstash could not handle aggregations. – Dare
Yup, it's working; I tried it just yesterday on ELK (5.1.1). – Squeeze
OK, I trust you, you get my +1. :) – Dare
I don't understand your answer. I've tested it on Elasticsearch 5.1.1 with Logstash 5.6.1 and it doesn't work. Where are you telling Logstash to use the aggregation results instead of the 'hits' array? – Stoneblind
For this, I have used Elasticsearch 5.5. – Squeeze
@AkshayPatil It's not a problem with the Elasticsearch version. As you can see from the source code, Logstash simply scrolls through the hits array, so it cannot read the aggregations, which sit in the aggregations array of the query response. The configuration you posted simply copies every document matching the query from source_index_name to target_index_name, completely ignoring the aggregation values. – Stoneblind
