The goal is to build an Elasticsearch index that holds only the most recent document from each group of related documents, in order to track the current state of some monitoring counters and states.
I have crafted a simple Elasticsearch aggregation query:
{
  "size": 0,
  "aggs": {
    "group_by_monitor": {
      "terms": {
        "field": "monitor_name"
      },
      "aggs": {
        "get_latest": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "timestamp": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
It groups related documents into buckets and selects the most recent document in each bucket.
Here are the different ideas I had to get the job done:
- directly use the aggregation query to push the results into the index, but it does not seem possible: Is it possible to put the results of an ElasticSearch aggregation back into the index?
- use the Logstash Elasticsearch input plugin to execute the aggregation query and the Elasticsearch output plugin to push the results into the index, but it seems the input plugin only looks at the hits field and cannot handle aggregation results (see the first pipeline sketch after this list): Aggregation Query possible input ES plugin !
- use the Logstash http_poller plugin to fetch the JSON document, but it does not seem to allow specifying a body for the HTTP request!
- use the Logstash exec plugin to run cURL commands that fetch the JSON, but this seems quite cumbersome and would be my last resort (the second sketch after this list shows what I have in mind).
- use the NEST API to build a small application that polls, extracts the results, cleans them up, and injects the resulting documents into the target index, but I'd like to avoid adding a new tool to maintain.
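For reference, here is roughly the Logstash pipeline I tried for the second idea (the host, the source index "monitoring" and the target index "monitoring-latest" are placeholders). It runs the query, but because the query uses "size": 0 and the input plugin only forwards documents found under hits, the aggregations section never reaches the output:

input {
  elasticsearch {
    hosts => ["localhost:9200"]   # placeholder host
    index => "monitoring"         # placeholder source index
    # the aggregation query shown above, as a one-line JSON string
    query => '{"size":0,"aggs":{"group_by_monitor":{"terms":{"field":"monitor_name"},"aggs":{"get_latest":{"top_hits":{"size":1,"sort":[{"timestamp":{"order":"desc"}}]}}}}}}'
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "monitoring-latest"  # placeholder target index
  }
}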
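And this is the kind of pipeline I imagine for the exec/cURL idea, as a rough sketch only: the host, index names and the query.json file are placeholders, and the flattening of the top_hits payload into an indexable document still has to be worked out:

input {
  exec {
    # the aggregation query body is assumed to be saved in query.json
    command => "curl -s -XPOST 'http://localhost:9200/monitoring/_search' -H 'Content-Type: application/json' -d @query.json"
    interval => 60
    codec => "json"
  }
}
filter {
  # turn each terms bucket into its own event
  split {
    field => "[aggregations][group_by_monitor][buckets]"
  }
  # something (e.g. a ruby filter) still has to promote
  # [aggregations][group_by_monitor][buckets][get_latest][hits][hits][0][_source]
  # to the top level before the event is worth indexing
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "monitoring-latest"  # placeholder target index
  }
}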
Is there a way to accomplish this with reasonable complexity?