ElasticSearch - How to display an additional field name in aggregation query
V

6

29

How can I add a new key called 'agency_name' to each bucket in my output?

I am running the aggregation shown below:

{
  "aggs": {
    "name": {
      "terms": {
        "field": "agency_code"
      }
    }
  }
}

I get the following output:

"aggregations": {
    "name": {
        "doc_count_error_upper_bound": 130,
        "sum_other_doc_count": 39921,
        "buckets": [
            {
                "key": "1000",
                "doc_count": 105163
            },
            {
                "key": "2100",
                "doc_count": 43006
            }
        ]
    }
}

When displaying the results, I need to show the agency name, the agency code and doc_count.

How can I modify the aggregation query to get the format below? I am new to ElasticSearch and not sure how to do this.

"aggregations": {
    "name": {
        "doc_count_error_upper_bound": 130,
        "sum_other_doc_count": 39921,
        "buckets": [
            {
                "key": "1000",
                "doc_count": 105163,
                "agency_name": "Agent 1"
            },
            {
                "key": "2100",
                "doc_count": 43006,
                "agency_name": "Agent 2"
            }
        ]
    }
}

Sample Data in ElasticSearch (fields are analysed)

{
    "_index": "feeds",
    "_type": "news",
    "_id": "22005",
    "_version": 1,
    "_score": 1,
    "_source": {
        "id": 22005,
        "name": "Test News",
        "agency_name": "Agent 1",
        "agency_code": "1000"
    }
}
Vaulted answered 30/7, 2015 at 10:50 Comment(0)
N
20

You can use the top hits aggregation like in the link below. The format will be slightly different since creating the extra aggregation will embed the agency name under another 'hits' key.

Adding additional fields to ElasticSearch terms aggregation

{
  "aggs": {
    "name": {
      "terms": {
        "field": "agency_code"
      },
      "aggs": {
        "agency_names": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": ["agency_name"]
            }
          }
        }
      }
    }
  }
}
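To get closer to the flat format asked for in the question, the nested hit can be copied up to the bucket level on the client. This is only a minimal sketch: the helper name `flatten_buckets` and the sample response are made up for illustration, and the code assumes the response nests the document under `hits.hits[0]._source` inside the `agency_names` sub-aggregation.

```python
def flatten_buckets(response, agg_name="name", hits_name="agency_names"):
    """Copy agency_name from each bucket's top hit up to the bucket level."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        # The query asks for size 1, so there is at most one hit per bucket.
        source = hits[0].get("_source", {}) if hits else {}
        rows.append({
            "key": bucket["key"],
            "doc_count": bucket["doc_count"],
            "agency_name": source.get("agency_name"),
        })
    return rows


# Hand-written sample shaped like a terms + top_hits response (not real output).
sample_response = {
    "aggregations": {
        "name": {
            "buckets": [
                {
                    "key": "1000",
                    "doc_count": 105163,
                    "agency_names": {
                        "hits": {"hits": [{"_source": {"agency_name": "Agent 1"}}]}
                    },
                },
                {
                    "key": "2100",
                    "doc_count": 43006,
                    "agency_names": {
                        "hits": {"hits": [{"_source": {"agency_name": "Agent 2"}}]}
                    },
                },
            ]
        }
    }
}

rows = flatten_buckets(sample_response)
print(rows)
```

A bucket whose top hit is missing or has no `_source` gets `agency_name` set to `None` rather than raising.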
Nynorsk answered 20/8, 2016 at 2:16 Comment(1)
This solution works great. Just one question: does it make any difference to performance if we use docvalue_fields instead of _source in the above query? I'm aware that they are two different things; I'm just asking in terms of performance. - Propensity
C
2

I think you would need to add another "aggs" to it. The result would not be in the format you want, though; the name would appear as another field in the output. The reason is that you are currently aggregating on "agency_code", and doc_count shows how many times each agency code occurs. If you also aggregate on "agency_name", that field might occur in different documents than "agency_code", and in different numbers. If they always exist as a pair, then parent-child indexing might be of some help.

https://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-parent-child.html

Conal answered 30/7, 2015 at 11:34 Comment(3)
Looks like I don't need to aggregate on agency_name. What I was looking for is a way to add custom fields to an aggregation result. I have added a sample record to show how the data is kept in Elasticsearch. - Vaulted
Hi @AmalKumarS, may I know how you solved this problem? - Reviviscence
@Reviviscence I was not able to get the agency_name from the aggregation result. What I did was run a separate query to get a mapping from agency code to agency name. While displaying the aggregated result, I used this mapping to show the agency name :( - Vaulted
S
1

ES has no way of knowing agency_name and agency_code map one-to-one. Therefore I would recommend a number of possible strategies.

  • Don't analyze agency_name and use the term agg over that field. I would be surprised if you actually need to do tokenization of the agency_name.
  • Store the id to name mapping in a relational database or a flat file cache and do the join client side
  • Store the agency documents as another type and make two calls. The first to get the ids and then a second to lookup the agencies by id

As Aditya Patel mentioned above, parent child relationships may help out as well but I believe you will still have to use one of the above strategies to resolve the id->name mapping.
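The client-side join from the second bullet can be sketched roughly as follows. The `names_by_code` dictionary stands in for whatever the relational database or flat-file cache would return, and `attach_names` is a hypothetical helper, not part of any Elasticsearch client.

```python
def attach_names(buckets, names_by_code, default=None):
    """Return copies of the terms buckets with an agency_name field added."""
    return [
        {**bucket, "agency_name": names_by_code.get(bucket["key"], default)}
        for bucket in buckets
    ]


# Assumed lookup table; in practice this would be loaded from the database or cache.
names_by_code = {"1000": "Agent 1", "2100": "Agent 2"}

# Buckets as returned by the terms aggregation in the question.
buckets = [
    {"key": "1000", "doc_count": 105163},
    {"key": "2100", "doc_count": 43006},
]

labelled = attach_names(buckets, names_by_code)
print(labelled)
```

Codes that are missing from the lookup table get the `default` value, so a stale cache shows up as a blank name instead of an exception.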

Secondguess answered 31/7, 2015 at 13:7 Comment(1)
I stored the id to name mapping in a relational database and did the join on the client side as a fix. - Vaulted
M
1

This is an old post, but I ran into the same issue and followed the approach described at https://www.elastic.co/guide/en/elasticsearch/reference/current/agg-metadata.html. Add the details as metadata on the aggregation and they are returned as part of the result, at the aggregation level above the buckets. Hope it helps someone in the future.

Mouthpart answered 15/9, 2017 at 7:17 Comment(1)
This is not what he asked. - Pyramid
H
0

What I do is use something like the following query:

"aggs" : {
    "products" : {
      "filter" : { "term": { "item.category": "children" }},
      "aggs" : {
        "count" : {
          "terms" : {
            "script": "doc['item.id'].value + ':' + doc['item.name'].value"
          }
        }
      }
    }
  }

Which returns something like this:

...
"aggregations" : {
    "products" : {
      "doc_count" : 1050,
      "count" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [
          {
            "key" : "x2_90QBj9k:Baby Oil",
            "doc_count" : 45
          },
          ...
        ]
...

And then I can use a string operation on bucket[i]["key"], for each i in a loop, to extract the relevant field.
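A minimal sketch of that string operation, assuming the ':' separator used in the script above never occurs inside the id itself (`split_bucket_key` is a made-up helper name):

```python
def split_bucket_key(bucket, separator=":"):
    """Split a composite 'id:name' bucket key back into its two parts."""
    # partition splits on the first separator only, so the name may contain ':'.
    item_id, _, item_name = bucket["key"].partition(separator)
    return {"id": item_id, "name": item_name, "doc_count": bucket["doc_count"]}


# Bucket copied from the sample output above.
buckets = [{"key": "x2_90QBj9k:Baby Oil", "doc_count": 45}]

items = [split_bucket_key(b) for b in buckets]
print(items)
```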

Herculie answered 2/5, 2020 at 7:32 Comment(0)
B
0

This might be a bit old, but from ES 7.12 onwards there is a multi_terms aggregation. Your query should look something like this:

{
  "aggs": {
    "group_by_agency": {
      "multi_terms": {
        "terms": [
          {"field": "agency_code"},
          {"field": "agency_name"}
        ]
      }
    }
  }
}
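Each multi_terms bucket carries its key as a list with one entry per listed field, in the same order as in the query. A rough sketch of turning that into the rows the question asks for; the sample response is hand-written for illustration and `rows_from_multi_terms` is a hypothetical helper:

```python
def rows_from_multi_terms(response, agg_name="group_by_agency"):
    """Unpack the [agency_code, agency_name] key of each multi_terms bucket."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # Key order follows the order of the fields in the query.
        code, name = bucket["key"]
        rows.append({
            "agency_code": code,
            "agency_name": name,
            "doc_count": bucket["doc_count"],
        })
    return rows


# Hand-written sample shaped like a multi_terms response (not real output).
sample_response = {
    "aggregations": {
        "group_by_agency": {
            "buckets": [
                {"key": ["1000", "Agent 1"], "key_as_string": "1000|Agent 1", "doc_count": 105163},
                {"key": ["2100", "Agent 2"], "key_as_string": "2100|Agent 2", "doc_count": 43006},
            ]
        }
    }
}

agency_rows = rows_from_multi_terms(sample_response)
print(agency_rows)
```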
Bibliotherapy answered 23/7, 2024 at 10:50 Comment(0)
