Limit and Offset in Terms Aggregation in Elasticsearch

There is a way to get the top n terms results. For example:

{
  "aggs": {
    "apiSalesRepUser": {
      "terms": {
        "field": "userName",
        "size": 5
      }
    }
  }
}

Is there any way to set the offset for the terms result?

Stonemason answered 2/4, 2015 at 12:11 Comment(2)
elastic.co/guide/en/elasticsearch/reference/1.4/… Maybe using from? (elastic.co/guide/en/elasticsearch/reference/current/… inside an aggregation) - Haplology
@Haplology That does not apply to aggregations. It only applies to the hits returned. - Bledsoe

If you mean skipping the first m results and returning the next n, then no, that is not possible. A workaround is to set size to m + n and drop the first m buckets on the client side.
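The workaround can be sketched in Python as below. This is only an illustration: the function names are hypothetical, the aggregation name and field are taken from the question, and the top-level "size": 0 (which suppresses the hits) is an assumption that only the buckets matter here.

```python
from typing import Any, Dict, List

AGG_NAME = "apiSalesRepUser"  # aggregation name taken from the question


def terms_query(field: str, offset: int, limit: int) -> Dict[str, Any]:
    """Build a search body that requests offset + limit term buckets."""
    return {
        "size": 0,  # assumption: hits are not needed, only the aggregation
        "aggs": {
            AGG_NAME: {
                "terms": {"field": field, "size": offset + limit}
            }
        },
    }


def page_of_buckets(response: Dict[str, Any], offset: int, limit: int) -> List[Dict[str, Any]]:
    """Drop the first `offset` buckets of the response and keep the next `limit`."""
    buckets = response["aggregations"][AGG_NAME]["buckets"]
    return buckets[offset:offset + limit]
```

The body returned by terms_query would be sent to the _search endpoint with whatever client you use, and page_of_buckets applied to the parsed JSON response. Note that the whole m + n buckets still travel over the network.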

Bledsoe answered 2/4, 2015 at 13:39 Comment(0)

A little late, but since (at least) Elasticsearch 5.2.0 you can use partitioning in the terms aggregation to paginate results.

https://www.elastic.co/guide/en/elasticsearch/reference/5.2/search-aggregations-bucket-terms-aggregation.html#_filtering_values_with_partitions
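As a rough sketch of what such a request could look like, the helper below builds a terms aggregation with the include.partition / include.num_partitions options described on the linked page. The function name and the values in the usage note are made up for illustration; the aggregation name and field come from the question.

```python
from typing import Any, Dict


def partitioned_terms_query(field: str, partition: int,
                            num_partitions: int, size: int) -> Dict[str, Any]:
    """Build a search body that asks for one partition of the terms."""
    if not 0 <= partition < num_partitions:
        raise ValueError("partition must be in [0, num_partitions)")
    return {
        "size": 0,  # assumption: hits are not needed, only the aggregation
        "aggs": {
            "apiSalesRepUser": {
                "terms": {
                    "field": field,
                    # terms are split into num_partitions groups;
                    # partition numbers start at 0
                    "include": {
                        "partition": partition,
                        "num_partitions": num_partitions,
                    },
                    # should be large enough to hold every term of one partition
                    "size": size,
                }
            }
        },
    }
```

For example, partitioned_terms_query("userName", 0, 20, 1000) requests the first of 20 partitions; repeating the call with partition 1 to 19 walks through the rest.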

Shagbark answered 16/1, 2020 at 6:32 Comment(1)
Partitioning will break sorting, if you need it. For example, if you're sorting sold items by count, a more popular item may sit in partition 2 while you're fetching partition 1, so you will have to fetch all partitions and then sort in your application. If you don't care about that, partitioning is great! - Ophthalmology

Maybe this helps a bit. Nest a bucket_sort aggregation inside the terms aggregation, and set the terms size to a value large enough to cover from + size:

{
  "aggregations": {
    "apiSalesRepUser": {
      "terms": {
        "field": "userName",
        "size": 9999
      },
      "aggregations": {
        "limitBucket": {
          "bucket_sort": {
            "sort": [],
            "from": 10,
            "size": 20,
            "gap_policy": "skip"
          }
        }
      }
    }
  }
}

I am not sure what value to put in the terms size; I would suggest a reasonable upper bound. It limits the initial aggregation, and the nested limitBucket aggregation then trims those buckets again. Elasticsearch will probably still hold in memory all the buckets that the terms size allows, so whether this works depends on your scenario. It is reasonable when you do not need every result (for example, when there are tens of thousands of terms), as in a Google-like search where nobody needs to jump to page 1000.

Compared to the alternative of paging the data on the client side, this might save you some data transfer from ES. But, as I said, weigh this carefully: it loads a lot of data into ES memory, and you might run into memory issues in Elasticsearch.
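To turn a page number into the from and size values used above, something like the following could work. It is a sketch only: the function name, the zero-based page convention, and the max_buckets default (mirroring the 9999 in the query) are assumptions for illustration.

```python
from typing import Any, Dict


def bucket_sort_page_query(field: str, page: int, page_size: int,
                           max_buckets: int = 9999) -> Dict[str, Any]:
    """Build a terms aggregation paged with a nested bucket_sort.

    `page` is zero-based; `max_buckets` is the terms size, i.e. the upper
    bound on how many buckets Elasticsearch collects before paging.
    """
    start = page * page_size
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size > 0")
    if start + page_size > max_buckets:
        raise ValueError("requested page lies beyond max_buckets")
    return {
        "size": 0,  # assumption: hits are not needed, only the aggregation
        "aggregations": {
            "apiSalesRepUser": {
                "terms": {"field": field, "size": max_buckets},
                "aggregations": {
                    "limitBucket": {
                        # empty sort keeps the order produced by the terms agg
                        "bucket_sort": {"sort": [], "from": start, "size": page_size}
                    }
                },
            }
        },
    }
```

Rejecting pages beyond max_buckets makes the trade-off explicit: deep pages would otherwise silently come back empty, and raising max_buckets to reach them increases the memory cost on the Elasticsearch side.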

Sarazen answered 2/4, 2020 at 14:32 Comment(1)
This answer helped me, because the network transfer sizes for our queries were getting ridiculous when paging was done on the client side. - Ophthalmology
