elastic range query giving only top 10 records
Asked Answered
A

1

10

I am using elastic search query range to get the records from one date to other date using python, but I am only getting 10 records.

Below is the query

{"query": {"range": {"date": {"gte":"2022-01-01 01:00:00", "lte":"2022-10-10 01:00:00"}}}}

Sample Output:

{
    "took": 12,
    "timed_out": false,
    "_shards": {
        "total": 8,
        "successful": 8,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": 1.0,
        "hits": [{"_source": {}}]
}

The "hits" list consists of only 10 records. When I checked in my database there are more than 10 records.

Can anyone tell me how to modify the query to get all the records for the mentioned date ranges?

Analogue answered 26/9, 2022 at 6:16 Comment(0)
T
13

You need to use the size param as by default Elasticsearch returns only 10 results.

{
    "size" : 100, // if you want to fetch return 100 results.
    "query": {
        "range": {
            "date": {
                "gte": "2022-01-01 01:00:00",
                "lte": "2022-10-10 01:00:00"
            }
        }
    }
}

Refer Elasticsearch documentation for more info.

Update-1: Since OP wants to know the exact count of records matching search query(refer comments section), one use &track_total_hits=true in the URL (hint: causes performance issues) and then in the search response under hits you will get exact records matching your search as shown.

POST /_search?track_total_hits=true

"hits": {
        "total": {
            "value": 24981859, // note count with relation equal.
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    }

Update-2, OP mentioned in the comments, that he is getting only 10K records in one query, as mentioned in the chat, its restericted by Elasticsearch due to performance reasons but if you still want to change it, it can be done by changing below setting of an index.

index.max_result_window: 20000 // your desired count

However as suggested in documentation

index.max_result_window The maximum value of from + size for searches to this index. Defaults to 10000. Search requests take heap memory and time proportional to from + size and this limits that memory. See Scroll or Search After for a more efficient alternative to raising this.

Threedimensional answered 26/9, 2022 at 6:22 Comment(6)
How do we get to know regarding the size? the records keep updating every week. If I give the size as 10k then there might be records more than that rightAnalogue
@merkle, yes if you just want to know no of records matching your search query, than you need to use &track_total_hits=true in the URL (hint: causes performance issues) and than in the search response under hits you will get exact records matching your search.Threedimensional
can you add this parameter to the query? I am not getting on where to add thisAnalogue
@Analogue updated answer POST /_search?track_total_hits=trueThreedimensional
I am only getting 10000 runs. Do I want to add the track_total_hits in the range query?Analogue
Let us continue this discussion in chat.Threedimensional

© 2022 - 2024 — McMap. All rights reserved.