Check if Elasticsearch has finished indexing

Is there a way to check if Elasticsearch has finished processing my request?
I want to run integration tests for my application that check whether a record can be found after insertion. For example, if I make the following request:

POST /_all/_bulk
{"update":{"_id":419,"_index":"popc","_type":"offers"}}
{"doc":{"id":"419","author":"foo bar","number":"642-00419"},"doc_as_upsert":true}

If I check immediately afterwards, the test fails because Elasticsearch takes some time to complete my request.
If I sleep for 1 second before the assertion, it works most of the time, but not always.
I could extend the sleep to e.g. 3 seconds, but that makes the tests very slow, hence my question.

I have tried the cat pending tasks and pending cluster tasks endpoints, but the responses are always empty.

If any of this is relevant, I'm using Elasticsearch 5.4, Laravel Scout 3.0.5 and tamayo/laravel-scout-elastic 3.0.3

Niobe answered 29/8, 2017 at 10:3 Comment(0)
G
2

You can wait for the response; once you receive the response to the update request, the write itself is done (and you won't see it in pending or current tasks). I think the problem you're having is probably the refresh interval (see the dynamic index settings). Indexed documents are not available for search right away, and index.refresh_interval is the (maximum) amount of time before they become searchable. (You can change this setting to whatever makes sense for your use case, or use it to work out how long you need to sleep before searching in the integration tests.)

If you want to look at in-progress tasks, you can use the tasks API.
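
To make the two options above a bit more concrete, here is a minimal Python sketch (not from the original answer). It builds a settings request that lowers index.refresh_interval, and replaces the fixed sleep with a poll that gives up after a deadline. The helper names refresh_interval_request and wait_until are hypothetical, and the base URL http://localhost:9200 in the usage note is an assumption; only the index name popc comes from the question.

```python
import json
import time
import urllib.request


def refresh_interval_request(base_url, index, interval):
    """Build (but do not send) a PUT <index>/_settings request.

    Hypothetical helper: it only prepares the request that changes the
    dynamic index.refresh_interval setting.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def wait_until(check, timeout=3.0, interval=0.05):
    """Call check() repeatedly until it returns True or the timeout expires.

    Returns True as soon as check() succeeds, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a test you would send the request once during setup, e.g. urllib.request.urlopen(refresh_interval_request("http://localhost:9200", "popc", "1ms")), and then assert wait_until(lambda: ...) with whatever search check your test already performs. The poll returns as soon as the record shows up, so the common case stays fast while the timeout still covers the slow runs.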

Girhiny answered 29/8, 2017 at 14:8 Comment(1)
Thanks, I have set the index.refresh_interval to 1ms and for 1000 test runs sleeping for 600ms is always enough. - Niobe
A
7

I found this PR: https://github.com/elastic/elasticsearch/pull/17986

You can set refresh=wait_for on the request, and Elasticsearch will only respond once your data is available for search.
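
For illustration, a small Python sketch of what that could look like for the bulk request in the question (not from the original answer). The helper name bulk_request, the base URL http://localhost:9200, the plain /_bulk path and the application/x-ndjson content type are assumptions on my part; the refresh=wait_for query parameter is the part this answer is about.

```python
import json
import urllib.parse
import urllib.request


def bulk_request(base_url, lines, refresh="wait_for"):
    """Build (but do not send) a _bulk request carrying ?refresh=<refresh>.

    Hypothetical helper: `lines` is a list of dicts, one per bulk line.
    """
    # The bulk body is newline-delimited JSON and must end with a newline.
    body = "".join(json.dumps(line) + "\n" for line in lines).encode("utf-8")
    query = urllib.parse.urlencode({"refresh": refresh})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_bulk?{query}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )


# Same action and document as in the question; the base URL is assumed.
request = bulk_request(
    "http://localhost:9200",
    [
        {"update": {"_id": 419, "_index": "popc", "_type": "offers"}},
        {"doc": {"id": "419", "author": "foo bar", "number": "642-00419"},
         "doc_as_upsert": True},
    ],
)

# Sending it needs a running node, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

With this in place the test can search right after the response arrives, without a sleep. If you go through Laravel Scout rather than raw HTTP, you would need to check whether your driver lets you pass the refresh parameter through.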

Aixenprovence answered 9/1, 2018 at 22:40 Comment(2)
There's an official doc for this as well: elastic.co/guide/en/elasticsearch/reference/current/… - Cyclopedia
This properly solves the problem. I found it saves a lot of time to also set the refresh_interval to a low number for testing as @dshockley recommended. - Chemush
