Elasticsearch delete_by_query version conflict

4

9

According to the ES documentation, document indexing/deletion happens as follows:

  1. The request is received at one of the nodes.
  2. The request is forwarded to the document's primary shard.
  3. The operation is performed on the primary shard, and parallel requests are sent to the replica nodes.
  4. The primary shard node waits for responses from the replica nodes and then sends the response to the node where the request was originally received.
  5. That node sends the response back to the client.

Now in my case, I am sending a create-document request to ES at time t and then sending a request to delete the same document (using delete_by_query) at approximately t + 800 milliseconds. These requests are sent via a messaging system (an internal implementation of Kafka), which ensures that the delete request is sent to ES only after a 200 OK response has been received for the indexing operation.

According to the ES documentation, delete_by_query returns a 409 version conflict only when documents matched by the query are updated while delete_by_query is still executing.

In my case, it is guaranteed that the delete_by_query request is sent to ES only after a 200 OK response has been received for every document that has to be deleted. Hence there should be no possibility of a document being created or updated while delete_by_query is executing.

Please let me know if I am missing something or whether this is an issue with ES.

Bianco answered 27/3, 2019 at 16:58 Comment(2)
May I ask what the problem is? Does ES return an error when it should not, or the other way around? I can't figure it out from the description. - Archbishopric
ES is returning a version conflict for _delete_by_query when it should not. - Bianco
7

A possible reason is that when a document is created, it is not "committed" to the index immediately, so it is not yet visible to search.

Elasticsearch indices operate on a refresh_interval, which defaults to 1 second.

This documentation around refresh cycles is old, but I cannot for the life of me find anything as descriptive in the more modern ES versions.

A few things you can try:

  1. Call _refresh on the index after your indexing request
  2. Add the ?refresh=wait_for or ?refresh=true parameter to your indexing request

Note that refreshing the index on every indexing request is terrible for performance, which raises the question of why you are trying to delete a document immediately after indexing it.
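As a rough sketch of the second option, assuming the Python client (elasticsearch-py 8.x): the helper name `index_then_delete` and its arguments are hypothetical, and `client` is expected to be an `Elasticsearch` instance. It is only meant to show where the `refresh` parameter goes, not a recommended pattern for high write volumes.

```python
def index_then_delete(client, index_name, doc_id, document):
    """Index a document, wait until it is searchable, then delete it by query.

    `client` is assumed to be an elasticsearch-py 8.x `Elasticsearch` instance.
    """
    # refresh="wait_for" makes the call return only after a refresh has made
    # this write visible to search (it does not force an immediate refresh).
    client.index(index=index_name, id=doc_id, document=document, refresh="wait_for")

    # The search phase of delete_by_query should now see the latest version.
    return client.delete_by_query(
        index=index_name,
        query={"term": {"_id": doc_id}},
    )
```

With `refresh="wait_for"` the indexing call can take up to roughly one refresh_interval longer to return, which is the price paid for not forcing a refresh on every request.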

Tobit answered 27/3, 2019 at 20:44 Comment(1)
delete_by_query also has a refresh parameter, but according to the docs that refresh happens after the delete_by_query operation: "If true, Elasticsearch refreshes all shards involved in the delete by query after the request completes." - Lycaonia
4

Add the following to your request:

deleteByQueryRequest.setAbortOnVersionConflict(false);
Mannikin answered 30/7, 2019 at 10:55 Comment(1)
I am not sure this is a solution. If you ignore the version conflict, you may be deleting an old version of the document, so the newest version will still be there. This makes your operation "succeed" without doing what is intended. - Currish
1

I added the query parameter conflicts=proceed (see the documentation):

POST http://www.my-elasticsearch-host.com/objects/_delete_by_query?conflicts=proceed
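Because conflicts=proceed skips conflicting documents rather than deleting them, it can help to check the `version_conflicts` count in the response. The following is a minimal sketch assuming the Python client (elasticsearch-py 8.x); the helper name `delete_with_retry` and the retry-after-refresh strategy are illustrative assumptions, not something the Elasticsearch documentation prescribes.

```python
def delete_with_retry(client, index_name, query, max_attempts=3):
    """Run delete_by_query with conflicts=proceed, retrying skipped documents.

    `client` is assumed to be an elasticsearch-py 8.x `Elasticsearch` instance.
    Returns (total_deleted, conflicts_left_after_last_attempt).
    """
    deleted = 0
    conflicts = 0
    for _ in range(max_attempts):
        response = client.delete_by_query(
            index=index_name, query=query, conflicts="proceed"
        )
        deleted += response["deleted"]
        conflicts = response["version_conflicts"]
        if conflicts == 0:
            break
        # Some documents were skipped: refresh so the next attempt's search
        # phase sees their current versions, then try again.
        client.indices.refresh(index=index_name)
    return deleted, conflicts
```

A non-zero second value after the last attempt means some matching documents are still present and the caller has to decide how to handle them.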
Sirajuddaula answered 13/10, 2023 at 12:53 Comment(2)
But this does not delete the document; it just skips the documents where a conflict occurs. - Lycaonia
Correct, but in my case that was enough at the time. If you need to delete the document, you need to be sure that its version hasn't changed. In general, you won't see the conflict if the version hasn't changed. - Sirajuddaula
0

I have exactly the same error as you. As @ryanlutgen mentioned, documents are not immediately available for search after they are indexed. You have to wait for Elasticsearch to refresh the index before the new version of a document becomes visible (see here for more information). Otherwise, if you make, for example, a delete_by_query API call, its search may still see the old version of the document, which makes the error very tricky to track down.

The script below reproduces the error (source here):

from elasticsearch import Elasticsearch

es_client = Elasticsearch(...)

index_name = "demo_index"
_id = "123456"
doc_old = {"title": "old title"}
doc_new = {"title": "new title"}

if es_client.indices.exists(index=index_name):
    es_client.indices.delete(index=index_name)

es_client.index(index=index_name, document=doc_old, id=_id)
es_client.indices.refresh(index=index_name)

search_response = es_client.search(index=index_name, seq_no_primary_term=True)
print(f"first search response: {search_response}")

# index updated doc
es_client.index(index=index_name, document=doc_new, id=_id)

# this refresh below is very important
# es_client.indices.refresh(index=index_name)

second_response = es_client.search(index=index_name, seq_no_primary_term=True)
print(f"second search response: {second_response}")

deletion_result = es_client.delete_by_query(
    index=index_name, query={"bool": {"should": [{"term": {"_id": _id}}]}}
)
print(f"deletion result: {deletion_result}")

After the document is updated with the new content doc_new, the second search response still returns the old document content. The delete_by_query call then raises an exception like this:

elasticsearch.ConflictError: ConflictError(409, "{'took': 2, 'timed_out': False, 'total': 1, 'deleted': 0, 'batches': 1, 'version_conflicts': 1, 'noops': 0, 'retries': {'bulk': 0, 'search': 0}, 'throttled_millis': 0, 'requests_per_second': -1.0, 'throttled_until_millis': 0, 'failures': [{'index': 'demo_index', 'id': '123456', 'cause': {'type': 'version_conflict_engine_exception', 'reason': '[123456]: version conflict, required seqNo [0], primary term [1]. current document has seqNo [1] and primary term [1]', 'index_uuid': 'vTyLMxLNTMG5RqtkXtYCMg', 'shard': '0', 'index': 'demo_index'}, 'status': 409}]}")
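The seqNo values in that message can be read with a small toy model. This is a deliberately simplified sketch in plain Python, not how Elasticsearch is implemented: the class `ToyShard` and its methods are invented for illustration, and it only models the idea that writes take effect immediately while search sees them after a refresh.

```python
class ToyShard:
    """Toy stand-in for one shard; NOT Elasticsearch's real implementation."""

    def __init__(self):
        self.live = {}        # doc_id -> seq_no of the latest write
        self.searchable = {}  # doc_id -> seq_no visible to search (last refresh)
        self.next_seq_no = 0

    def index(self, doc_id):
        # A write takes effect immediately but is not yet visible to search.
        self.live[doc_id] = self.next_seq_no
        self.next_seq_no += 1

    def refresh(self):
        # A refresh publishes the current state to search.
        self.searchable = dict(self.live)

    def delete_by_query(self, doc_id):
        # Search phase: find the document as search currently sees it.
        if doc_id not in self.searchable:
            return "nothing matched"
        required = self.searchable[doc_id]
        current = self.live.get(doc_id)
        # Delete phase: only delete if the document is unchanged since the search.
        if current != required:
            return f"version conflict, required seqNo [{required}], current seqNo [{current}]"
        del self.live[doc_id]
        return "deleted"


shard = ToyShard()
shard.index("123456")   # old title, gets seqNo 0
shard.refresh()
shard.index("123456")   # new title, gets seqNo 1, no refresh afterwards
stale_result = shard.delete_by_query("123456")
shard.refresh()
fresh_result = shard.delete_by_query("123456")
print(stale_result)     # version conflict, required seqNo [0], current seqNo [1]
print(fresh_result)     # deleted
```

In this model the first delete fails because the search still points at seqNo 0 while the live document is at seqNo 1, mirroring the numbers in the exception above.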

To fix the issue, you can either call the _refresh API manually or tweak the refresh parameter in your indexing calls, which is explained in more detail in the official documentation about refresh.
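A minimal sketch of the manual-refresh option, using the same Python client calls as the reproduction script; the helper name `refresh_then_delete` is hypothetical, and `client` is assumed to be an `Elasticsearch` instance.

```python
def refresh_then_delete(client, index_name, query):
    """Refresh the index explicitly, then run delete_by_query.

    `client` is assumed to be an elasticsearch-py 8.x `Elasticsearch` instance.
    """
    # Make all writes indexed so far visible to search before the
    # delete_by_query search phase runs.
    client.indices.refresh(index=index_name)
    return client.delete_by_query(index=index_name, query=query)
```

In the reproduction script this corresponds to uncommenting the `es_client.indices.refresh(index=index_name)` line before the second search.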

Lycaonia answered 12/9 at 9:20 Comment(0)
