Elasticsearch: delete by query is really slow on a lot of documents to delete

2

8

I'm using the delete-by-query plugin for Elasticsearch.

I have an index products with an integer field size. I want to delete all documents with size 10. I have over 5000 documents with size 10. If I try:

DELETE /products/product/_query?q=size:10

this query takes over 2 minutes.

I understand why the delete-by-query plugin is slow. From the documentation:

Internally, it uses Scroll and Bulk APIs to delete documents in an efficient and safe manner. It is slower [..] Queries which match large numbers of documents may run for a long time, as every document has to be deleted individually.

How can I mass-delete documents faster?

Disequilibrium answered 8/9, 2016 at 7:18 Comment(3)
You can't. This is the only supported way of deleting documents in latest versions of Elasticsearch. Elasticsearch 1.x deletes much faster (but potentially in an unsafe manner). So if it is really worth so much, you can go back to an older version of Elasticsearch. - Luteal
ok, thanks! i think this is the answer for the question, not a comment... - Disequilibrium
Posted it as answer. - Luteal
6

You can't. This is the only supported way of deleting documents in the latest versions of Elasticsearch. Elasticsearch 1.x deletes much faster (but potentially in an unsafe manner). So if it is really worth that much to you, you can go back to an older version of Elasticsearch.

Luteal answered 8/9, 2016 at 10:31 Comment(0)
0

ES 8.11, 2024-01

I don't know what the situation was in 2016, but maybe you could consider doing a bulk delete.

The downside of this is that it might be quite complicated to determine the _ids of all the LuceneDocuments (index documents) you need to delete. Typically you might have to run a _search query to find these _ids on the basis of your query. You must have these _ids to do a bulk delete.
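To make that first step a bit more concrete, here is a rough Python sketch of the request body you might send to `products/_search` and how you could pull the _ids out of the response. The helper names and the sample response are made up for illustration, and nothing here talks to a real cluster. It also assumes a single page is enough; for more hits than the index's result window allows you would have to page through the results (for example with search_after or scroll).

```python
import json


def build_id_search_body(field, value, page_size=1000):
    """Build a search body that matches field == value and skips _source,
    since only the hit metadata (_id) is needed for a later bulk delete."""
    return {
        "size": page_size,
        "_source": False,
        "query": {"term": {field: value}},
    }


def extract_ids(search_response):
    """Collect the _id of every hit in a parsed _search response."""
    return [hit["_id"] for hit in search_response["hits"]["hits"]]


# Hand-written sample shaped like a _search response (not real cluster output).
sample_response = {
    "hits": {
        "hits": [
            {"_index": "products", "_id": "a1"},
            {"_index": "products", "_id": "b2"},
        ]
    }
}

body = build_id_search_body("size", 10)
print(json.dumps(body))
print(extract_ids(sample_response))
```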

Then you have the faff of making a bulk string conforming to the strict format required: newline-delimited JSON, one action per line, with a newline at the very end. It's fairly feasible when you get the hang of it. And these bulk operations are pretty fast.
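As a minimal sketch of building that string in Python (the function name and the example _ids are hypothetical, and this only builds the body, it does not send it): each delete action is one JSON object on its own line, and the whole body ends with a newline. You would then POST the result to `/_bulk` with `Content-Type: application/x-ndjson`.

```python
import json


def build_bulk_delete_body(index, ids):
    """Return a bulk request body with one delete action per _id.

    Every action sits on its own line and the body ends with a newline,
    which the bulk endpoint requires.
    """
    lines = [
        json.dumps({"delete": {"_index": index, "_id": doc_id}})
        for doc_id in ids
    ]
    if not lines:
        return ""
    return "\n".join(lines) + "\n"


# Example _ids are placeholders; in practice they come from a prior search.
bulk_body = build_bulk_delete_body("products", ["a1", "b2"])
print(bulk_body, end="")
```

For a lot of _ids it is probably worth splitting them into batches of a few thousand per request rather than sending one enormous body.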

Miry answered 11/1 at 19:33 Comment(0)
