Deleting solr documents from Solr Admin
K

9

92

How do I delete all the documents in my Solr index using the Solr Admin UI?

I tried using the URL and it works, but I want to know whether the same can be done from the Admin UI.

Keene answered 22/4, 2014 at 19:44 Comment(0)
N
204

Use one of the commands below in the Documents tab of the Solr Admin UI:

XML:

<delete><query>*:*</query></delete>

JSON:

{'delete': {'query': '*:*'}}

Make sure to set the Document Type drop-down to Solr Command (raw XML or JSON).
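For anyone generating these commands rather than typing them, here is a minimal Python sketch that builds both payloads. The function names are hypothetical (they are not part of Solr), and the JSON variant emits strict double-quoted JSON, which Solr also accepts. The XML variant escapes the query so characters such as & do not break the command.

```python
import json
from xml.sax.saxutils import escape


def delete_by_query_xml(query: str) -> str:
    """Build the raw XML delete command; the query text is XML-escaped."""
    return f"<delete><query>{escape(query)}</query></delete>"


def delete_by_query_json(query: str) -> str:
    """Build the same delete command as strict (double-quoted) JSON."""
    return json.dumps({"delete": {"query": query}})


# Paste either result into the Documents tab with
# Document Type set to "Solr Command (raw XML or JSON)".
print(delete_by_query_xml("*:*"))
print(delete_by_query_json("*:*"))
```

Any other query string, such as a date range, can be passed in place of `*:*`.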

Ney answered 28/12, 2017 at 12:4 Comment(7)
This should be the accepted answer, as @GuySchalnat's answer does not work when stream.body has been disabled (which seems to be the case with recent versions of Solr). This answer is also more straightforward. - Freehanded
Not to mention that this works without having to deal with cores. - Farceuse
OP asked how to delete all documents, so the XML command should be <delete><query>*:*</query></delete>. Also, don't forget to set /update as the request handler. - Wicker
Don't forget to commit. - Antiscorbutic
I updated the question with respect to @vmaldosan's comment. - Farceuse
This only worked for me after I additionally posted "<commit />" with the same settings, as user3754136 suggested via curl. - Nuke
If you want to delete only a date range: <delete><query>date:[2000-01-01T00:00:00Z TO 2018-07-29T00:00:00Z]</query></delete> - Exclosure
E
62

Update: newer versions of Solr may work better with this answer: https://mcmap.net/q/234984/-deleting-solr-documents-from-solr-admin

My original answer is below:


I'm cheating a little, but not as much as writing the query by hand.

Since I've experienced the pain of accidental deletions before, I try to foolproof my deletions as much as possible (in any kind of data store).

1) Run a query in the Solr Admin Query screen, by only using the "q" parameter at the top left. Narrow it to the items you actually want to delete. For this example, I'm using *:*, but you can use things like id:abcdef or a range or whatever. If you have a crazy complex query, you may find it easier to do this multiple times, once for each part of the data you wish to delete.

2) On top of the results, there is a grayed out URL. If you hover the mouse over it, it turns black. This is the URL that was used to get the results. Right (context) click on it and open it in a new tab/window. You should get something like:

http://localhost:8983/solr/my_core_name/select?q=*%3A*&wt=json&indent=true

Now, I want to get it into a delete format. I replace the select?q= with update?commit=true&stream.body=<delete><query> and, at the end, the &wt=json&indent=true with </query></delete>.

So I end up with:

http://localhost:8983/solr/my_core_name/update?commit=true&stream.body=<delete><query>*%3A*</query></delete>

Take a deep breath, do whatever you do for good luck, and submit the URL (the Enter key works).

Now, you should be able to go back to the Solr admin page and run the original query and get zero results.
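Since recent Solr versions reject stream.body, the same "test with select, then delete" habit can be kept by turning the select URL into a POST instead. The Python below is only a sketch of that rewrite: the function name is hypothetical, it assumes the URL ends in /select and carries a q parameter, and it returns the update URL and XML body rather than sending anything.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlsplit, urlunsplit
from xml.sax.saxutils import escape


def select_url_to_delete(select_url: str) -> Tuple[str, str]:
    """Turn a .../select?q=... URL into (update URL, XML delete body)."""
    parts = urlsplit(select_url)
    if not parts.path.endswith("/select"):
        raise ValueError("expected a URL whose path ends in /select")
    params = parse_qs(parts.query)  # parse_qs percent-decodes the values
    if "q" not in params:
        raise ValueError("the select URL has no q parameter")
    query = params["q"][0]
    update_path = parts.path[: -len("/select")] + "/update"
    update_url = urlunsplit(
        (parts.scheme, parts.netloc, update_path, "commit=true", "")
    )
    body = f"<delete><query>{escape(query)}</query></delete>"
    return update_url, body


url, body = select_url_to_delete(
    "http://localhost:8983/solr/my_core_name/select?q=*%3A*&wt=json&indent=true"
)
print(url)
print(body)
```

The returned body would then be POSTed to the returned URL with a text/xml content type, for example with curl.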

Eolian answered 29/8, 2016 at 16:51 Comment(1)
As others have noted elsewhere here, the stream.body approach is no longer allowed in more recent versions of Solr (we'd get "Stream Body is disabled" with this). I'll offer as a new answer a curl variant different than user3754136 below. - Frustrate
G
54

For everyone who doesn't like a lot of words :-)

Solr Admin: remove data from Core

<delete><query>*:*</query></delete>
Graham answered 1/3, 2021 at 13:49 Comment(4)
By the way, it may be simpler to use the command line: bin/solr delete -c core_name - Graham
I deleted my record using this method, but it's still visible when I query it. I don't know why. - Isidoro
@HermannSchwarz When you delete the core, you need its configuration again, so it is not simpler. - Dollfuss
To copy: <delete><query>*:*</query></delete> - Ebracteate
S
16

curl http://localhost:8080/solr/update -H "Content-type: text/xml" --data-binary '<delete><query>*:*</query></delete>'
curl http://localhost:8080/solr/update -H "Content-type: text/xml" --data-binary '<commit />'
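The same two-step sequence (delete, then commit) can be sketched in Python with only the standard library. The function names are hypothetical, and the example URL in the comment assumes a core called my_core_name; adjust host, port and core to your setup. Nothing is sent until delete_all_and_commit is called.

```python
import urllib.request


def build_update_request(update_url: str, xml_command: str) -> urllib.request.Request:
    """Build a POST request carrying one raw XML update command."""
    return urllib.request.Request(
        update_url,
        data=xml_command.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def delete_all_and_commit(update_url: str) -> None:
    """Send the delete-all command, then a separate commit, as the curl lines do."""
    for command in ("<delete><query>*:*</query></delete>", "<commit />"):
        request = build_update_request(update_url, command)
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urllib.request.urlopen(request) as response:
            response.read()


# Example call (not executed here; the core name is a placeholder):
# delete_all_and_commit("http://localhost:8080/solr/my_core_name/update")
```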
Sphygmomanometer answered 9/3, 2015 at 7:42 Comment(1)
I was getting a 404 until I put the core name in the URL, e.g. http://localhost:8080/solr/my_core_name_here/update... - Coonan
L
9

Select XML as the Document Type on the collection's Documents tab and submit the command below.

<delete><query>*:*</query></delete>

Licentiate answered 10/2, 2021 at 8:13 Comment(0)
R
4

Under the Documents tab, select "raw XML or JSON" under Document Type and just add the query you need using the unique identifiers for each document.

{'delete': {'query': 'filter(product_id:(25634 25635 25636))'}}
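When the list of identifiers comes from elsewhere, the command can be generated instead of typed. This is a small Python sketch with a hypothetical function name; it assumes the IDs are integers (as in the product_id example) so that no query escaping is needed, and it emits strict double-quoted JSON, which Solr also accepts.

```python
import json
from typing import Iterable


def delete_by_ids_command(field: str, ids: Iterable[int]) -> str:
    """Build a JSON delete-by-query command matching any of the given integer IDs."""
    id_list = " ".join(str(int(i)) for i in ids)
    if not id_list:
        raise ValueError("at least one ID is required")
    return json.dumps({"delete": {"query": f"filter({field}:({id_list}))"}})


print(delete_by_ids_command("product_id", [25634, 25635, 25636]))
```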


Rani answered 6/10, 2022 at 23:8 Comment(0)
P
1

This solution applies only if you are deleting all the documents in multiple collections, not for selective deletion:


I had the same scenario: I needed to delete all the documents in multiple collections. There were close to 500k documents in each shard, and each collection had multiple shards. Updating and deleting the documents with a query was a big task, so I followed the process below:

  1. Used the Solr Collections API to get the details of all the collections:
    http://<solr-ip>:<port>/solr/admin/collections?action=CLUSTERSTATUS&wt=json
    
    This returns details such as the collection name, numShards, configName, router.field, maxShardsPerNode, replicationFactor, etc.
  2. Saved the output JSON with the above details in a file for future reference, then backed up every collection whose documents I needed to delete, using the following API:
    http://<solr-ip>:<port>/solr/admin/collections?action=BACKUP&name=myBackupName&collection=myCollectionName&location=/path/to/my/shared/drive
    
  3. Deleted every collection whose documents I needed to remove, using the following (the action for removing a collection is DELETE; DELETEALIAS only removes an alias and leaves the collection in place):
    http://<solr-ip>:<port>/solr/admin/collections?action=DELETE&name=collectionname
    
  4. Re-created all the collections using the details from step 1 and the following API:
    http://<solr-ip>:<port>/solr/admin/collections?action=CREATE&name=collectionname&numShards=number&replicationFactor=number&maxShardsPerNode=number&collection.configName=configname&router.field=routerfield
    

I ran the above steps in a loop over all the collections, and it finished in seconds for around 100 collections holding a large amount of data. Plus, I had backups of all the collections.
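The loop could be sketched roughly as follows in Python. This only builds the Collections API URLs for one collection and sends nothing; the function names and the "_backup" naming are hypothetical, and it assumes the removal step uses action=DELETE, which deletes a collection (DELETEALIAS removes only an alias).

```python
from typing import Dict, List
from urllib.parse import urlencode


def collections_api_url(base_url: str, action: str, params: Dict[str, str]) -> str:
    """Build one Collections API URL; parameter values are URL-encoded."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/solr/admin/collections?{query}"


def reset_collection_urls(
    base_url: str,
    name: str,
    backup_location: str,
    create_params: Dict[str, str],
) -> List[str]:
    """Return the BACKUP, DELETE and CREATE URLs for one collection, in order."""
    return [
        collections_api_url(
            base_url,
            "BACKUP",
            {"name": f"{name}_backup", "collection": name, "location": backup_location},
        ),
        collections_api_url(base_url, "DELETE", {"name": name}),
        collections_api_url(base_url, "CREATE", {"name": name, **create_params}),
    ]


# create_params would come from the saved CLUSTERSTATUS output.
for url in reset_collection_urls(
    "http://localhost:8983",
    "products",
    "/path/to/my/shared/drive",
    {"numShards": "2", "replicationFactor": "1", "collection.configName": "products_conf"},
):
    print(url)
```

Calling reset_collection_urls once per collection name gives the requests to issue in sequence; checking each response before moving on keeps a failed backup from being followed by a delete.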

Refer to the Solr Collections API documentation for other actions, for example "DELETEALIAS: Delete a Collection Alias".

Presa answered 15/8, 2019 at 17:45 Comment(0)
E
0

If you want to delete some documents by ID, you can use the Solr POST tool.

./post -c $core_name ./delete.xml

Where the delete.xml file contains the document IDs:

<delete>
  <id>a3f04b50-5eea-4e26-a6ac-205397df7957</id>
</delete>
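If the IDs come from a script, the delete.xml file can be generated rather than written by hand. A minimal Python sketch, with hypothetical function names, using the standard library XML builder so that special characters in IDs are escaped:

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def build_delete_xml(doc_ids: Iterable[str]) -> str:
    """Build a <delete> document with one <id> element per document ID."""
    root = ET.Element("delete")
    for doc_id in doc_ids:
        ET.SubElement(root, "id").text = doc_id
    return ET.tostring(root, encoding="unicode")


def write_delete_file(path: str, doc_ids: Iterable[str]) -> None:
    """Write the delete document to a file that the post tool can send."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_delete_xml(doc_ids))


print(build_delete_xml(["a3f04b50-5eea-4e26-a6ac-205397df7957"]))
```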
Echelon answered 24/5, 2022 at 16:18 Comment(0)
F
0

For those who want to automate this sort of request, note first that the stream.body URL-based approach offered by Guy doesn't work on recent Solr versions (we get "Stream Body is disabled" in reply). As a variant of user3754136's curl, I offer here a different one that worked for me.

I show the payload as either XML or JSON, to suit your preference (though of course you should change the port and collection name to match your Solr setup). First as XML:

curl -X POST "http://localhost:8995/solr/yourcollection/update?commit=true" -H "Content-Type: text/xml" -d "<delete><query>*:*</query></delete>"

or as JSON:

curl -X POST "http://localhost:8995/solr/yourcollection/update?commit=true" -H "Content-Type: application/json" -d "{'delete': {'query': 'filter(*:*)'}}"

Either command works just as well with other criteria in place of *:*, which is used here because the original question asked how to delete ALL documents.

Finally, instead of using curl, you could implement this in whatever language you prefer. See especially https://curlconverter.com/ for many available variants, such as Python, JavaScript, Node.js, Go, PHP, etc.
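As one illustration, here is roughly what the JSON variant might look like in Python using only the standard library. This is a sketch, not converter output: the function names are hypothetical, the port and collection name are the placeholders from the curl commands, and the payload is strict double-quoted JSON sent with an explicit application/json content type.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_delete_request(
    base_url: str, collection: str, query: str = "*:*", commit: bool = True
) -> urllib.request.Request:
    """Build a POST to /update that deletes every document matching the query."""
    params = urlencode({"commit": "true" if commit else "false"})
    url = f"{base_url}/solr/{collection}/update?{params}"
    body = json.dumps({"delete": {"query": query}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return Solr's response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example call (not executed here):
# print(send(build_delete_request("http://localhost:8995", "yourcollection")))
```

Keeping request construction separate from sending makes it easy to log or inspect exactly what will be deleted before anything reaches Solr.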

Frustrate answered 27/3 at 0:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.