How to remove all documents in index

4

17

Is there a simple way to remove all documents (or a filtered list of documents) from an Azure Search index?

I know the obvious answer is to delete and recreate the index, but I'm wondering if there is any other option.

Delitescent answered 4/2, 2015 at 16:54 Comment(0)
12

No, there is currently no way to delete all the documents from an index in a single operation. As you suspected, deleting and re-creating the index is the way to go. For really small indexes you could consider deleting documents individually, but since apps often already have code for index creation, delete/recreate is the quickest path.
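
For the small-index case, deleting documents individually comes down to sending "delete" actions that name each document by its key field. The following is a minimal sketch of building those request bodies for the REST API, not a definitive implementation. It assumes the key field is called `id`; the helper name `buildDeleteBatches` is made up for illustration.

```javascript
// Sketch: build "delete" batches for the Azure Search REST API.
// Assumption: `keyName` is the key field of your index ('id' is only an example).
const MAX_ACTIONS_PER_BATCH = 1000; // the service accepts at most 1000 actions per request

function buildDeleteBatches(keyName, keys, batchSize = MAX_ACTIONS_PER_BATCH) {
  const batches = [];
  for (let i = 0; i < keys.length; i += batchSize) {
    // Each action only needs the action type and the document key
    const value = keys.slice(i, i + batchSize).map((key) => ({
      '@search.action': 'delete',
      [keyName]: String(key),
    }));
    batches.push({ value });
  }
  return batches;
}

// Each batch would then be POSTed to
// https://<service>.search.windows.net/indexes/<index>/docs/index?api-version=2020-06-30
// with an admin api-key header.
const batches = buildDeleteBatches('id', ['1', '2', '3'], 2);
console.log(JSON.stringify(batches, null, 2));
```

The small batch size of 2 is only there to show the chunking; in practice you would leave it at the default.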

Factotum answered 4/2, 2015 at 17:54 Comment(1)
It would be great if you could add this as an item on our UserVoice page (feedback.azure.com/forums/263029-azure-search). We have heard this question a few times, and tracking it there would help gauge how important it is for you and others, given Pablo's workaround. One option we might consider is allowing deletion via $filter, so that you could delete not only all documents but also specific ones. – Robichaud
6

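There is a way: query all documents, then use IndexBatch to delete them. Note that a single search returns only one page of results (50 by default, 1000 at most), so one pass only empties a small index.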

    public void DeleteAllDocuments()
    {
        // Create the index client. Deleting documents requires an admin API key; query keys are read-only.
        SearchIndexClient indexClient = new SearchIndexClient(SearchServiceName, SearchServiceIndex, new SearchCredentials(SearchServiceQueryApiKey));

        // Query all
        DocumentSearchResult<Document> searchResult;
        try
        {
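            // Request the maximum page size (1000) instead of the default 50; larger indexes need repeated passes
            searchResult = indexClient.Documents.Search<Document>(string.Empty, new SearchParameters { Top = 1000 });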
        }
        catch (Exception ex)
        {
            // Pass the original exception along so the root cause is not lost
            throw new Exception("Error during AzureSearch", ex);
        }

        List<string> azureDocsToDelete =
            searchResult
                .Results
                .Select(r => r.Document["id"].ToString())
                .ToList();

        // Delete all
        try
        {
            // Skip the request when the search returned nothing to delete
            if (azureDocsToDelete.Count > 0)
            {
                var batch = IndexBatch.Delete("id", azureDocsToDelete);
                indexClient.Documents.Index(batch);
            }
        }
        catch (IndexBatchException ex)
        {
            throw new Exception($"Failed to delete documents: {string.Join(", ", ex.IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key))}", ex);
        }
    }
Chiapas answered 18/12, 2017 at 14:5 Comment(1)
This has two major problems: 1. In the general case it isn't possible to enumerate all documents in a large index in a consistent manner. 2. It is very inefficient because of how deletion works at the Lucene level: "deleting" a document only marks it as deleted, and it goes away when segments are eventually merged. Doing this a lot creates extra work for the segment merger and consumes more storage in the short term instead of freeing it up. Pablo's answer is the best: delete the whole index and recreate it. – Leshia
1
    // searchIndex - name of the index to empty
    // keyCol - name of the key field in the index
    // Assumes searchServiceName and apiKey (an admin key) are defined elsewhere in the class.
    public static void ResetIndex(string searchIndex, string keyCol)
    {
        // Create the client once instead of on every iteration
        SearchIndexClient indexClient = new SearchIndexClient(searchServiceName, searchIndex, new SearchCredentials(apiKey));
        // Ask for the maximum page size (1000) rather than the default 50
        SearchParameters parameters = new SearchParameters { Top = 1000 };

        while (true)
        {
            DocumentSearchResult<Document> searchResult = indexClient.Documents.Search<Document>(string.Empty, parameters);
            List<string> azureDocsToDelete =
                searchResult
                    .Results
                    .Select(r => r.Document[keyCol].ToString())
                    .ToList();

            // Stop once the search returns no more documents
            if (azureDocsToDelete.Count == 0)
            {
                break;
            }

            try
            {
                var batch = IndexBatch.Delete(keyCol, azureDocsToDelete);
                indexClient.Documents.Index(batch);
            }
            catch (IndexBatchException ex)
            {
                throw new Exception($"Failed to reset the index: {string.Join(", ", ex.IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key))}", ex);
            }
        }
    }
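
The same fetch-then-delete loop can be sketched independently of any SDK. This is only an illustration under stated assumptions: `searchKeys` and `deleteKeys` are hypothetical callbacks standing in for your own search and batch-delete requests, and `maxRounds` is an arbitrary safety limit so the loop cannot spin forever if deletions are slow to show up in search results.

```javascript
// Sketch: repeatedly fetch a page of keys and delete them until none are left.
// `searchKeys` and `deleteKeys` are hypothetical callbacks that wrap your own
// search and batch-delete requests; they are not part of any SDK.
async function drainIndex(searchKeys, deleteKeys, maxRounds = 10000) {
  let requested = 0; // number of delete actions sent so far
  for (let round = 0; round < maxRounds; round += 1) {
    const keys = await searchKeys(1000); // ask for up to 1000 keys per page
    if (keys.length === 0) {
      return requested; // the search came back empty, so we are done
    }
    await deleteKeys(keys);
    requested += keys.length;
  }
  throw new Error(`Index not empty after ${maxRounds} rounds`);
}

// Usage with an in-memory stand-in for the index:
const fakeIndex = new Set(['a', 'b', 'c', 'd', 'e']);
const demo = drainIndex(
  async (top) => [...fakeIndex].slice(0, Math.min(top, 2)), // pretend a page holds 2 keys
  async (keys) => keys.forEach((k) => fakeIndex.delete(k)),
);
demo.then((count) => console.log(`sent ${count} delete actions`));
```

Against a real service the count is of delete actions sent, not of distinct documents, since a document that still appears in a later search page would be requested again.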
Perineuritis answered 28/8, 2018 at 9:16 Comment(0)
1

It's 2023, and it appears this is still not directly possible.

Instead, you have to copy the index schema, delete the entire index, then create a new index with the same name using the copied schema.

I have a Node.js script that accomplishes this with a single command:

const axios = require('axios');

const serviceName = 'xxxxx';
const apiKey = 'xxxxx';

const headers = {
  'Content-Type': 'application/json',
  'api-key': apiKey,
};

// example run: node script.js restartIndex indexName
async function restartIndex(indexName) {
  const url = `https://${serviceName}.search.windows.net/indexes/${indexName}?api-version=2020-06-30`;

  // Get the schema of the existing index
  let schema;
  try {
    const response = await axios.get(url, { headers });
    schema = response.data;
    delete schema.statistics; // Remove statistics from the old index
  } catch (error) {
    console.error('Error fetching old index:', error);
    return;
  }

  // Delete the existing index
  try {
    await axios.delete(url, { headers });
  } catch (error) {
    console.error('Error deleting old index:', error);
    return;
  }

  // Recreate the index with the fetched schema
  try {
    const response = await axios.put(url, schema, { headers });
    console.log('Index restarted successfully:', response.data);
  } catch (error) {
    console.error('Error creating new index:', error);
  }
}

const args = process.argv.slice(2);

switch (args[0]) {
  case 'restartIndex':
    restartIndex(args[1]);
    break;
  default:
    console.log('Unknown command');
    break;
}

First, install axios:

npm install axios

Once that is set up, you can run:

node script.js restartIndex indexName

Where script.js is the name that you gave this file, and indexName is the name of your index.
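
One thing to keep in mind with this approach: the index is deleted before it is recreated, so if the final PUT fails, the schema exists only in the script's memory. A minimal sketch of saving the fetched definition to disk first, assuming it carries a `name` property; the helper `backupSchema` and the file naming are hypothetical and only for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Sketch: write the fetched index definition to disk before deleting the index.
// `backupSchema` and the file name pattern are hypothetical, for illustration only.
function backupSchema(schema, dir = os.tmpdir()) {
  const file = path.join(dir, `${schema.name}-schema-${Date.now()}.json`);
  fs.writeFileSync(file, JSON.stringify(schema, null, 2), 'utf8');
  return file; // path of the saved copy
}

// Example with a made-up index definition:
const exampleSchema = {
  name: 'example-index',
  fields: [{ name: 'id', type: 'Edm.String', key: true }],
};
const backupFile = backupSchema(exampleSchema);
console.log('Schema saved to', backupFile);
```

In the script above, such a call would sit between fetching the schema and deleting the index, so the saved file can be used to recreate the index by hand if anything goes wrong.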

Heteroplasty answered 16/8, 2023 at 11:38 Comment(0)
