How to identify if a Lucene.Net Index exists in a folder?
C

6

21

I am using Lucene.Net for indexing and searching documents, and I am using the following code to create or open an index if one exists:

IndexWriter writer = new IndexWriter(@"C:\index", new StandardAnalyzer(), !IndexExists);

...

private bool IndexExists
{
    get
    {
        return ??
    }
}

Now, how can I implement IndexExists in a simple way? I don't want any exceptions to be thrown.

Cheesy answered 16/6, 2009 at 14:23 Comment(0)
I
38

The static method IndexReader.IndexExists(string path) (or one of its overloads) seems pretty suitable.

Intracranial answered 16/6, 2009 at 14:38 Comment(1)
This is called DirectoryReader.indexExists now: lucene.apache.org/core/8_1_0/core/org/apache/lucene/index/… - Haunch
C
7

Before 4.0, the method is IndexReader.indexExists(org.apache.lucene.store.Directory).

From 4.0 onward, it is DirectoryReader.indexExists(org.apache.lucene.store.Directory).

Cathodoluminescence answered 4/3, 2013 at 11:15 Comment(0)
U
3

You could just use the constructor that doesn't take a boolean parameter. It opens the existing index if there is one, or creates a new one if there isn't.

Java documentation link (same for Lucene.Net): http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/index/IndexWriter.html#IndexWriter(org.apache.lucene.store.Directory, org.apache.lucene.analysis.Analyzer)

Underachieve answered 17/6, 2009 at 15:31 Comment(3)
Lucene.net does not have this overload. - Framing
Which version of Lucene.Net is missing the overload? It's there in 2.4. - Underachieve
I'm using the "straight Java" Lucene. IndexWriter in 4.10.+ only has one constructor. But I don't understand how you can get what the questioner wanted from your solution: find out whether or not there's an index there already. - Wellfavored
R
0

I also tried to find an answer to this without success, so here is what I use in my code:

private bool IndexExists { get { return IndexDirectory.FileExists("segments.gen"); } }
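The same idea can be sketched without touching the Lucene API at all, for the "straight Java" case. This is only a heuristic under stated assumptions: it assumes that a Lucene commit leaves at least one file whose name starts with "segments" (such as segments_N) in the index folder, and, as far as I know, newer Lucene versions no longer write segments.gen, so matching on the prefix is a little less brittle than matching that one name. The class and method names here (SegmentsFileCheck, looksLikeIndex) are made up for illustration and are not part of Lucene.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene or Lucene.Net.
final class SegmentsFileCheck {

    // Returns true if dir is a directory containing at least one entry whose
    // name starts with "segments". Assumption: Lucene names its commit files
    // "segments_N" (and "segments.gen" in older versions).
    // Never throws: any I/O problem is reported as "no index".
    static boolean looksLikeIndex(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> matches = Files.newDirectoryStream(dir, "segments*")) {
            return matches.iterator().hasNext();
        } catch (IOException | RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java SegmentsFileCheck <path-to-index-folder>
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(looksLikeIndex(dir));
    }
}
```

Like the one-liner above, this only says that something index-shaped is in the folder. It cannot tell a healthy index from a corrupt or half-written one.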

Retroversion answered 6/7, 2011 at 14:26 Comment(1)
Clever... and possibly the best way currently (see my answer: 4.10.+ has changed the specification of DirectoryReader.indexExists() since 4.0.+). But of course, as I'm sure you're aware, your solution is very vulnerable to version changes. Each new version will have to be checked! - Wellfavored
S
0

I know that this is an old entry, but what Sean Carpenter posted is totally right and this constructor exists even in the latest version of Lucene .NET. The documentation for the IndexWriter class can be found here: http://lucenenet.apache.org/docs/3.0.3/d2/d1d/class_lucene_1_1_net_1_1_index_1_1_index_writer.html#af4620c14320934601058e0e9cac9bfab

Singularize answered 9/8, 2013 at 15:14 Comment(0)
W
0

Whoops!

This is "straight Java" Lucene, but it may well apply to other varieties.

In Lucene 4.0.0 the API for DirectoryReader.indexExists() says

Returns true if an index exists at the specified directory.

But in Lucene 4.10.2 the API for DirectoryReader.indexExists() says

Returns true if an index likely exists at the specified directory. Note that if a corrupt index exists, or if an index in the process of committing

... yes, it breaks off mid-sentence. NB I compiled my Javadoc direct from the source, but the same unfinished phrase can be seen in the online API. Not only that, but I looked at the Lucene 6.0.0 API, and it is exactly the same.

The "returns" phrase is however:

true if an index exists; false otherwise

... but from my unit testing, I currently believe it will sometimes (?) return true for an empty directory. Anyway, I wouldn't trust it.

If you create an IndexReader on an empty directory, it appears that all its methods will return without throwing exceptions. You can call indexReader.numDocs(), and this will return 0, but that doesn't prove that there is no index there, only that there are no Documents. Depending on your requirements that might be enough, of course.

Similarly, you can create an IndexSearcher from such an IndexReader, and you can create an IndexWriter. None of these will have any apparent problem with an empty directory.

BETTER SOLUTION:

    try {
        directoryReader = DirectoryReader.open( fsDir );
    } catch ( org.apache.lucene.index.IndexNotFoundException e) {
        ...
    }

This appears, as far as I can tell, to be reliable.

Wellfavored answered 3/11, 2016 at 17:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.