Currently I am working on the auto-suggestion part of my application using Lucene. Auto-suggestion works fine in a console application, but now that I have integrated it into the web application it is not working the desired way.

When documents are searched for the first time with some keyword, both search and auto-suggestion work fine and show results. But when I search again, for some other keyword or even the same one, neither the auto-suggestion nor the search result shows anything. I am not able to figure out why this weird result occurs.

The snippets for the auto-suggestion and the search are as follows:
    final int HITS_PER_PAGE = 20;
    final String RICH_DOCUMENT_PATH = "F:\\Sample\\SampleRichDocuments";
    final String INDEX_DIRECTORY = "F:\\Sample\\LuceneIndexer";

    String searchText = request.getParameter("search_text");

    BooleanQuery.Builder booleanQuery = null;
    Query textQuery = null;
    Query fileNameQuery = null;
    try {
        textQuery = new QueryParser("content", new StandardAnalyzer()).parse(searchText);
        fileNameQuery = new QueryParser("title", new StandardAnalyzer()).parse(searchText);

        booleanQuery = new BooleanQuery.Builder();
        booleanQuery.add(textQuery, BooleanClause.Occur.SHOULD);
        booleanQuery.add(fileNameQuery, BooleanClause.Occur.SHOULD);
    } catch (ParseException e) {
        e.printStackTrace();
        return; // without a parsed query there is nothing to search
    }

    Directory index = FSDirectory.open(new File(INDEX_DIRECTORY).toPath());
    IndexReader reader = DirectoryReader.open(index);
    IndexSearcher searcher = new IndexSearcher(reader);
    TopScoreDocCollector collector = TopScoreDocCollector.create(HITS_PER_PAGE);

    try {
        searcher.search(booleanQuery.build(), collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;
        for (ScoreDoc hit : hits) {
            Document doc = reader.document(hit.doc);
        }

        // Auto suggestion of the data
        Dictionary dictionary = new LuceneDictionary(reader, "content");
        AnalyzingInfixSuggester analyzingSuggester = new AnalyzingInfixSuggester(index, new StandardAnalyzer());
        analyzingSuggester.build(dictionary);

        List<LookupResult> lookupResultList = analyzingSuggester.lookup(searchText, false, 10);
        System.out.println("Look up result size :: " + lookupResultList.size());
        for (LookupResult lookupResult : lookupResultList) {
            System.out.println(lookupResult.key + " --- " + lookupResult.value);
        }

        analyzingSuggester.close();
        reader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
For example, in the first iteration, if I search for the word "sample":

- Auto-suggestion gives me: sample, samples, sampler, etc. (these are words in the documents)
- Search result: sample

But if I search again, with the same text or different text, it shows no result, and the LookupResult list size comes back zero.

I am not getting why this is happening. Please help.
Below is the updated code for creating the index from a set of documents.
    final String INDEX_DIRECTORY = "F:\\Sample\\LuceneIndexer";
    long startTime = System.currentTimeMillis();

    List<ContentHandler> contentHandlerList = new ArrayList<ContentHandler>();

    String fileNames = (String) request.getAttribute("message");
    File file = new File("F:\\Sample\\SampleRichDocuments" + fileNames);

    ArrayList<File> fileList = new ArrayList<File>();
    fileList.add(file);

    Metadata metadata = new Metadata();

    // Parsing the rich document set with Apache Tika
    ContentHandler handler = new BodyContentHandler(-1);
    ParseContext context = new ParseContext();
    Parser parser = new AutoDetectParser();
    InputStream stream = new FileInputStream(file);
    try {
        parser.parse(stream, handler, metadata, context);
        contentHandlerList.add(handler);
    } catch (TikaException | SAXException | IOException e) {
        e.printStackTrace();
    } finally {
        try {
            stream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    FieldType fieldType = new FieldType();
    fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
    fieldType.setStoreTermVectors(true);
    fieldType.setStoreTermVectorPositions(true);
    fieldType.setStoreTermVectorPayloads(true);
    fieldType.setStoreTermVectorOffsets(true);
    fieldType.setStored(true);

    Analyzer analyzer = new StandardAnalyzer();
    Directory directory = FSDirectory.open(new File(INDEX_DIRECTORY).toPath());
    IndexWriterConfig conf = new IndexWriterConfig(analyzer);
    IndexWriter writer = new IndexWriter(directory, conf);

    Iterator<ContentHandler> handlerIterator = contentHandlerList.iterator();
    Iterator<File> fileIterator = fileList.iterator();
    Date date = new Date();

    while (handlerIterator.hasNext() && fileIterator.hasNext()) {
        Document doc = new Document();

        String text = handlerIterator.next().toString();
        String textFileName = fileIterator.next().getName();

        String fileName = textFileName.replaceAll("_", " ");
        fileName = fileName.replaceAll("-", " ");
        fileName = fileName.replaceAll("\\.", " ");
        String[] fileNameArr = fileName.split("\\s+");
        for (String contentTitle : fileNameArr) {
            Field titleField = new Field("title", contentTitle, fieldType);
            titleField.setBoost(2.0f);
            doc.add(titleField);
        }
        if (fileNameArr.length > 0) {
            fileName = fileNameArr[0];
        }

        String document_id = UUID.randomUUID().toString();

        FieldType documentFieldType = new FieldType();
        documentFieldType.setStored(false);

        Field idField = new Field("document_id", document_id, documentFieldType);
        Field fileNameField = new Field("file_name", textFileName, fieldType);
        Field contentField = new Field("content", text, fieldType);

        doc.add(idField);
        doc.add(contentField);
        doc.add(fileNameField);
        writer.addDocument(doc);
    }

    writer.commit();
    writer.deleteUnusedFiles();

    long endTime = System.currentTimeMillis();
    writer.close();
    analyzer.close(); // close the analyzer only after the writer is done with it
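The replaceAll/split chain that builds the title values can be checked in isolation. A minimal sketch of just that step (the class name and the sample file name are made up for illustration):

```java
public class TitleTokens {
    // Mirrors the normalisation in the indexing loop: underscores,
    // hyphens and dots become spaces, then split on whitespace.
    static String[] titleTokens(String textFileName) {
        String fileName = textFileName.replaceAll("_", " ");
        fileName = fileName.replaceAll("-", " ");
        fileName = fileName.replaceAll("\\.", " ");
        return fileName.split("\\s+");
    }

    public static void main(String[] args) {
        // "Sample_Rich-Document.pdf" -> [Sample, Rich, Document, pdf]
        for (String token : titleTokens("Sample_Rich-Document.pdf")) {
            System.out.println(token);
        }
    }
}
```

Each resulting token becomes a separate "title" field value, which is why the boost is applied per token.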
I have also observed that from the second search iteration onward, the files in the index directory get deleted, and only the file with the .segment suffix keeps changing, e.g. .segmenta, .segmentb, .segmentc, etc.

I don't know why this weird situation is happening.
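To see which index files disappear between requests, I list the index directory before and after each search. A small plain-Java sketch of that check (the class name is made up; the path is the INDEX_DIRECTORY from the code above):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ListIndexFiles {
    // Collect the file names in the index directory so the contents
    // can be compared before and after a search request.
    static List<String> indexFileNames(String indexDir) {
        List<String> names = new ArrayList<>();
        File[] files = new File(indexDir).listFiles();
        if (files != null) {
            for (File f : files) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : indexFileNames("F:\\Sample\\LuceneIndexer")) {
            System.out.println(name);
        }
    }
}
```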