I have a requirement to index a series of key phrases assigned to articles. The phrases are stored as a string with a \r\n delimiter and one phrase may contain another phrase, for example:
This is a key phrase
This is a key phrase too
This is also a key phrase
would be stored as:
keywords: "This is a key phrase\r\nThis is a key phrase too\r\nThis is also a key phrase"
An article which has only the phrase "This is a key phrase too" should not be matched when a search for "This is a key phrase" is performed.
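To make the matching rule concrete, here is a minimal sketch in plain C# (no Examine or Lucene involved) of the behaviour I'm after. The class and method names are hypothetical, and the case-insensitive comparison is my assumption rather than a hard requirement; it only illustrates the desired semantics, not how the index should implement them.

```csharp
using System;
using System.Linq;

public static class KeyPhraseMatchSketch
{
    // Hypothetical helper: returns true only when one of the stored phrases
    // equals the search phrase as a whole, never on a partial match.
    public static bool HasKeyPhrase(string storedKeywords, string searchPhrase)
    {
        return storedKeywords
            // The phrases are stored in one string, delimited by "\r\n".
            .Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries)
            // Assumption: matching ignores case and surrounding whitespace.
            .Any(phrase => string.Equals(
                phrase.Trim(),
                searchPhrase.Trim(),
                StringComparison.OrdinalIgnoreCase));
    }
}
```

With this helper, HasKeyPhrase("This is a key phrase too", "This is a key phrase") returns false, while HasKeyPhrase("This is a key phrase\r\nThis is also a key phrase", "This is a key phrase") returns true.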
I have a custom indexer implementing ISimpleDataService, which works fine and indexes the content, but I can't work out how to get a query such as "This is a key phrase" to return results.
From what I've read, I thought the default QueryParser should split on delimiters and see each entry as a separate value, but it doesn't seem to work that way.
Although I've tried various implementations, my current search code looks like this:
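// Look up the searcher configured for the custom keywords index.
var searcher = ExamineManager.Instance.SearchProviderCollection["KeywordsSearcher"];
var searchCriteria = searcher.CreateSearchCriteria(BooleanOperation.Or);
// Query the "keywords" field for the phrase being searched.
var query = searchCriteria.Field("keywords", keyword).Compile();
// Run the search and sort the results by relevance, highest score first.
var searchResults = searcher.Search(query).OrderByDescending(x => x.Score).ToList();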
The 'simple' way I thought to do this was to add each keyword as a separate 'keyword' field, but the SimpleDataSet provided as part of the .NET implementation uses a Dictionary<string, string>, which precludes me from having more than one key with the same name.
I'm new to Lucene and Umbraco, so any advice would be gratefully received.