This seems so simple that I'm convinced I must be overlooking something. I cannot work out how to do the following in Lucene.
The problem
- I'm searching for place names.
- I have a field called Name.
- It is using Lucene.Net.Analysis.Standard.StandardAnalyzer.
- It is TOKENIZED.
- The value of Name contains one space: halong bay.
- The search term may or may not contain an extra space due to culturally different spellings or genuine spelling mistakes, e.g. ha long bay instead of halong bay.
- If I use the term halong bay I get a hit.
- If I use the term ha long bay I do not get a hit.
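A rough sketch of the mismatch as I understand it, assuming the analyzer behaves approximately like lowercasing plus splitting on whitespace (a simplification: the real StandardAnalyzer also handles punctuation and stop words). TokenSketch, Tokenize and SameTokens are hypothetical names used only for illustration; they are not part of Lucene or Sitecore.

```csharp
using System;
using System.Linq;

public static class TokenSketch
{
    // Hypothetical stand-in for the analyzer: lowercase, then split on whitespace.
    public static string[] Tokenize(string text) =>
        text.ToLowerInvariant()
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

    // True only when both strings produce the same token sequence.
    public static bool SameTokens(string indexed, string query) =>
        Tokenize(indexed).SequenceEqual(Tokenize(query));

    public static void Main()
    {
        Console.WriteLine(string.Join("|", Tokenize("halong bay")));   // halong|bay
        Console.WriteLine(string.Join("|", Tokenize("ha long bay")));  // ha|long|bay
        Console.WriteLine(SameTokens("halong bay", "ha long bay"));    // False
    }
}
```

Under that assumption the indexed value yields two tokens and the query yields three, so neither an exact term comparison nor a per-term fuzzy comparison lines them up.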
The attempted solution
Here's the code I'm using to build my predicate using LINQ to Lucene from Sitecore:
var searchContext = ContentSearchManager.GetIndex("my_index").CreateSearchContext();
var term = "ha long bay";
var predicate = PredicateBuilder.Create<MySearchResultItemClass>(sri => sri.Name == term);
var results = searchContext.GetQueryable<MySearchResultItemClass>().Where(predicate);
I have also tried a fuzzy match using the .Like() extension:
var predicate = PredicateBuilder.Create<MySearchResultItemClass>(sri => sri.Name.Like(term));
This also yields no results for ha long bay.
How do I configure Lucene in Sitecore to return a hit for both the halong bay and ha long bay search terms, ideally without having to do anything fancy with the input term (e.g. stripping spaces, adding wildcards, etc.)?
Note: I recognise that this would also allow the term h a l o n g b a y to produce a hit, but I don't think I have a problem with this.