How do I get Lucene (.NET) to highlight correctly with wildcards?

I am using the Lucene.NET API directly in my ASP.NET/C# web application. When I search using a wildcard, like "fuc*", the highlighter doesn't highlight anything, but when I search for the whole word, like "fuchsia", it highlights fine. Does Lucene have the ability to highlight using the same logic it used to match with?

Some possibly relevant code snippets:

var formatter = new Lucene.Net.Highlight.SimpleHTMLFormatter(
    "<span class='srhilite'>",
    "</span>");

var fragmenter = new Lucene.Net.Highlight.SimpleFragmenter(100);
var scorer = new Lucene.Net.Highlight.QueryScorer(query);
var highlighter = new Lucene.Net.Highlight.Highlighter(formatter, scorer);
highlighter.SetTextFragmenter(fragmenter);

And then, for each hit:

string description = Server.HtmlEncode(doc.Get("Description"));
var stream = analyzer.TokenStream("Description", 
    new System.IO.StringReader(description));
string highlighted_text = highlighter.GetBestFragments(
    stream, description, 1, "...");

And I'm using the QueryParser and the StandardAnalyzer.

Delaware answered 14/5, 2010 at 21:7 Comment(0)

You'll need to set the parser's rewrite method to SCORING_BOOLEAN_QUERY_REWRITE.

This seems to have become necessary as of Lucene 2.9.
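
As a rough sketch of what that could look like (not a confirmed fix): the rewrite method is set on the parser via SetMultiTermRewriteMethod before parsing, so that a wildcard such as "fuc*" expands into a BooleanQuery of concrete terms that the QueryScorer can see. The class and method names WildcardHighlightHelper and ParseForHighlighting are hypothetical, and the extra query.Rewrite(reader) call is an assumption that is not part of the original suggestion; it may or may not be needed in your setup.

```csharp
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper, for illustration only.
static class WildcardHighlightHelper
{
    // parser: the QueryParser you already use (e.g. built with StandardAnalyzer).
    // reader: an IndexReader opened on the index being searched.
    public static Query ParseForHighlighting(
        QueryParser parser, IndexReader reader, string searchText)
    {
        // Ask the parser to rewrite wildcard/prefix queries into a scoring
        // BooleanQuery instead of the constant-score form.
        parser.SetMultiTermRewriteMethod(
            MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);

        Query query = parser.Parse(searchText);

        // Assumption: expand the multi-term query against the index so the
        // highlighter receives the concrete matching terms.
        return query.Rewrite(reader);
    }
}
```

The returned query would then be passed to the QueryScorer in place of the unmodified parsed query.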

Hope this helps.

Text answered 14/5, 2010 at 21:48 Comment(3)
Errr... how? From what I've seen in the docs, I need a MultiTermQuery to mess with that, but I only have a Query. Should I test for typeof MultiTermQuery and cast up? - Delaware
I blindly tried: query = parser.Parse(searchText); if (query.GetType() == typeof(Lucene.Net.Search.PrefixQuery)) { ((Lucene.Net.Search.PrefixQuery)query).SetRewriteMethod(Lucene.Net.Search.PrefixQuery.SCORING_BOOLEAN_QUERY_REWRITE); } and it made things worse. - Delaware
I actually meant to set the rewrite style on the parser, i.e. using the SetMultiTermRewriteMethod method of the parser object. HTH - Text
