Phrase matching with Sitecore ContentSearch API

I am using Sitecore 7.2 with a custom Lucene index and Linq. I need to give additional (maximum) weight to exact matches.

Example: A user searches for "somewhere over the rainbow"

Results should include items which contain the word "rainbow", but items containing the exact and entire phrase "somewhere over the rainbow" should be given maximum weight, so that they are displayed to users as the top results. That is, an item containing the entire phrase should weigh more heavily than an item which contains the word "rainbow" 100 times.

I may need to handle ranking logic outside of the ContentSearch API by collecting "phrase matches" separately from "wildcard matches", and that's fine.

Here's my existing code, truncated for brevity. The code works, but exact phrase matches are not treated as I described.

using (var context = ContentSearchManager.GetIndex("sitesearch-index").CreateSearchContext())
{
    var pred = PredicateBuilder.False<SearchResultItem>();
    pred = pred
        .Or(i => i.Name.Contains(term)).Boost(1)
        .Or(i => i["Field 1"].Contains(term)).Boost(3)
        .Or(i => i["Field 2"].Contains(term)).Boost(1);

    IQueryable<SearchResultItem> query = context.GetQueryable<SearchResultItem>().Where(pred);
    var hits = query.GetResults().Hits;
    ...
}

How can I perform exact phrase matching, and is it possible with the Sitecore.ContentSearch.Linq API?

Breeching answered 14/6, 2016 at 18:53 Comment(2)
Try looking at PreparedQuery rather than PredicateBuilder. – Hinder
It looks like PreparedQuery is from the Sitecore.Search API in Sitecore 6, not the ContentSearch API in 7. My current custom index is defined in a standalone config file under the <contentSearch> node. SearchManager.GetIndex() (used in v6) is not aware of my index. It looks like I would have to change everything in order to use the v6 API. Hoping for an alternative. – Breeching

Answering my own question. The problem was the placement of the closing parenthesis: the call to Boost has to sit inside the lambda passed to Or. It should be

.Or(i => i.Name.Contains(term).Boost(1))

rather than

.Or(i => i.Name.Contains(term)).Boost(1)

With the second form, the boosts were not being observed.
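
For reference, here is a sketch of the question's query with that fix applied to every clause. The index name and field names are the ones from the question; the wrapper class and method name are hypothetical and only there to make the snippet self-contained. My understanding, which I have not verified against the Sitecore source, is that Boost is only meaningful when it appears inside the expression tree that the provider translates, which would explain why calling it on the result of Or had no effect.

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.Linq.Utilities;
using Sitecore.ContentSearch.SearchTypes;

// Hypothetical wrapper; requires the Sitecore.ContentSearch assemblies.
public static class BoostedSearchSketch
{
    public static List<SearchHit<SearchResultItem>> Search(string term)
    {
        using (var context = ContentSearchManager.GetIndex("sitesearch-index").CreateSearchContext())
        {
            var pred = PredicateBuilder.False<SearchResultItem>();

            // Boost is called on the Contains(...) result, inside each lambda.
            pred = pred
                .Or(i => i.Name.Contains(term).Boost(1f))
                .Or(i => i["Field 1"].Contains(term).Boost(3f))
                .Or(i => i["Field 2"].Contains(term).Boost(1f));

            // Materialize the hits before the search context is disposed.
            return context.GetQueryable<SearchResultItem>()
                          .Where(pred)
                          .GetResults()
                          .Hits
                          .ToList();
        }
    }
}
```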

Breeching answered 17/8, 2016 at 20:19 Comment(0)

I think the following will solve this:

  • Split your search string on spaces.
  • Create a predicate for each resulting word, all with the same boost value.
  • Create an additional predicate for the complete search string, with a higher boost value.
  • Combine all of these predicates into one "OR" predicate.
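
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not tested code: the class name, method name, and parameters are hypothetical, "Field 1" is borrowed from the question, and Contains is used only because that is what the question's predicates use. Whether the phrase clause actually outranks the single-word clauses will depend on how the field is analyzed and on the boost values chosen.

```csharp
using System;
using System.Linq.Expressions;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.Linq.Utilities;
using Sitecore.ContentSearch.SearchTypes;

// Hypothetical helper; requires the Sitecore.ContentSearch assemblies.
public static class PhraseBoostSketch
{
    public static Expression<Func<SearchResultItem, bool>> Build(
        string phrase, float wordBoost, float phraseBoost)
    {
        var pred = PredicateBuilder.False<SearchResultItem>();

        // One OR clause per word, all with the same boost.
        var words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            var w = word; // local copy so each lambda captures its own value
            pred = pred.Or(i => i["Field 1"].Contains(w).Boost(wordBoost));
        }

        // One extra OR clause for the whole search string, with a higher boost.
        pred = pred.Or(i => i["Field 1"].Contains(phrase).Boost(phraseBoost));

        return pred;
    }
}
```

The result would be passed to Where(...) on context.GetQueryable<SearchResultItem>(), in the same way as the predicate in the question.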

I also recommend reading the following:

Sitecore Solr Search Score Value

http://sitecoreinfo.blogspot.com/2015/10/sitecore-solr-search-result-items.html

Saltatory answered 16/6, 2016 at 21:35 Comment(1)
This doesn't work as expected. I created an additional predicate with the complete search string and set its boost value to 30 (30 times higher). When I search for the phrase "set of guiding principles", the highest scores go to items containing many instances of the words "set", "settings", and even "Massachusetts". Items containing the entire phrase rank lower. I am using .Contains() in my predicates, as shown above. – Breeching
