elasticsearch boost importance of exact phrase match

Is there a way in Elasticsearch to boost the importance of an exact phrase appearing in the document?

For example, if I search for the phrase "web developer", documents in which the words "web developer" appear together should be boosted by a factor of 5 compared to documents in which "web" and "developer" appear separately. That way, any document containing the exact phrase "web developer" would appear first in the results.

Adhere answered 28/8, 2013 at 7:19 Comment(0)

You can combine different queries using a bool query and assign a different boost to each of them. Let's say you have a regular match query for both terms, regardless of their positions, and then a phrase query with a higher boost.

Something like the following:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "field": "web developer"
          }
        },
        {
          "match_phrase": {
            "field": {
              "query": "web developer",
              "boost": 5
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Antiquarian answered 28/8, 2013 at 11:24 Comment(3)
But what happens when I want to perform such a query across multiple indices and a varying set of fields? As far as I know, match_phrase only works with a specific field name. In my case I need (generically speaking) to use something like { "query_string": { "query": "my exact phrase", "fields": ["typeA.fieldA", "typeB.fieldB"] } } - Whyalla
[match_phrase] query doesn't support multiple fields - Minda
Also: [bool] query does not support [minimum_number_should_match] - Minda
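
For the multi-field case raised in the comments, one option is a `multi_match` query with `"type": "phrase"`, which accepts a list of fields. The following is a minimal Python sketch that only builds the request body; the helper name `multi_field_phrase_query` is hypothetical, the field names are the placeholders from the comment above, and no client call is made.

```python
import json


def multi_field_phrase_query(text, fields, phrase_boost=5):
    """Build a bool query: a loose multi_match plus a boosted phrase multi_match.

    Hypothetical helper for illustration; it only returns a dict.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Matches documents containing any of the terms in any field.
                    {"multi_match": {"query": text, "fields": fields}},
                    # Adds extra score when the terms occur as an exact phrase.
                    {
                        "multi_match": {
                            "query": text,
                            "fields": fields,
                            "type": "phrase",
                            "boost": phrase_boost,
                        }
                    },
                ],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    body = multi_field_phrase_query(
        "web developer", ["typeA.fieldA", "typeB.fieldB"]
    )
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request. Whether a boost of 5 is enough to put phrase matches first depends on the data, so treat the value as a starting point.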

As an alternative to javanna's answer, you could do something similar with must and should clauses within a bool query:

{
  "query": {
    "bool": {
      "must": {
        "match": {
          "field": {
            "query": "web developer",
            "operator": "and"
          }
        }
      },
      "should": {
        "match_phrase": {
          "field": "web developer"
        }
      }
    }
  }
}

Untested, but I believe the must clause here will match results containing both 'web' and 'developer' and the should clause will score phrases matching 'web developer' higher.

Costermansville answered 29/8, 2013 at 16:28 Comment(1)
Yes, this does give higher relevance to a document with "web developer" in it, but the OP wanted to control the relative importance (using the number 5). For example, maybe in a rare case, a document with tons of the tokens "web" and "developer" appearing all over could beat out a document with a single "web developer". With this answer you give equal importance to both of these queries (ref). - Connacht

You could try using rescore to run an exact phrase match on your initial results. From the docs:

"Rescoring can help to improve precision by reordering just the top (eg 100 - 500) documents returned by the query and post_filter phases, using a secondary (usually more costly) algorithm, instead of applying the costly algorithm to all documents in the index."

https://www.elastic.co/guide/en/elasticsearch/reference/current/filter-search-results.html#rescore
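
As a rough illustration of the rescore approach, the sketch below builds a request body in Python: a plain `match` query selects candidates, and a `match_phrase` rescore query re-ranks the top `window_size` hits per shard. The helper name `rescore_phrase_query` and the default weights are assumptions chosen for the example, not values from the docs.

```python
import json


def rescore_phrase_query(field, text, window_size=100, rescore_weight=5.0):
    """Build a search body that rescores top hits with an exact phrase query.

    Hypothetical helper for illustration; it only returns a dict.
    """
    return {
        # First pass: cheap term match over the whole index.
        "query": {"match": {field: {"query": text}}},
        # Second pass: phrase match applied only to the top window_size hits.
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {"match_phrase": {field: {"query": text}}},
                # Weight of the original score in the combined score.
                "query_weight": 1.0,
                # Weight of the phrase score in the combined score.
                "rescore_query_weight": rescore_weight,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(rescore_phrase_query("field", "web developer"), indent=2))
```

Note that rescoring only reorders documents inside the window, so a phrase match ranked below the window by the first pass will not be promoted.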

Taciturnity answered 6/2, 2015 at 18:29 Comment(0)

I used the sample query below in my case, and it works. It returns both exact and fuzzy results, but the exact ones are boosted!

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "pala"
          }
        },
        {
          "fuzzy": {
            "name": "pala"
          }
        }
      ]
    }
  }
}
Mattland answered 28/5, 2014 at 16:1 Comment(0)

I do not have enough reputation to comment on James Adison's answer, which I agree with. What is still missing is the boost factor, which can be done using the following syntax:

{
  "match_phrase": {
    "fieldName": {
      "query": "query string for exact match",
      "boost": 10
    }
  }
}
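
To show where this fragment sits in a full request, here is a minimal Python sketch that wraps it in the `bool`/`should` nesting alongside a plain `match` clause. The helper name `boosted_phrase_query` is hypothetical, and `fieldName` remains a placeholder for a real field.

```python
import json


def boosted_phrase_query(field, text, phrase_boost=10):
    """Build a full search body: loose match plus a boosted exact-phrase clause.

    Hypothetical helper for illustration; it only returns a dict.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Scores documents containing any of the terms.
                    {"match": {field: {"query": text}}},
                    # Scores exact-phrase matches, multiplied by phrase_boost.
                    {
                        "match_phrase": {
                            field: {"query": text, "boost": phrase_boost}
                        }
                    },
                ],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(boosted_phrase_query("fieldName", "web developer"), indent=2))
```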
Bookmark answered 23/11, 2021 at 12:29 Comment(1)
I like this answer. Can you add the complete solution rather than a fragment of it? My understanding is that this still requires the whole "query": { "bool": { "should": nesting. - Polyclitus

I think this is already the default behaviour of a match query with the "or" operator. It will rank documents containing the phrase "web developer" first, and then those containing only the individual terms "web" or "developer". You can still boost your query as shown in the other answers. Correct me if I'm wrong.

Ashaashamed answered 29/5, 2014 at 6:54 Comment(0)
