How does minimum_should_match as a percentage actually work in a query_string search?

I would like to understand how minimum_should_match works in Elasticsearch for a query_string search:

GET /customers/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "大月亮",
            "default_field": "fullName",
            "minimum_should_match": "70%"
          }
        }
      ]
    }
  }
}

I experimented with the minimum_should_match percentage in the query above, and I get different results for Chinese text depending on the value.

I tried reading the documentation but did not clearly understand how this option works.

Julide answered 23/8, 2019 at 7:16 Comment(0)

The minimum_should_match parameter works for "should" clauses in the "bool" query. With this parameter you specify how many of the should clauses must match for a document to match the query.

Consider the following query:

{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user" : "kimchy" }
      },
      "filter": {
        "term" : { "tag" : "tech" }
      },
      "must_not" : {
        "range" : {
          "age" : { "gte" : 10, "lte" : 20 }
        }
      },
      "should" : [
        { "term" : { "tag" : "wow" } },
        { "term" : { "tag" : "elasticsearch" } },
        { "term" : { "tag" : "stackoverflow" } }
      ],
      "minimum_should_match" : 2,
      "boost" : 1.0
    }
  }
}

Here a document will only match if at least 2 of the should clauses match. This means that a document with both "stackoverflow" and "wow" in the "tag" field will match, but a document with only "elasticsearch" in the "tag" field will not (assuming the must, filter and must_not clauses are satisfied in both cases).
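To make the counting concrete, here is a minimal Python sketch of the rule described above. It only models the should part of the query (the must, filter and must_not clauses are ignored), and the helper name `satisfies_should` is hypothetical, not part of any Elasticsearch client.

```python
def satisfies_should(doc_tags, should_terms, minimum_should_match):
    """Return True if enough should-clause terms appear in the document's tags.

    Simplified model: each should clause is a term query on the same field,
    so a clause "matches" when its term is one of the document's tag values.
    """
    matched = sum(1 for term in should_terms if term in doc_tags)
    return matched >= minimum_should_match


# The three should clauses from the query above.
should_terms = ["wow", "elasticsearch", "stackoverflow"]

# Two should clauses match ("wow" and "stackoverflow"), so the minimum of 2 is met.
print(satisfies_should({"tech", "wow", "stackoverflow"}, should_terms, 2))  # True

# Only one should clause matches ("elasticsearch"), so the minimum of 2 is not met.
print(satisfies_should({"tech", "elasticsearch"}, should_terms, 2))  # False
```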

When using percentages, you specify the percentage of should clauses that must match. So if you have 4 should clauses and you set minimum_should_match to 50%, a document will be considered a match if at least 2 of those should clauses match. The number computed from the percentage is rounded down, so 50% of 3 clauses requires only 1 match.
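The arithmetic can be sketched in a few lines of Python. This is an approximation of the round-down rule described in the documentation, not Elasticsearch's actual implementation, and `required_should_clauses` is a hypothetical helper name. It handles plain integers and percentages (including negative values, which state how many clauses may be missing) and ignores the combination syntax.

```python
def required_should_clauses(total_clauses: int, spec: str) -> int:
    """Approximate how many optional clauses must match for a given spec.

    spec is an integer such as "2" or "-1", or a percentage such as
    "70%" or "-25%". Percentages are rounded down.
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        if percent < 0:
            # Negative percentage: this share of the clauses may be missing.
            allowed_missing = total_clauses * -percent // 100
            required = total_clauses - allowed_missing
        else:
            required = total_clauses * percent // 100
    else:
        number = int(spec)
        # Negative integer: this many clauses may be missing.
        required = total_clauses + number if number < 0 else number
    # In this sketch the result is kept within 0..total_clauses.
    return max(0, min(required, total_clauses))


print(required_should_clauses(4, "50%"))   # 2
print(required_should_clauses(3, "70%"))   # 2
print(required_should_clauses(3, "100%"))  # 3
print(required_should_clauses(4, "-25%"))  # 3
```

If the analyzer happens to split a three-character query into three terms (an assumption that depends on the analyzer in use), 70% would require 2 of them and 100% would require all 3, which would explain why changing the percentage changes the results.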

More about minimum_should_match can be found in the documentation. It describes the parameter as applying to "optional clauses", which are the "should" clauses in a "bool" query.

Iniquity answered 23/8, 2019 at 12:23 Comment(7)
@SorinPenteleiciuc in your case the option is useless, because you only have a "must" clause and no "should" clauses in your "bool" query. - Iniquity
Not really. I get different results as I increase the minimum_should_match value. - Julide
Okay, sorry, I now see that the option is used inside "query_string". If you use it there, you specify how many terms of the query must match before a document is considered a match. I don't know how this works with Chinese, because "大月亮" seems like 1 term to me, but maybe it's split into 3 terms, one per character. It depends on the analyzer used. You should check that with the analyze API. - Iniquity
The "minimum_should_match" option in "query_string" is explained in more detail on this page: elastic.co/guide/en/elasticsearch/reference/current/… - Iniquity
@Iniquity how can I set "minimum_should_match" while using the query DSL? In Java, I found that "minimum_should_match_field" accepts a string param. - Leakage
Assigning "minimum_should_match_field" = "2" or "100%" does not work using the query DSL in Java. [ elastic.co/guide/en/elasticsearch/reference/current/… ] - Leakage
@PrantaPalit "minimum_should_match_field" is not an option of the "bool" or "query_string" queries (it belongs to the "terms_set" query). It should be "minimum_should_match". Could you please try that? - Iniquity
