Elasticsearch "match_phrase" query and "fuzzy" query - can both be used in conjunction
I need a query that uses match_phrase together with fuzzy matching. However, I can't find any documentation on how to construct such a query. Also, when I try combining the two queries (nesting one inside the other), Elasticsearch throws errors. Is it possible to construct such a query?

Pleurisy answered 29/11, 2018 at 14:18 Comment(2)
multi_match might work, since it accepts both a phrase type and fuzziness, though there's a chance the phrase query also accepts fuzziness since it basically extends the match query. - Purchase
hey @Purchase, multi_match doesn't support fuzziness with the phrase type, as mentioned in this link: elastic.co/guide/en/elasticsearch/reference/current/…. I think in ES 6.x versions the only way to implement fuzzy search with match_phrase is to use Span Queries. For a single-field fuzzy search we can use the fuzzy query, as mentioned in this link: elastic.co/guide/en/elasticsearch/reference/current/… - Malaspina

You would need to make use of Span Queries.

The query below performs a phrase match combined with fuzzy matching for champions league on a sample field name, which is of type text.

If you want to search multiple fields, add another must clause for each additional field.

Notice that I've set slop: 0 and in_order: true, which enforces an exact phrase match (terms adjacent and in the given order), while the fuzzy behaviour comes from the fuzzy queries wrapped in each span_multi clause.

Sample Documents

POST span-index/mydocs/1
{
  "name": "chmpions leage"
}

POST span-index/mydocs/2
{
  "name": "champions league"
}

POST span-index/mydocs/3
{
  "name": "chompions leugue"
}

Span Query:

POST span-index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "span_near": {
            "clauses": [
              {
                "span_multi": {
                  "match": {
                    "fuzzy": {
                      "name": "champions"
                    }
                  }
                }
              },
              {
                "span_multi": {
                  "match": {
                    "fuzzy": {
                      "name": "league"
                    }
                  }
                }
              }
            ],
            "slop": 0,
            "in_order": true
          }
        }
      ]
    }
  }
}
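
Since the query needs one span_multi clause per word, it can be easier to generate the body in code than to write it by hand. Below is a minimal Python sketch of that idea; the function name build_fuzzy_phrase_query and its parameters are made up for illustration and are not part of any Elasticsearch client. It assumes the field is indexed with an analyzer that lowercases and splits on whitespace (the fuzzy query is term-level and is not analyzed), so adjust the tokenization if your analyzer differs.

```python
import json


def build_fuzzy_phrase_query(field, phrase, slop=0, in_order=True):
    """Build a search body that matches `phrase` fuzzily, word by word.

    Hypothetical helper: splits the phrase on whitespace and lowercases it,
    which only approximates what a standard-like analyzer does at index time.
    """
    terms = phrase.lower().split()
    if not terms:
        raise ValueError("phrase must contain at least one word")

    if len(terms) == 1:
        # A single word needs no positional constraint, so a plain
        # fuzzy query is enough.
        must_clause = {"fuzzy": {field: terms[0]}}
    else:
        # One span_multi-wrapped fuzzy query per word, kept adjacent and
        # in order by span_near.
        clauses = [
            {"span_multi": {"match": {"fuzzy": {field: term}}}}
            for term in terms
        ]
        must_clause = {
            "span_near": {
                "clauses": clauses,
                "slop": slop,
                "in_order": in_order,
            }
        }

    return {"query": {"bool": {"must": [must_clause]}}}


if __name__ == "__main__":
    body = build_fuzzy_phrase_query("name", "champions league")
    # The printed JSON can be sent as the body of POST span-index/_search.
    print(json.dumps(body, indent=2))
```

The resulting dictionary has the same shape as the span query in this answer, so it can be passed to whichever HTTP or Elasticsearch client you already use.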

Response:

{
  "took": 19,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "span-index",
        "_type": "mydocs",
        "_id": "2",
        "_score": 0.5753642,
        "_source": {
          "name": "champions league"
        }
      },
      {
        "_index": "span-index",
        "_type": "mydocs",
        "_id": "1",
        "_score": 0.5753642,
        "_source": {
          "name": "chmpions leage"
        }
      },
      {
        "_index": "span-index",
        "_type": "mydocs",
        "_id": "3",
        "_score": 0.5753642,
        "_source": {
          "name": "chompions leugue"
        }
      }
    ]
  }
}

Let me know if this helps!

Malaspina answered 29/11, 2018 at 15:57 Comment(3)
So we need to split a query like "champions league" into ["champions", "league"] and then build the DSL query from those words? - Kitsch
@Kitsch yes, that's correct. You can see how the query is constructed word by word. In a way, Span Queries, although much more verbose and longer, are more flexible. - Malaspina
Thanks @Karmal, I tried this solution, but the fuzzy query makes "best car" also hit "best cat". Still a long way to go. - Kitsch
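
One possible way to reduce false positives such as "best car" matching "best cat" is to set the fuzziness per word, so that short words must match exactly and only longer words tolerate an edit. The Python sketch below illustrates this using the long form of the fuzzy query (value plus fuzziness). The helper names and the length threshold of 4 are assumptions chosen for illustration, not recommended values, and whether this trade-off suits your data is something to test.

```python
def fuzziness_for(term, min_fuzzy_length=4):
    """Return the allowed edit distance for a term.

    Hypothetical rule: terms shorter than `min_fuzzy_length` must match
    exactly (0 edits); longer terms may differ by one edit.
    """
    return 0 if len(term) < min_fuzzy_length else 1


def fuzzy_span_clause(field, term):
    """Wrap a fuzzy query with an explicit fuzziness in span_multi."""
    return {
        "span_multi": {
            "match": {
                "fuzzy": {
                    field: {
                        "value": term,
                        "fuzziness": fuzziness_for(term),
                    }
                }
            }
        }
    }


def build_strict_fuzzy_phrase(field, phrase):
    """Build a span_near query whose short words are matched exactly."""
    terms = phrase.lower().split()
    return {
        "query": {
            "span_near": {
                "clauses": [fuzzy_span_clause(field, t) for t in terms],
                "slop": 0,
                "in_order": True,
            }
        }
    }


if __name__ == "__main__":
    import json

    # "car" has three letters, so it gets fuzziness 0 and "cat" no longer
    # qualifies; "best" still allows a single edit.
    print(json.dumps(build_strict_fuzzy_phrase("name", "best car"), indent=2))
```

With this rule, "best car" would still match a misspelling such as "bst car", but not "best cat".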
