Highlighting a matched phrase instead of individual words in Elasticsearch

We are working with percolators in Elasticsearch, and we need the highlighting to cover a whole phrase instead of going word by word.

For example, we run the following search:

curl -X GET "localhost:9200/my-index/_search" -H 'Content-Type: application/json' -d'
{
    "query" : {
        "percolate" : {
            "field": "query",
            "document" : {
                "title" : "A new bonsai tree in the office and jungle. The enojen tree in a kerojen."
            }
        }
    },
    "highlight": {
      "fields": {
        "title": {}
      }
    }
}
'

And we get the following output:

{  
   "took":12,
   "timed_out":false,
   "_shards":{  
      "total":5,
      "successful":5,
      "skipped":0,
      "failed":0
   },
   "hits":{  
      "total":45,
      "max_score":2.1576157,
      "hits":[  
         {  
            "_index":"my-index",
            "_type":"_doc",
            "_id":"2",
            "_score":2.1576157,
            "_source":{  
               "query":{  
                  "match":{  
                     "title":"the enojen tree in a kerojen"
                  }
               }
            },
            "fields":{  
               "_percolator_document_slot":[  
                  0
               ]
            },
            "highlight":{  
               "title":[  
                  "<em>A</em> new bonsai <em>tree</em> <em>in</em> <em>the</em> office and jungle. <em>The</em> <em>enojen</em> <em>tree</em> <em>in</em> <em>a</em> <em>kerojen</em>."
               ]
            }
         },
         {  
            "_index":"my-index",
            "_type":"_doc",
            "_id":"1",
            "_score":2.1576157,
            "_source":{  
               "query":{  
                  "match":{  
                     "title":"the bonsai tree in a jungle"
                  }
               }
            },
            "fields":{  
               "_percolator_document_slot":[  
                  0
               ]
            },
            "highlight":{  
               "title":[  
                  "<em>A</em> new <em>bonsai</em> <em>tree</em> <em>in</em> <em>the</em> office and <em>jungle</em>. <em>The</em> enojen <em>tree</em> <em>in</em> <em>a</em> kerojen."
               ]
            }
         }
      ]
   }
}

As you can see, every matched word is highlighted separately, but we want to get something like this: <em>The enojen tree in a kerojen</em>. There is a related issue for it.

We searched for a solution and found this question, but it is about Solr.

It says there are two possible parameters (from Lucene) for this purpose: usePhraseHighlighter and mergeContiguous.

So, how can we get the full phrase highlighted as a single span, instead of every word being highlighted individually, for a phrase query?

Thank you.

Deneb answered 7/5, 2018 at 15:54 Comment(1)
Issue connected to it: github.com/elastic/elasticsearch/issues/29561 (Frontlet)

For the record, this should now be fixed as of Elasticsearch 8.10, as per https://github.com/elastic/elasticsearch/pull/96068.
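
On older versions, one possible workaround is to post-process the highlight fragments on the client. The sketch below is only an illustration, not an Elasticsearch feature and not how Lucene's mergeContiguous works internally: it assumes the default <em> and </em> tags from the question, and the function name merge_adjacent_highlights is made up for this example. It merges highlighted words that are separated only by whitespace, so it also joins runs that are not real phrase matches (for instance "tree in the" in the first sentence).

```python
import re

# A closing </em>, optional whitespace, then an opening <em>.
# Assumes the default highlighter tags; adjust if pre_tags/post_tags are customized.
_ADJACENT_EM = re.compile(r"</em>(\s*)<em>")


def merge_adjacent_highlights(fragment: str) -> str:
    """Collapse <em> spans separated only by whitespace into one <em> span."""
    # Dropping the inner "</em>...<em>" pair and keeping the whitespace
    # joins the neighbouring spans into a single highlighted run.
    return _ADJACENT_EM.sub(r"\1", fragment)


# Highlight fragment copied from the hit with _id 2 in the question.
fragment = (
    "<em>A</em> new bonsai <em>tree</em> <em>in</em> <em>the</em> office and jungle. "
    "<em>The</em> <em>enojen</em> <em>tree</em> <em>in</em> <em>a</em> <em>kerojen</em>."
)
print(merge_adjacent_highlights(fragment))
```

For that fragment, the last sentence comes out as <em>The enojen tree in a kerojen</em>, which is the shape asked for in the question. Words separated by punctuation stay in separate spans.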

Gilgamesh answered 11/12, 2023 at 16:43 Comment(0)
