What settings are best for an Elasticsearch query to find full words and half words

I can't figure out what to set in the query. For example, I search for 'something sea'. I expect results like 'something sea', 'something tea', and also 'something seaside', but the last one never appears in the results. I have fuzziness set to auto, a 90% match requirement, and a multi_match query inside 'must'; the words are all in one field, 'name'. I can't find settings that return 'something seaside'. Maybe it is not even possible? Elasticsearch 7

Romanfleuve answered 28/3 at 15:4 Comment(0)

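One way is the dictionary_decompounder token filter. It splits compound words into the subwords it finds in a word list, so 'seaside' is also indexed as 'sea' and 'side'. You can define the word list inline in the filter settings or load it from a file.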

Mapping

PUT /decompounder
{
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "lowercase_english_decompounder_standard_analyzer"
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "lowercase_english_decompounder_standard_analyzer": {
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "english_decompounder_filter"
                    ]
                }
            },
            "filter": {
                "english_decompounder_filter": {
                    "type": "dictionary_decompounder",
                    "word_list": [
                        "some",
                        "thing",
                        "sea",
                        "side"
                    ]
                }
            }
        }
    }
}
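
Filter with a word list from a file (a sketch, not tested here: the file name analysis/decompound_words.txt is a hypothetical example; the path must be absolute or relative to the Elasticsearch config directory, and the file should be UTF-8 with one word per line). This fragment would replace the "filter" section of the settings above:

"filter": {
    "english_decompounder_filter": {
        "type": "dictionary_decompounder",
        "word_list_path": "analysis/decompound_words.txt"
    }
}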

Documents

PUT /decompounder/_bulk
{"create":{"_id":1}}
{"name":"something sea"}
{"create":{"_id":2}}
{"name":"something tea"}
{"create":{"_id":3}}
{"name":"something seaside"}

Query with your parameters

GET /decompounder/_search?filter_path=hits.hits
{
    "query": {
        "multi_match" : {
            "query": "some sea",
            "analyzer": "lowercase_english_decompounder_standard_analyzer", 
            "fields": ["name"],
            "fuzziness": "auto",
            "minimum_should_match": "90%"
        }
    }
}

Response

{
    "hits" : {
        "hits" : [
            {
                "_index" : "decompounder",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 0.7876643,
                "_source" : {
                    "name" : "something sea"
                }
            },
            {
                "_index" : "decompounder",
                "_type" : "_doc",
                "_id" : "3",
                "_score" : 0.7876643,
                "_source" : {
                    "name" : "something seaside"
                }
            },
            {
                "_index" : "decompounder",
                "_type" : "_doc",
                "_id" : "2",
                "_score" : 0.5831994,
                "_source" : {
                    "name" : "something tea"
                }
            }
        ]
    }
}
Interfluve answered 29/3 at 8:13 Comment(0)
