How to dynamically change synonyms for ElasticSearch

My synonyms are stored in a database and, when the synonyms are changed in the database, I want to update any values in the index which may be changed as a result of the synonym change.

There are two parts to this that I can think of. One, figuring out which documents to re-index. Two, figuring out how to tell ElasticSearch that the synonyms have changed. I am struggling with the 2nd one - telling ElasticSearch that the synonyms have changed.

A similar question has been asked - see Change dynamically elasticsearch synonyms - but from reading the answers in that issue, I have not been able to figure out what I need.

Currently, my configuration file looks something like the following:

index :
  analysis :
    analyzer :
      myanalyzer :
        filter: [standard, mysynonymfilter]
    filter :
      mysynonymfilter :
        type : synonym
        synonyms : synonyms.txt
        ignore_case : false
        expand : true
        format : solr

My idea was to do something like the following:

curl -XPUT 'http://127.0.0.1:9200/foo/_settings'  -d '
{
    "filter" : {
        "synonym" : {
            "type" : "mysynonymfilter",
            "synonyms" : [
                "cosmos, universe"
            ] 
        }
    }
}
'

but that doesn't seem to do what I want. That is, the index settings do not get updated as far as I can tell.

Is what I am trying to do possible? And if so, any idea what I am doing wrong?

Also, I am fairly sure I could get this to work by updating the synonym file (if I have to use a file), but that's a bit more complicated and something I'd like to avoid.

Thanks for your help, Eric

Borlow answered 27/8, 2013 at 22:49 Comment(0)
B
11

It turns out that you can tell ElasticSearch programmatically that the synonyms have changed. That is, it is not necessary to update the synonym file. Here are the basic steps:

  • Close the index.
  • Update the index settings with the new synonym list. To be safe, I am updating all of the analyzers, tokenizers and char filters for the index (not just the synonym filter) - but I am not sure that is necessary.
  • Open the index.
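The three steps above can be sketched roughly as follows. This is a minimal sketch, not the answerer's actual code: the helper names `synonym_update_steps` and `run_steps`, the index name `foo`, and the base URL are assumptions for illustration, and the settings body assumes the filter is defined under `analysis.filter`, as in the question's configuration.

```python
import json
import urllib.request


def synonym_update_steps(index, filter_name, synonyms):
    """Build the close -> update settings -> open requests as (method, path, body)."""
    settings = {
        "analysis": {
            "filter": {
                # Hypothetical filter definition; adjust the options to match your index.
                filter_name: {"type": "synonym", "synonyms": list(synonyms)}
            }
        }
    }
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", settings),
        ("POST", f"/{index}/_open", None),
    ]


def run_steps(base_url, steps):
    """Send each request in order; urlopen raises on an HTTP error status."""
    for method, path, body in steps:
        data = None if body is None else json.dumps(body).encode("utf-8")
        request = urllib.request.Request(
            base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            response.read()


steps = synonym_update_steps("foo", "mysynonymfilter", ["cosmos, universe"])
for method, path, _ in steps:
    print(method, path)
# To apply against a running node (assumed URL):
# run_steps("http://127.0.0.1:9200", steps)
```

Note that if the settings update fails, the index stays closed, so real code would probably want to reopen it in a `finally` block.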
Borlow answered 21/11, 2014 at 16:56 Comment(2)
How fast is this method? - Mastoidectomy
Why does this weird way of doing it have 10 upvotes? - Student
S
7

I know this is an old thread but as of ES 7.5 they have added a new feature to update synonyms. Have a look at their documentation.

You need to issue a POST request like this: POST /twitter/_reload_search_analyzers

This reloads all the search analyzers. Also make sure that the synonym token filter has the updateable flag set to true, like this: "updateable": true.

PS: This feature is part of X-Pack and comes under the basic license which is free.
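As a rough illustration of the setup this answer describes, the sketch below builds index settings with an updateable synonym filter and the matching reload path. The names `updateable_synonym_settings`, `reload_path`, `my_synonyms`, and `my_search_analyzer`, and the file path `analysis/synonym.txt`, are assumptions made for the example; only the `updateable` flag and the `_reload_search_analyzers` endpoint come from the answer itself.

```python
import json


def updateable_synonym_settings(synonyms_path):
    """Index-creation body with a file-backed synonym filter marked as updateable."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name.
                    "my_synonyms": {
                        "type": "synonym",
                        "synonyms_path": synonyms_path,
                        "updateable": True,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name; intended for use as a search_analyzer.
                    "my_search_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_synonyms"],
                    }
                },
            }
        }
    }


def reload_path(index):
    """Path of the reload endpoint for one index."""
    return f"/{index}/_reload_search_analyzers"


body = updateable_synonym_settings("analysis/synonym.txt")
print(json.dumps(body, indent=2))
print("POST", reload_path("twitter"))
```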

Stunsail answered 17/1, 2020 at 19:45 Comment(0)
P
3

I know this is an old thread, but in case it helps someone, the answer can be found here:

If you specify stopwords inline with the stopwords parameter, your only option is to close the index and update the analyzer configuration with the update index settings API, then reopen the index.

Updating stopwords is easier if you specify them in a file with the stopwords_path parameter. You can just update the file (on every node in the cluster) and then force the analyzers to be re-created by either of these actions:

  • Closing and reopening the index (see open/close index), or
  • Restarting each node in the cluster, one by one

Parenthesis answered 12/7, 2017 at 7:4 Comment(1)
It's worth noting that this procedure for stopwords is the recommendation from the docs on Using Synonyms for updating synonyms as well: "See Updating Stopwords for techniques that can be used to refresh the synonyms list." - Monticule
R
2

There is a project for reloading the synonym file: lindstromhenrik/elasticsearch-analysis-file-watcher-synonym, although I don't know if it works with the latest versions. You could start by using the plugin and expanding synonyms at query time. That way every query sees the current synonyms, and you don't have to reindex the documents you guess are affected by changes to the synonyms file.

Roxane answered 28/8, 2013 at 7:13 Comment(1)
Thanks for the response, but I believe I know how to solve my issue if I update the synonym file directly, but my question was intended to deal with not having to update the synonym file (and instead updating the synonyms via a REST command). - Borlow
R
1

You've flipped mysynonymfilter and synonym in your final curl command. The type should be synonym.

Removed answered 29/10, 2015 at 3:4 Comment(0)
L
0

As of 2023, using version 7.16.3, I put the *.txt synonym file in the {ES_HOME}/config/analysis folder.

Below are the index settings I use to create the index:

PUT 'http://localhost:9200/indexName'

{
  "settings": {
    "index.number_of_replicas": 0,
    "index.max_ngram_diff": 15,
    "index.default_pipeline": "my_pipeline",
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        },
        "standardTokenizer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        },
        "mySynonymAnalyzer":{
          "tokenizer":"standard",
          "filter":[  
            "mySynonymFilter",
            "lowercase"
          ]
        }
      },
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 15
        },
        "mySynonymFilter":{
          "type":"synonym",
          "synonyms_path":"analysis/synonym.txt",
          "updateable": true
        }
      }
    }
  }
}

Below are the mappings that apply the synonym analyzer to the actual field:

PUT 'http://localhost:9200/indexName/_mapping'

{
  "properties": {
    "myField": {
      "type": "text",
      "analyzer": "standardTokenizer",
      "search_analyzer": "mySynonymAnalyzer",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    }
  }
}

Keep in mind that because the synonym filter is updateable, its analyzer has to be used as the search_analyzer, not as the index-time analyzer. Otherwise, Elasticsearch returns an error.
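To make that constraint concrete, here is a small local check that flags fields whose index-time `analyzer` depends on an updateable filter. It is only a sketch: the function names `updateable_analyzers` and `index_time_misuse` are made up for this example, it inspects top-level fields only (no nested objects or multi-fields), and it does not replace the validation Elasticsearch performs itself.

```python
def updateable_analyzers(settings):
    """Names of analyzers that use at least one filter marked updateable."""
    analysis = settings.get("settings", {}).get("analysis", {})
    updateable = {
        name
        for name, spec in analysis.get("filter", {}).items()
        if spec.get("updateable")
    }
    return {
        name
        for name, spec in analysis.get("analyzer", {}).items()
        if updateable.intersection(spec.get("filter", []))
    }


def index_time_misuse(settings, mapping):
    """Top-level fields whose index-time `analyzer` depends on an updateable filter."""
    risky = updateable_analyzers(settings)
    return sorted(
        field
        for field, spec in mapping.get("properties", {}).items()
        if spec.get("analyzer") in risky
    )


# Trimmed-down version of the settings and mapping from this answer.
settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "standardTokenizer": {"tokenizer": "standard", "filter": ["lowercase"]},
                "mySynonymAnalyzer": {
                    "tokenizer": "standard",
                    "filter": ["mySynonymFilter", "lowercase"],
                },
            },
            "filter": {
                "mySynonymFilter": {
                    "type": "synonym",
                    "synonyms_path": "analysis/synonym.txt",
                    "updateable": True,
                }
            },
        }
    }
}
good_mapping = {
    "properties": {
        "myField": {
            "type": "text",
            "analyzer": "standardTokenizer",
            "search_analyzer": "mySynonymAnalyzer",
        }
    }
}
# Same field, but with the synonym analyzer wrongly used at index time.
bad_mapping = {
    "properties": {"myField": {"type": "text", "analyzer": "mySynonymAnalyzer"}}
}

print(index_time_misuse(settings, good_mapping))
print(index_time_misuse(settings, bad_mapping))
```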

To force an update of the synonyms list, update the synonym.txt file on each Elasticsearch node, then call the two APIs below. This step is required every time the synonyms change.

POST 'http://localhost:9200/indexName/_reload_search_analyzers'
POST 'http://localhost:9200/indexName/_cache/clear?request=true'
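The same two calls could be scripted along these lines. This is a sketch under assumptions: `refresh_requests`, `post_json`, and `refresh_synonyms` are hypothetical helper names, the base URL is the local one used above, and error handling is left out.

```python
import json
import urllib.request


def refresh_requests(index):
    """Paths to POST, in order, after synonym.txt has been updated on every node."""
    return [
        f"/{index}/_reload_search_analyzers",
        f"/{index}/_cache/clear?request=true",
    ]


def post_json(url):
    """POST with an empty body and decode the JSON response."""
    request = urllib.request.Request(url, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def refresh_synonyms(base_url, index):
    """Reload the search analyzers, then clear the request cache."""
    return [post_json(base_url + path) for path in refresh_requests(index)]


paths = refresh_requests("indexName")
for path in paths:
    print("POST", path)
# Against a running node (assumed URL):
# refresh_synonyms("http://localhost:9200", "indexName")
```

Clearing the request cache after the reload keeps cached search results from being served with the old synonyms.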
Logo answered 8/11, 2023 at 14:58 Comment(0)
