Trying to set the max_gram and min_gram in Elasticsearch

I'm trying to deploy a Ruby on Rails app on an Ubuntu 16.04 EC2 server, but the deploy fails with an error about the difference between max_gram and min_gram in Elasticsearch. I don't have any experience with Elasticsearch yet, so I'm totally lost here. I need some guidance on how to set these values and how to avoid this problem in the future.

The first time I deployed, the connection to localhost:9200 was refused, so I checked that the service was running and checked the firewall. In the end I did a clean install and configured everything in elasticsearch.yml. Elasticsearch is now running and working, but when I try to deploy again I get an error. I've searched a lot online and there is plenty of documentation, but I still don't understand where to set these values.

This is the error I'm getting in the log:

-----> Migrating database...
rake aborted!
StandardError: An error has occurred, all later migrations canceled:

[400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."}],"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."},"status":400}
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/transport/base.rb:205:in `__raise_transport_error'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/transport/base.rb:323:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/client.rb:131:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-api-6.0.2/lib/elasticsearch/api/namespace/common.rb:21:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-api-6.0.2/lib/elasticsearch/api/actions/indices/create.rb:86:in `create'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:16:in `create'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:203:in `create_index'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:270:in `reindex_scope'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:196:in `reindex'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/model.rb:59:in `searchkick_reindex'
/home/deploy/catalogindustry/releases/20190807135404/db/migrate/20180405153226_validated_true.rb:4:in `change'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:789:in `exec_migration'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:773:in `block (2 levels) in migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:772:in `block in migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/connection_adapters/abstract/connection_pool.rb:398:in `with_connection'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:771:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:951:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1232:in `block in execute_migration_in_transaction'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1302:in `ddl_transaction'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1231:in `execute_migration_in_transaction'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1203:in `block in migrate_without_lock'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1202:in `each'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1202:in `migrate_without_lock'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1150:in `block in migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1319:in `with_advisory_lock'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1150:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1006:in `up'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:984:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/tasks/database_tasks.rb:163:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/railties/databases.rake:58:in `block (2 levels) in '
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/rake-12.3.1/exe/rake:27:in `'
/home/deploy/.rbenv/versions/2.3.1/bin/bundle:23:in `load'
/home/deploy/.rbenv/versions/2.3.1/bin/bundle:23:in `

There are no indices in Elasticsearch, and there is nothing about this setting in the default template.

Russo answered 7/8, 2019 at 13:44 Comment(4)
Can you tell which app this is? Are you using the elasticsearch-ruby client? Which version of ES are you using? – Orangy
Yes, elasticsearch-ruby is in the Gemfile and installed. I'm using the current version of ES, 7.3. – Russo
I updated the log with more lines. The Ruby app is a catalog website. – Russo
If you're using ES 7.3, you need to make sure to use the gem version 7.x as well. – Orangy
W
16

I faced a similar issue, and the error message below explains the problem clearly.

[400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."}],"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."},"status":400}

By default, the difference between max_gram and min_gram in the NGram tokenizer can't be more than 1. To allow a larger difference, add the setting below to your index settings.

"max_ngram_diff" : "50" --> choose this number according to your requirements.

Below are my index settings. The loginNGram tokenizer has a difference of 47 between max_gram and min_gram, so I set max_ngram_diff to 50.

{ 
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "prefix": {
                        "type": "custom",
                        "filter": [
                            "lowercaseFilter"
                        ],
                        "tokenizer": "edgeNGramTokenizer"
                    }
                },
                "tokenizer": {
                    "edgeNGramTokenizer": {
                        "token_chars": [
                            "letter",
                            "digit"
                        ],
                        "min_gram": "1",
                        "type": "edgeNGram",
                        "max_gram": "40"
                    },
                    "loginNGram": {
                        "type": "nGram",
                        "min_gram": "3",
                        "max_gram": "50"
                    }
                }
            },
            "number_of_shards": "1",
            "number_of_replicas": "0",
            "max_ngram_diff" : "50"
        }
    }
} 
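
As a rough way to pick the value, the smallest max_ngram_diff that passes validation is the largest max_gram minus min_gram among the n-gram tokenizers and token filters. The following is a minimal Ruby sketch of that arithmetic, not part of any Elasticsearch client. The helper name required_ngram_diff is hypothetical, and the sketch assumes that only the ngram/nGram types (not the edge n-gram ones) are checked against this limit.

```ruby
require "json"

# Component types assumed to be subject to index.max_ngram_diff
# (new and old spellings). Edge n-gram types are deliberately excluded.
NGRAM_TYPES = %w[ngram nGram].freeze

# Hypothetical helper: returns the largest (max_gram - min_gram) found among
# n-gram tokenizers and token filters in an index settings hash, or 0 if none.
def required_ngram_diff(settings)
  analysis = settings.dig("settings", "index", "analysis") || {}
  components = analysis.fetch("tokenizer", {}).values +
               analysis.fetch("filter", {}).values

  components
    .select { |c| NGRAM_TYPES.include?(c["type"]) }
    # Integer() accepts both "50" and 50; fall back to the documented
    # defaults (min_gram 1, max_gram 2) when a value is omitted.
    .map { |c| Integer(c.fetch("max_gram", 2)) - Integer(c.fetch("min_gram", 1)) }
    .max || 0
end

# Tokenizer section copied from the settings above.
settings = JSON.parse(<<~JSON)
  {
    "settings": {
      "index": {
        "analysis": {
          "tokenizer": {
            "edgeNGramTokenizer": { "type": "edgeNGram", "min_gram": "1", "max_gram": "40" },
            "loginNGram": { "type": "nGram", "min_gram": "3", "max_gram": "50" }
          }
        }
      }
    }
  }
JSON

puts required_ngram_diff(settings) # prints 47
```

Any max_ngram_diff at or above the printed number should be accepted, which is why 50 works for these settings.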

Edit: The official Elastic documentation explains that max_gram defaults to 2 and min_gram defaults to 1, so the default difference is 1, which matches the default limit of index.max_ngram_diff. Any larger difference raises the exception above. A snippet from the same doc:

The index level setting index.max_ngram_diff controls the maximum allowed difference between max_gram and min_gram.

Witness answered 7/8, 2019 at 16:41 Comment(4)
Thank you very much for your detailed answer, but when I list the indices with this command: curl -X GET "localhost:9200/_cat/indices?v&pretty" the response contains only the header row: health status index uuid pri rep docs.count docs.deleted store.size pri.store.size. That means no index was created, so where does the error come from? – Russo
Yes, because of this error the index wasn't created, so you need to create the index again with the max_ngram_diff setting. – Witness
@CesarRodriguez, any luck? – Witness
Hi Amit, sorry for the delay, I was traveling. I didn't find a solution for this issue, so I ended up creating a new instance and installing everything from scratch, and I was able to deploy. You were right about the index needing to be created with that setting; for some reason it was created properly on the new instance, so the website is now up and running with no issues. – Russo
H
5

One can also use an index template to apply the setting automatically to all new indices:

curl -X PUT "localhost:9200/_index_template/template_1?pretty" -H 'Content-Type: application/json' -d'
{
  "index_patterns": [
      "*"
  ],
  "template": {
    "settings": {
      "index": {
         "max_ngram_diff": 50
      }
    }
  }
}
'

The template will not be deleted by removing every index, but has to be removed manually:

curl -X DELETE "localhost:9200/_index_template/template_1"
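
Note that the _index_template endpoint was introduced in Elasticsearch 7.8, while the question mentions 7.3, where the legacy _template endpoint would be the one to use. Below is a minimal Ruby sketch, using only the standard library, that builds the equivalent legacy request. The helper name legacy_template_request is hypothetical, the host and template name are taken from the curl example, and the request is only constructed here, not sent.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: builds (but does not send) a PUT request for the
# legacy _template endpoint. In the legacy format, "settings" sits at the
# top level of the body rather than under a "template" key.
def legacy_template_request(base_url, name, max_ngram_diff)
  uri = URI.join(base_url, "/_template/#{name}")
  request = Net::HTTP::Put.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(
    "index_patterns" => ["*"],
    "settings" => { "index" => { "max_ngram_diff" => max_ngram_diff } }
  )
  request
end

request = legacy_template_request("http://localhost:9200", "template_1", 50)
puts request.path # prints /_template/template_1
puts request.body

# To actually send it (requires a running Elasticsearch node):
# Net::HTTP.start(request.uri.host, request.uri.port) { |http| http.request(request) }
```

As with the curl version, the "*" pattern applies the setting to every new index, so a narrower pattern may be preferable in practice.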
Hux answered 20/4, 2021 at 14:48 Comment(0)
