How to not-analyze in ElasticSearch?
A

4

50

I've got a field in an ElasticSearch index which I do not want to have analyzed, i.e. it should be stored and compared verbatim. The values will contain letters, numbers, whitespace, dashes, slashes and maybe other characters.

If I do not give an analyzer in my mapping for this field, the default still uses a tokenizer which hacks my verbatim string into chunks of words. I don't want that.

Is there a super simple analyzer which, basically, does not analyze? Or is there a different way of denoting that this field shall not be analyzed?

I only create the index, I don't do anything else. I can use analyzers like "english" for other fields, which seem to be built-in names for pre-configured analyzers. Is there a list of other names? Maybe there's one fitting my needs (namely doing nothing with the input).

This is my mapping currently:

{
  "my_type": {
    "properties": {
      "my_field1": { "type": "string", "analyzer": "english" },
      "my_field2": { "type": "string" }
    }
  }
}

my_field1 is language-dependent; this seems to work. my_field2 shall be verbatim. I'd like to give an analyzer there which simply does not do anything.

A sample value for my_field2 would be "B45c 14/04".

Aloysia answered 14/8, 2013 at 15:31 Comment(0)
L
58
"my_field2": {
    "type": "string",
    "index": "not_analyzed"
}

See https://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-core-types.html for further info.
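
To illustrate why this matters for the sample value from the question, here is a minimal Python sketch. It is only a rough approximation under stated assumptions: `rough_standard_tokens` is a hypothetical helper that lowercases and splits on non-alphanumeric ASCII characters, which mimics the general effect of an analyzed field but is not Elasticsearch's actual tokenizer (the real one follows Unicode segmentation rules and may differ in detail).

```python
import re


def rough_standard_tokens(text):
    """Crude stand-in for an analyzed field (NOT the real Elasticsearch analyzer):
    lowercase the input, then split on anything that is not an ASCII letter or digit."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def not_analyzed_tokens(text):
    """A not_analyzed field indexes the whole value as one unmodified term."""
    return [text]


value = "B45c 14/04"
print(rough_standard_tokens(value))  # ['b45c', '14', '04']
print(not_analyzed_tokens(value))    # ['B45c 14/04']
```

With the analyzed variant, an exact-match lookup for "B45c 14/04" has no single term to match against; with not_analyzed, the full string is the term.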

Lowpitched answered 14/8, 2013 at 16:59 Comment(4)
Ah! That's what I was looking for. I stumbled across that not_analyzed several times, but always thought using it would mean that it cannot be searched at all (apparently that's what no is used for). The link to the documentation was enlightening, thank you! (And given time I will accept this answer unless something even more helpful appears.) Aloysia
@Alfe, you can have a look at this answer for more info, including the option index: no. Laddie
Can we set it globally, so that it doesn't analyze any field of type string? Tolson
What happens by default if we don't add 'index': 'not_analyzed'? Is there a default analyzer? Emersion
G
54

This is no longer true, because the string type has been removed and replaced by the keyword and text types, as described here. Instead, you should use the keyword type with "index": true | false.

For example, OLD:

{
  "foo": {
    "type": "string",
    "index": "not_analyzed"
  }
}

becomes NEW:

{
  "foo": {
    "type": "keyword",
    "index": true
  }
}

This means the field is indexed, but because it is typed as keyword it is implicitly not analyzed. If you would like to have the field analyzed, you need to use the text type.
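
The translation from old to new field definitions can be sketched as a small Python helper. This is an illustration under assumptions, not an official migration tool: `migrate_string_field` is a hypothetical name, and mapping the legacy "index": "no" to an unindexed keyword field is my own choice (a text field with "index": false would be an equally plausible target). The sketch omits "index": true for searchable fields because that is the default.

```python
def migrate_string_field(field):
    """Translate a legacy "string" field definition into a keyword/text one.

    Hypothetical helper for illustration; other field types are returned unchanged.
    """
    if field.get("type") != "string":
        return dict(field)

    # Carry over everything except the two settings whose meaning changed.
    migrated = {k: v for k, v in field.items() if k not in ("type", "index")}
    legacy_index = field.get("index", "analyzed")  # "analyzed" was the old default

    if legacy_index == "analyzed":
        migrated["type"] = "text"        # analyzed full text
    elif legacy_index == "not_analyzed":
        migrated["type"] = "keyword"     # verbatim, searchable ("index": true is the default)
    elif legacy_index == "no":
        # Assumption: keep the value as an unsearchable keyword field.
        migrated["type"] = "keyword"
        migrated["index"] = False
    else:
        raise ValueError(f"unexpected legacy index value: {legacy_index!r}")
    return migrated


print(migrate_string_field({"type": "string", "index": "not_analyzed"}))
# {'type': 'keyword'}
print(migrate_string_field({"type": "string", "analyzer": "english"}))
# {'analyzer': 'english', 'type': 'text'}
```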

Gunman answered 13/2, 2018 at 17:49 Comment(4)
It seems it's "enabled": true/false instead of "index": elastic.co/guide/en/elasticsearch/reference/6.5/enabled.html Mediatize
@Mediatize No, "enabled": false will prevent the field from being parsed at all (you can only get it back by manually reading the _source field), while "index": false will only prevent searching/aggregations. Mylor
Besides, enabled is only for object fields or the entire document mapping. Not for keyword fields. Lament
Will "index": false on a field of type text still analyze the field? Beeves
L
3

The keyword analyzer can also be used.

// don't actually use this, use "index": "not_analyzed" instead
{
  "my_type": {
    "properties": {
      "my_field1": { "type": "string", "analyzer": "english" },
      "my_field2": { "type": "string", "analyzer": "keyword" }
    }
  }
}

As noted here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keyword-analyzer.html, it makes more sense to mark those fields as not_analyzed.

However, the keyword analyzer can be useful when it is set as the default for a whole index.

UPDATE: As noted in the comments, string is no longer supported in 5.x.
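
As a sketch of the index-wide variant mentioned above, the following Python snippet builds an index-creation request body that registers the keyword analyzer under the name "default" in the index analysis settings. Treat it as a minimal example under assumptions: the function name is hypothetical, the body is only printed (no request is sent), and you should check the settings against the documentation for your Elasticsearch version.

```python
import json


def index_body_with_default_keyword_analyzer():
    """Build an index-creation body whose default analyzer is the built-in keyword analyzer.

    Hypothetical helper for illustration; it only constructs the JSON body.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Analyzed fields without an explicit analyzer fall back to "default".
                    "default": {"type": "keyword"}
                }
            }
        }
    }


print(json.dumps(index_body_with_default_keyword_analyzer(), indent=2))
```

Fields that name their own analyzer, such as my_field1 with "english", are not affected by the default.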

Langrage answered 7/8, 2015 at 14:21 Comment(2)
Actually, the keyword analyzer does analyze the field, but only once, as a whole. That might not be desired when it is set up on large text fields with contents of around a megabyte: the whole value goes into the index and eats resources. Corelli
Moreover, the string field is unsupported for indexes created in 5.x in favor of the text and keyword fields. Karleenkarlen
D
1

For API 8.5 the old answers won't work. I found the solution by accident: just set the property to "enabled": false. The official documentation includes an example: https://www.elastic.co/guide/en/elasticsearch/reference/current/enabled.html

Deneb answered 17/3, 2023 at 6:54 Comment(0)
