How can I write an Elasticsearch terms aggregation that buckets by the entire term rather than by individual tokens? For example, I would like to aggregate by state, but the following returns new, york, jersey, and california as individual buckets instead of New York, New Jersey, and California as expected:
curl -XPOST "http://localhost:9200/my_index/_search" -d'
{
  "aggs": {
    "states": {
      "terms": {
        "field": "states",
        "size": 10
      }
    }
  }
}'
My use case is like the one described here: https://www.elastic.co/guide/en/elasticsearch/guide/current/aggregations-and-analysis.html, with one difference: in my case the field is an array.
Example object:
{
"states": ["New York", "New Jersey", "California"]
}
It seems that the proposed solution (mapping the field as not_analyzed) does not work for arrays.
My mapping:
{
  "properties": {
    "states": {
      "type": "object",
      "fields": {
        "raw": {
          "type": "object",
          "index": "not_analyzed"
        }
      }
    }
  }
}
I have tried replacing "object" with "string", but that does not work either.
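(For reference, on Elasticsearch versions where not_analyzed applies, a multi-field mapping along these lines is what the linked guide describes; it works for array values too, since each array element is treated as a separate value of the same field. The field names here are taken from the question; this is a sketch, not a tested mapping:)

```json
{
  "properties": {
    "states": {
      "type": "string",
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

With such a mapping, the aggregation would target the sub-field, i.e. "field": "states.raw" instead of "field": "states".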
About the missing `.raw`: I had tried so many different combinations of mappings and searches that I ended up posting that one. Your answer led me to detect that my real problem is that I am using the elasticsearch-transport-couchbase plugin to import my documents into Elasticsearch, and the plugin changes my document structure by wrapping it in a "doc" attribute. Thanks to your answer, I added a document manually and it worked; that is how I detected the surrounding "doc" attribute in the other documents. – Realpolitik