I'm using Elasticsearch 2.4 and the official Python client to perform simple queries. My code:
from elasticsearch import Elasticsearch

es_client = Elasticsearch("localhost:9200")
index = "indexName"
doc_type = "docType"

def search(query, search_size):
    body = {
        "fields": ["title"],
        "size": search_size,
        "query": {
            "query_string": {
                "fields": ["file.content"],
                "query": query
            }
        }
    }
    response = es_client.search(index=index, doc_type=doc_type, body=body)
    return response["hits"]["hits"]

search("python", 10)  # Works fine.
The problem occurs when my query contains unbalanced parentheses or brackets. For example, with search("python {programming", 10)
ES throws:
elasticsearch.exceptions.RequestError: TransportError(400, u'search_phase_execution_exception', u'Failed to parse query [python {programming}]')
Is that the expected behavior of ES? Doesn't it use a tokenizer to remove all those characters?
Note: This happens to me using Java too.
re.sub(r'(\+|\-|\=|&&|\|\||\>|\<|\!|\(|\)|\{|\}|\[|\]|\^|"|~|\*|\?|\:|\\|\/)', r'\\\1', query)
– Roughandtumble
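For readers who want to try the escaping approach from the comment above, here is a minimal sketch using only the standard library. The function name escape_query_string is hypothetical (it is not part of the Elasticsearch client), and the set of reserved characters is simply the one listed in the comment, so treat it as an assumption rather than a complete list for every ES version.

```python
import re

# Reserved characters and operators of the query_string syntax, taken from the
# comment above (an assumption, not checked against every ES version).
_RESERVED = re.compile(
    r'(\\|\+|-|=|&&|\|\||>|<|!|\(|\)|\{|\}|\[|\]|\^|"|~|\*|\?|:|/)'
)


def escape_query_string(query):
    """Prefix each reserved character or operator with a backslash.

    Hypothetical helper: the name and the character set are illustrative.
    """
    # r'\\\1' expands to a literal backslash followed by the matched text.
    return _RESERVED.sub(r'\\\1', query)


print(escape_query_string("python {programming"))  # python \{programming
```

The escaped string could then be passed as the query argument of search() from the question, e.g. search(escape_query_string("python {programming"), 10). Whether escaping or stripping these characters is the better choice probably depends on whether users are expected to type query operators at all.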