Elasticsearch - similarity for countries

I have a document, which contains many fields, one of them is country. There are many documents with the same country.

When I run a match query or a fuzzy search against country, for example querying for Belgium, it returns the documents whose country matches Belgium, but they all have different scores. I believe this is because of the TF-IDF similarity, the presence of the term belgium in other fields of the documents, etc.

I'd like it to return the same score in this case. What similarity should I use?

Update

I have the following 6 documents:

{country:"Austria", title: "house"}
{country:"Austria", title: "Austria village"}
{country: "Germany", title: "deutch hotel" }
{country:"Austria", title: ""}
{country: "USA", title: "Usa hotel" }
{country: "USA", title: "Usa another hotel" }

When I execute a match query against country:

{
   query: {match: {country: "Austria"}}
}

I receive the following results:

[ {
  "_index" : "elasticdemo_docs",
  "_type" : "doc",
  "_id" : "1",
  "_score" : 1.0, "_source" : {country:"Austria", title: "Austria village"}
}, {
  "_index" : "elasticdemo_docs",
  "_type" : "doc",
  "_id" : "2",
  "_score" : 0.30685282, "_source" : {country:"Austria", title: "house"}
}, {
  "_index" : "elasticdemo_docs",
  "_type" : "doc",
  "_id" : "3",
  "_score" : 0.30685282, "_source" : {country:"Austria", title: ""}
} ]

I'd like to receive the same _score for all 3 documents, because they all have Austria as a country. What similarity should I use?

Altar answered 25/2, 2014 at 14:13 Comment(6)
What score value are you returning? Percentage, etc. - Fork
Is there a reason why you are using a query instead of a filter? Filters won't affect scoring. - Nutcracker
It's the default score calculated by Lucene. I need to use a query because I'm using fuzzy search. - Altar
Could you give 2-3 sample documents and a query like you write it? - Flybynight
I'm trying to generate an example, because with only 2-3 documents: {country:"Austria", title: "Austria village"} {country:"Austria", title: "house"} {country:"Austria", title: ""} matching against country returns the same score. I believe it's connected with the document structure and the number of documents. - Altar
I've updated the original question with an example. - Altar

It seems I found the problem. It's connected with the search type, as described in: http://www.elasticsearch.org/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch/
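
To illustrate why identical matches can get different scores, here is a minimal Python sketch. It assumes the classic Lucene TF-IDF similarity, where idf = 1 + ln(numDocs / (docFreq + 1)), and a single-term match on a one-term field, in which case the score reduces to the idf value. The shard layout is hypothetical (the question does not say how the documents were distributed), and the names classic_idf and shards are made up for the illustration.

```python
import math


def classic_idf(num_docs: int, doc_freq: int) -> float:
    """Classic Lucene TF-IDF idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical shard layout (an assumption, not taken from the question).
# Each entry is (documents on the shard, documents there with country Austria).
# The remaining documents are assumed to live on other shards.
shards = {"shard_a": (2, 1), "shard_b": (1, 1), "shard_c": (1, 1)}

# query_then_fetch: every shard scores with its own local statistics.
local_scores = {name: classic_idf(n, df) for name, (n, df) in shards.items()}

# dfs_query_then_fetch: statistics are gathered from all shards first.
# The question has 6 documents in total, 3 of them with country Austria.
global_score = classic_idf(6, 3)

print(local_scores)
print(global_score)
```

Under these assumptions the local scores come out as 1.0 and roughly 0.30685, the same two values seen in the question, while the global statistics give one shared score for all three documents.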

After switching to the dfs_query_then_fetch search type, I got the expected results.
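
For reference, a minimal Python sketch of how such a request could be put together. The search type is passed as the search_type URL parameter of the _search endpoint; the helper build_search_request is a hypothetical name, and the sketch only builds the path and body without sending anything, so host, port and HTTP client are left to you.

```python
import json
from typing import Tuple
from urllib.parse import urlencode


def build_search_request(index: str, country: str) -> Tuple[str, str]:
    """Return (path, JSON body) for a match query on the country field.

    Hypothetical helper for illustration; it does not contact a cluster.
    """
    # Ask Elasticsearch to collect term statistics from all shards first.
    params = urlencode({"search_type": "dfs_query_then_fetch"})
    path = f"/{index}/_search?{params}"
    # Same match query as in the question.
    body = json.dumps({"query": {"match": {"country": country}}})
    return path, body


path, body = build_search_request("elasticdemo_docs", "Austria")
print(path)
print(body)
```

The extra round trip for gathering global statistics costs some performance, so it is mostly useful on small indices like this one, where per-shard statistics differ a lot.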

Altar answered 25/2, 2014 at 16:53 Comment(1)
+1, Thanks for following through and sharing what you've found.Dextrogyrate
