I am using Python. I have a MongoDB collection in which all documents have the following format:
{"_id":ObjectId("53590a43dc17421e9db46a31"),
"latlng": {"type": "Polygon", "coordinates": [[[....],[....],[....],[....],[.....]]]},
"self":{"school":2,"home":3,"hospital":6}
}
Here the "self" field lists the venue types inside the polygon and the number of venues of each type. Different documents have different "self" fields, such as {"KFC":1,"building":2,"home":6} and {"shopping mall":1, "gas station":2}.
Now I need to calculate the cosine similarity between the "self" fields of two documents. Previously, all my documents were saved as dictionaries in a pickle file, and I used the following code to calculate the similarity:
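import scipy.spatial.distance
from sklearn.feature_extraction import DictVectorizer

# Vectorize both lists together so they share one feature space
vec = DictVectorizer()
total_arrays = vec.fit_transform(data + citymap).toarray()
vector_matrix = total_arrays[:len(data)]
citymap_base_matrix = total_arrays[len(data):]

def cos_cdist(matrix, vector):
    # Cosine distance (1 - cosine similarity) from each row of matrix to vector
    v = vector.reshape(1, -1)
    return scipy.spatial.distance.cdist(matrix, v, 'cosine').reshape(-1)

for vector in vector_matrix:
    distance_result = cos_cdist(citymap_base_matrix, vector)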
Here, data and citymap are lists of dictionaries like [{"KFC":1,"building":2,"home":6},{"school":2,"home":3,"hospital":6},{"shopping mall":1, "gas station":2}]
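To make the quantity concrete, here is a minimal pure-Python sketch of what the code above computes for a single pair of dictionaries. The function name cosine_distance is only illustrative and is not part of my original code. Like cdist with 'cosine', it returns the cosine distance, that is, 1 minus the cosine similarity.

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two sparse count dicts."""
    # Only keys present in both dicts contribute to the dot product.
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for an all-zero vector")
    return 1.0 - dot / (norm_a * norm_b)

d1 = {"KFC": 1, "building": 2, "home": 6}
d2 = {"school": 2, "home": 3, "hospital": 6}
d3 = {"shopping mall": 1, "gas station": 2}

print(cosine_distance(d1, d2))  # the two dicts share only "home"
print(cosine_distance(d1, d3))  # no shared keys, so the distance is 1.0
```

Because it works on the dictionaries directly, this needs no shared feature matrix, but it would probably be slower than the vectorized version on large collections.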
But now I am using MongoDB, and I want to know whether there is a MongoDB method to calculate the similarity in a more straightforward way. Any ideas?