It looks like the cosine distance computed by scipy.spatial.distance.cdist:
1 - u·v / (||u|| ||v||)
is different from sklearn.metrics.pairwise.cosine_similarity, which is
u·v / (||u|| ||v||)
Does anybody know the reason for the different definitions?
Good question. Yes, these are two different things, but they are connected by the following equation:
cosine_distance = 1 - cosine_similarity
Why?
Usually, people use cosine similarity as a similarity measure between vectors. The cosine distance can then be defined as 1 - cosine_similarity.
The intuition behind this is that if two vectors point in exactly the same direction, the similarity is 1 (angle = 0) and thus the distance is 0 (1 - 1 = 0).
Similarly, you can work out the cosine distance for the rest of the similarity range.
Cosine similarity range: -1 means exactly opposite (distance 2), 1 means exactly the same direction (distance 0), and 0 indicates orthogonality (distance 1).
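To check the relationship numerically, here is a minimal sketch, assuming NumPy, SciPy and scikit-learn are installed. The sample vectors in X are made up for illustration and are not from the question.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical sample vectors, one per row (illustrative only).
X = np.array([
    [1.0, 0.0],   # reference direction
    [0.0, 1.0],   # orthogonal to row 0
    [1.0, 1.0],   # 45 degrees from row 0
    [-1.0, 0.0],  # exactly opposite to row 0
])

# SciPy returns pairwise cosine *distances*: 1 - u.v / (||u|| ||v||)
dist = cdist(X, X, metric="cosine")

# scikit-learn returns pairwise cosine *similarities*: u.v / (||u|| ||v||)
sim = cosine_similarity(X)

# The two matrices should agree up to floating-point error.
print(np.allclose(dist, 1.0 - sim))
```

So if you need one and only have the other, subtracting from 1 converts between them in either direction.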
cosine_similarity is under sklearn.metrics while not being a metric –
Mali