How to calculate the Cosine similarity between two tensors?
I have two normalized tensors and I need to calculate the cosine similarity between these tensors. How do I do it with TensorFlow?

cosine(normalize_a,normalize_b)

    a = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_a")
    b = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_b")
    normalize_a = tf.nn.l2_normalize(a, 0)
    normalize_b = tf.nn.l2_normalize(b, 0)
Thorwald answered 11/4, 2017 at 23:18 Comment(0)
This will do the job:

# TF1-style graph/session API
import tensorflow as tf

a = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_a")
b = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_b")
normalize_a = tf.nn.l2_normalize(a, 0)
normalize_b = tf.nn.l2_normalize(b, 0)
cos_similarity = tf.reduce_sum(tf.multiply(normalize_a, normalize_b))
sess = tf.Session()
cos_sim = sess.run(cos_similarity, feed_dict={a: [1, 2, 3], b: [2, 4, 6]})
print(cos_sim)

This prints 0.99999988
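As a sanity check, the same formula in plain NumPy (a sketch, independent of TensorFlow):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # the same inputs as the feed_dict above

# cosine similarity = dot product of the L2-normalized vectors
cos_sim = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))
print(cos_sim)  # ~1.0, since b is a scalar multiple of a
```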

Emulation answered 12/4, 2017 at 1:24 Comment(4)
Thank you a lot for your answer. Is the cosine similarity formula getting simplified by normalizing the inputs first? Your formula seems to have fewer terms than the one from Wikipedia: en.wikipedia.org/wiki/Cosine_similarity – Thorwald
If you don't normalize first, then after computing the inner product a·b you have to divide by the product of the norms of a and b. If you normalize in advance, that division is already done, because normalize_a = a/||a|| (and similarly for b). – Emulation
Why not matmul? – Selmaselman
tf.matmul() is matrix multiplication; tf.multiply() is element-wise multiplication. – Turpin
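The equivalence described in the comment above can be checked directly in NumPy (a sketch with made-up values):

```python
import numpy as np

a = np.array([1.0, 3.0, -2.0])
b = np.array([0.5, -1.0, 4.0])

# unnormalized inputs: divide the inner product by the product of the norms
sim_raw = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# normalized inputs: the division is already baked into a/||a|| and b/||b||
sim_norm = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))

print(np.isclose(sim_raw, sim_norm))  # True
```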
Times change. With the latest TF API, this can be computed by calling tf.losses.cosine_distance.

Example:

import tensorflow as tf
import numpy as np

x = tf.constant(np.random.uniform(-1, 1, 10))
y = tf.constant(np.random.uniform(-1, 1, 10))
s = tf.losses.cosine_distance(tf.nn.l2_normalize(x, 0), tf.nn.l2_normalize(y, 0), dim=0)
print(tf.Session().run(s))

Of course, 1 - s is the cosine similarity!

Alvera answered 5/9, 2017 at 14:35 Comment(6)
Why is 1-s the cosine similarity? – Selmaselman
Because s is the cosine distance, not the similarity. – Alvera
The 1-s isn't needed. The function is called distance but returns similarity, I think because it lives in tf.losses. Have a look at the code (line 274), i might be wrong: losses = 1 - math_ops.reduce_sum(radial_diffs, axis=(dim,), keep_dims=True) github.com/tensorflow/tensorflow/blob/r1.4/tensorflow/python/… – Elanaeland
@RajarsheeMitra Can this be done for an entire matrix, i.e. the cosine distance between a vector v and all the rows of a matrix, assuming each row is a vector of the same dimension as v? – Benavidez
@Benavidez Yes. – Alvera
@Elanaeland 1-s is needed: math_ops.reduce_sum(radial_diffs, axis=(dim,), keep_dims=True) is the cosine similarity. – Chavira
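To keep distance and similarity straight, a small NumPy sketch (values chosen for illustration):

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])  # orthogonal to x

# similarity of orthogonal vectors is 0; cosine distance is 1 - similarity
sim = np.dot(x / np.linalg.norm(x), y / np.linalg.norm(y))
dist = 1.0 - sim
print(sim, dist)  # 0.0 1.0
```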
Cosine similarity between a and b:

tf.keras.losses.CosineSimilarity()(a, b)
Noblesse answered 27/10, 2022 at 14:25 Comment(1)
Tip: This converges to -1.0 since it's meant to be used as a loss function. If you want something that converges to 1.0, use tf.keras.metrics.CosineSimilarity instead. – Anisometropia
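A sketch of the sign convention in plain NumPy (values assumed; the Keras loss negates the similarity so that minimizing the loss maximizes the similarity):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # parallel to a, so the similarity is +1

cos_sim = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))
# what a Keras-style cosine similarity *loss* would report for these inputs
loss = -cos_sim
print(loss)  # ~-1.0
```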
You can normalize your vector or matrix like this:

# states: [batch_size, hidden_num]
states_norm = tf.nn.l2_normalize(states, dim=1)
# embedding: [batch_size, embedding_dims]
embedding_norm = tf.nn.l2_normalize(embedding, dim=1)
# assert hidden_num == embedding_dims
# after the matmul: [batch_size, batch_size] pairwise similarities
user_app_scores = tf.matmul(states_norm, embedding_norm, transpose_b=True)
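The same row-normalize-then-matmul trick in plain NumPy (a sketch; the shapes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
states = rng.standard_normal((4, 5))     # [batch_size, hidden_num]
embedding = rng.standard_normal((3, 5))  # [other_batch, embedding_dims]

# L2-normalize each row, then one matmul yields every pairwise cosine similarity
states_norm = states / np.linalg.norm(states, axis=1, keepdims=True)
embedding_norm = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
user_app_scores = states_norm @ embedding_norm.T  # shape (4, 3), entries in [-1, 1]
print(user_app_scores.shape)  # (4, 3)
```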
Lunseth answered 9/3, 2018 at 11:40 Comment(0)
