How to calculate the Cosine similarity between two tensors?
I have two normalized tensors and I need to calculate the cosine similarity between these tensors. How do I do it with TensorFlow?

cosine(normalize_a,normalize_b)

    a = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_a")
    b = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_b")
    normalize_a = tf.nn.l2_normalize(a, 0)
    normalize_b = tf.nn.l2_normalize(b, 0)
Thorwald answered 11/4, 2017 at 23:18 Comment(0)
This will do the job:

# TF1-style graph/session API
import tensorflow as tf

a = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_a")
b = tf.placeholder(tf.float32, shape=[None], name="input_placeholder_b")
normalize_a = tf.nn.l2_normalize(a, 0)
normalize_b = tf.nn.l2_normalize(b, 0)
cos_similarity = tf.reduce_sum(tf.multiply(normalize_a, normalize_b))
sess = tf.Session()
cos_sim = sess.run(cos_similarity, feed_dict={a: [1, 2, 3], b: [2, 4, 6]})
print(cos_sim)

This prints 0.99999988
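As a sanity check, the same formula in plain NumPy (a sketch, independent of TensorFlow):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # the same inputs as the feed_dict above

# cosine similarity = dot product of the L2-normalized vectors
cos_sim = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))
print(cos_sim)  # ~1.0, since b is a scalar multiple of a
```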

Emulation answered 12/4, 2017 at 1:24 Comment(4)
Thank you a lot for your answer. Is the cosine similarity formula getting simplified by normalizing the inputs first? Your formula seems to have fewer terms than the one from Wikipedia: en.wikipedia.org/wiki/Cosine_similarity – Thorwald
If you don't normalize first, then after computing the inner product a·b you have to divide by the product of the norms of a and b. If you normalize in advance, that division is already done, because normalize_a = a/||a|| (and similarly for b). – Emulation
Why not matmul? – Selmaselman
tf.matmul() is matrix multiplication; tf.multiply() is element-wise multiplication. – Turpin
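The equivalence described in the comment above can be checked directly in NumPy (a sketch with made-up values):

```python
import numpy as np

a = np.array([1.0, 3.0, -2.0])
b = np.array([0.5, -1.0, 4.0])

# unnormalized inputs: divide the inner product by the product of the norms
sim_raw = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# normalized inputs: the division is already baked into a/||a|| and b/||b||
sim_norm = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))

print(np.isclose(sim_raw, sim_norm))  # True
```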
Times change. With the latest TF API, this can be computed by calling tf.losses.cosine_distance.

Example:

import tensorflow as tf
import numpy as np

x = tf.constant(np.random.uniform(-1, 1, 10))
y = tf.constant(np.random.uniform(-1, 1, 10))
s = tf.losses.cosine_distance(tf.nn.l2_normalize(x, 0), tf.nn.l2_normalize(y, 0), dim=0)
print(tf.Session().run(s))

Of course, 1 - s is the cosine similarity!

Alvera answered 5/9, 2017 at 14:35 Comment(6)
Why is 1-s the cosine similarity? – Selmaselman
Because s is the cosine distance, not the similarity. – Alvera
The 1-s isn't needed. The function is called distance but returns similarity, I think because it lives in tf.losses. Have a look at the code (line 274), i might be wrong: losses = 1 - math_ops.reduce_sum(radial_diffs, axis=(dim,), keep_dims=True) github.com/tensorflow/tensorflow/blob/r1.4/tensorflow/python/… – Elanaeland
@RajarsheeMitra Can this be done for an entire matrix, i.e. the cosine distance between a vector v and all the rows of a matrix, assuming each row is a vector of the same dimension as v? – Benavidez
@Benavidez Yes. – Alvera
@Elanaeland 1-s is needed: math_ops.reduce_sum(radial_diffs, axis=(dim,), keep_dims=True) is the cosine similarity. – Chavira
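To keep distance and similarity straight, a small NumPy sketch (values chosen for illustration):

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])  # orthogonal to x

# similarity of orthogonal vectors is 0; cosine distance is 1 - similarity
sim = np.dot(x / np.linalg.norm(x), y / np.linalg.norm(y))
dist = 1.0 - sim
print(sim, dist)  # 0.0 1.0
```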
Cosine similarity between a and b:

tf.keras.losses.CosineSimilarity()(a, b)
Noblesse answered 27/10, 2022 at 14:25 Comment(1)
Tip: This converges to -1.0 since it's meant to be used as a loss function. If you want something that converges to 1.0, use tf.keras.metrics.CosineSimilarity instead. – Anisometropia
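A sketch of the sign convention in plain NumPy (values assumed; the Keras loss negates the similarity so that minimizing the loss maximizes the similarity):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # parallel to a, so the similarity is +1

cos_sim = np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b))
# what a Keras-style cosine similarity *loss* would report for these inputs
loss = -cos_sim
print(loss)  # ~-1.0
```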
You can normalize your vector or matrix like this:

# states: [batch_size, hidden_num]
states_norm = tf.nn.l2_normalize(states, dim=1)
# embedding: [batch_size, embedding_dims]
embedding_norm = tf.nn.l2_normalize(embedding, dim=1)
# assert hidden_num == embedding_dims
# after the matmul: [batch_size, batch_size] pairwise similarities
user_app_scores = tf.matmul(states_norm, embedding_norm, transpose_b=True)
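The same row-normalize-then-matmul trick in plain NumPy (a sketch; the shapes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
states = rng.standard_normal((4, 5))     # [batch_size, hidden_num]
embedding = rng.standard_normal((3, 5))  # [other_batch, embedding_dims]

# L2-normalize each row, then one matmul yields every pairwise cosine similarity
states_norm = states / np.linalg.norm(states, axis=1, keepdims=True)
embedding_norm = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
user_app_scores = states_norm @ embedding_norm.T  # shape (4, 3), entries in [-1, 1]
print(user_app_scores.shape)  # (4, 3)
```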
Lunseth answered 9/3, 2018 at 11:40 Comment(0)
