How to compute cosine similarity using two matrices

I have two matrices, A (dimensions M x N) and B (N x P). In fact, they are collections of vectors - row vectors in A, column vectors in B. I want to get cosine similarity scores for every pair a and b, where a is a vector (row) from matrix A and b is a vector (column) from matrix B.

I have started by multiplying the matrices, which results in matrix C (dimensions M x P).

C = A*B

However, to obtain cosine similarity scores, I need to divide each value C(i,j) by the product of the norms of the two corresponding vectors, row i of A and column j of B. Could you suggest the easiest way to do this in MATLAB?

Unspoiled answered 15/1, 2013 at 14:50 Comment(2)
How about octave.sourceforge.net/statistics/function/pdist.html? (Unnecessarily)
I used octave.sourceforge.io/statistics/function/pdist2.html to solve the same problem. (Epigraphy)

The simplest solution is to compute the norms first, squaring element-wise and summing along the appropriate dimension:

normA = sqrt(sum(A .^ 2, 2));
normB = sqrt(sum(B .^ 2, 1));

normA and normB are now a column vector and row vector, respectively. To divide corresponding elements in A * B by normA and normB, use bsxfun like so:

C = bsxfun(@rdivide, bsxfun(@rdivide, A * B, normA), normB);
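For readers working in Python rather than MATLAB, the same norm-and-divide approach can be sketched with NumPy broadcasting. This is a minimal sketch, not part of the original answer: the function name `cosine_similarity_matrix` and the sample matrices are made up for illustration, and it assumes no row of A or column of B is all zeros.

```python
import numpy as np

def cosine_similarity_matrix(A, B):
    """Cosine similarity between each row of A (M x N) and each column of B (N x P)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    norm_a = np.sqrt((A ** 2).sum(axis=1, keepdims=True))  # M x 1 column of row norms
    norm_b = np.sqrt((B ** 2).sum(axis=0, keepdims=True))  # 1 x P row of column norms
    # Broadcasting divides C[i, j] by norm_a[i] * norm_b[j]; assumes no zero vectors.
    return (A @ B) / norm_a / norm_b

# Hypothetical sample data: 2 row vectors in A, 3 column vectors in B.
A = np.array([[1.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0, 1.0], [0.0, 2.0, 1.0]])
C = cosine_similarity_matrix(A, B)
```

The `keepdims=True` arguments play the role of bsxfun here: they keep the norms as an M x 1 column and a 1 x P row so that the two divisions broadcast across the M x P product.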
Lectionary answered 15/1, 2013 at 14:59 Comment(0)

In Python, you can compute this easily with SciPy.

from scipy.spatial import distance
# cdist compares rows with rows, so transpose B (N x P) to turn its column vectors into rows
cosine_sim = 1 - distance.cdist(A, B.T, 'cosine')

cdist takes two 2-D arrays whose rows are the vectors to compare and returns the pairwise cosine distances as a NumPy array; subtracting from 1 converts distance to similarity.

See the documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html
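A quick way to check the orientation cdist expects is to compare it against the definition on small inputs. The sketch below is illustrative only: the sample matrices are made up, and B is transposed on the assumption that, as in the question, its vectors are stored as columns, whereas cdist compares rows with rows.

```python
import numpy as np
from scipy.spatial import distance

A = np.array([[1.0, 0.0], [1.0, 1.0]])            # M x N, vectors in rows
B = np.array([[1.0, 0.0, 1.0], [0.0, 2.0, 1.0]])  # N x P, vectors in columns

# cdist returns cosine *distance*, so subtract from 1 to get similarity.
cosine_sim = 1 - distance.cdist(A, B.T, 'cosine')

# Same quantity computed directly from the definition, for comparison.
norm_a = np.linalg.norm(A, axis=1, keepdims=True)  # M x 1
norm_b = np.linalg.norm(B, axis=0, keepdims=True)  # 1 x P
manual = (A @ B) / norm_a / norm_b
```

If the two results disagree in shape, the most likely cause is a missing or extra transpose on one of the inputs.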

Prototherian answered 4/6, 2022 at 20:45 Comment(0)
