cosine-similarity Questions
1
Solved
I noticed that both scipy and sklearn have cosine similarity/cosine distance functions. I wanted to test the speed of each on pairs of vectors:
setup1 = "import numpy as np; arrs1 = [np.ran...
Roxannaroxanne asked 28/4, 2020 at 21:34
1
I have a text column in df1 and a text column in df2. The length of df2 will differ from the length of df1.
I want to calculate cosine similarity for every entry in df1[text] against every ent...
Wateriness asked 31/12, 2019 at 10:15
2
Solved
I would like to fine-tune BERT to calculate semantic similarity between sentences.
I searched a lot of websites, but found almost nothing about this downstream task.
I only found the STS benchmark.
I won...
Adrianeadrianna asked 4/12, 2019 at 9:18
1
Solved
It looks like the cosine distance in scipy.spatial.distance.cdist:
link to cos distance 1
1 - u*v/(||u||||v||)
is different from
sklearn.metrics.pairwise.cosine_similarity which is
link t...
Mali asked 14/10, 2019 at 16:56
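For reference, the formula quoted in this excerpt is a distance: it equals 1 minus the cosine similarity, so one value can be converted into the other. A minimal pure-Python sketch of that relation follows; the function name `cosine_distance` and the sample vectors are illustrative assumptions, not the SciPy or scikit-learn implementations.

```python
import math

def cosine_distance(u, v):
    """Hypothetical helper: 1 - u.v / (||u|| * ||v||), as quoted in the excerpt."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Illustrative vectors (assumed, not from the question).
u, v = [1.0, 0.0], [1.0, 1.0]
distance = cosine_distance(u, v)
similarity = 1.0 - distance  # converting distance back to similarity
```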
2
Solved
I am interested in calculating similarity between vectors, however this similarity has to be a number between 0 and 1. There are many questions concerning tf-idf and cosine similarity, all indicati...
Sensory asked 26/5, 2019 at 19:53
2
Solved
I have a table of images with sentence captions. Given a new sentence I want to find the images that best match it based on how close the new sentence is to the stored old sentences.
I know that I...
Ryle asked 5/1, 2016 at 3:29
4
Solved
My goal is to input 3 queries and find out which query is most similar to a set of 5 documents.
So far I have calculated the tf-idf of the documents doing the following:
from sklearn.feature_extr...
Isodynamic asked 14/4, 2019 at 16:6
1
Solved
I want to calculate the similarity between lists of words, for example :
import math,re
from collections import Counter
test = ['address','ip']
list_a = ['identifiant', 'ip', 'address', 'fixe', '...
Rupee asked 28/3, 2019 at 13:35
1
According to several posts I found on Stack Overflow (for instance, Why does word2Vec use cosine similarity?), it's common practice to calculate the cosine similarity between two word vectors af...
Convoke asked 28/1, 2019 at 22:10
1
Solved
I am working on my first major data science project. I am attempting to match names between a large list of data from one source, to a cleansed dictionary in another. I am using this string matchin...
Kristiekristien asked 18/12, 2018 at 6:14
2
Solved
Cosine similarity between two equally-sized vectors (of reals) is defined as the dot product divided by the product of the norms.
To represent vectors, I have a large table of float arrays, e.g. C...
Maurinemaurise asked 28/6, 2017 at 1:13
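The definition in this excerpt (dot product divided by the product of the norms) can be sketched in plain Python. This is only an illustration of the formula; the function name and the error handling for mismatched lengths and zero vectors are assumptions, not part of the original question.

```python
import math

def cosine_similarity(u, v):
    """Hypothetical helper: dot(u, v) / (||u|| * ||v||) for equally-sized vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # The ratio is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

Parallel vectors give a value close to 1, orthogonal vectors a value close to 0, and opposite vectors a value close to -1.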
0
I have a pre-made database full of 512-dimensional vectors and want to implement an efficient search algorithm over them.
Research
Cosine similarity:
The best algorithm in this case would con...
Bilow asked 28/10, 2018 at 10:56
2
Solved
Basically given some vector v, I want to get another random vector w with some cosine similarity between v and w. Is there any way we can get this in python?
Example: for simplicity I will have 2D...
Heavyset asked 21/10, 2018 at 15:5
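For the 2D case mentioned in this excerpt, one possible approach is to rotate v by an angle whose cosine is the target similarity and then rescale by a random positive factor, since scaling does not change the angle. The sketch below assumes that approach; the function name, the `rng` parameter and the scale range are hypothetical choices, and higher dimensions would need a different construction.

```python
import math
import random

def random_vector_with_cosine(v, target, rng=random):
    """Hypothetical 2D helper: return w with cosine similarity `target` to v."""
    x, y = v
    if x == 0 and y == 0:
        raise ValueError("v must be non-zero")
    if not -1.0 <= target <= 1.0:
        raise ValueError("target must lie in [-1, 1]")
    # Rotate by +theta or -theta; both give the same cosine.
    theta = math.acos(target) * rng.choice((-1, 1))
    # A positive scale changes the length of w but not the angle to v.
    scale = rng.uniform(0.5, 2.0)
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * x - s * y), scale * (s * x + c * y))
```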
2
Solved
I was reading the paper "Improving Distributional Similarity
with Lessons Learned from Word Embeddings" by Levy et al., and while discussing their hyperparameters, they say:
Vector Normalizatio...
Dichotomous asked 11/7, 2018 at 17:10
1
I gather text documents (in Node.js), where each document i is represented as a list of words.
What is an efficient way to compute the similarity between these documents, taking into account that new...
Superdreadnought asked 21/12, 2012 at 8:17
3
Solved
I have a small problem performing TSNE on my dataset using cosine similarity.
I have calculated the cosine similarity of all of my vectors, so I have a square matrix which contains my cosine sim...
Wilkerson asked 11/4, 2016 at 9:58
2
Solved
I want to calculate the cosine similarity between different rows of a matrix in MATLAB. I wrote the following code:
for i = 1:n_row
for j = i:n_row
S2(i,j) = dot(S1(i,:), S1(j,:)) / (norm_...
Brewmaster asked 4/1, 2018 at 18:36
1
Solved
I have two files: A and B
A has 400,000 lines each having 50 float values
B has 40,000 lines having 50 float values.
For every line in B, I need to find corresponding lines in A which have >90% ...
Greasepaint asked 4/12, 2017 at 1:52
1
Solved
I have to compute the cosine distance between each pair of rows, but I have no idea how to do it elegantly using Spark API DataFrames. The idea is to compute similarities for each row (item) and take top 10 ...
Polytonality asked 10/10, 2017 at 9:53
1
I'm new to Apache Spark and want to find similar text in a collection of texts. I have tried the following myself:
I have 2 RDDs:
The 1st RDD contains incomplete text as follows:
[0,541 Suite 204, Redwood C...
Dorella asked 18/9, 2015 at 6:28
3
The code below causes my system to run out of memory before it completes.
Can you suggest a more efficient means of computing the cosine similarity on a large matrix, such as the one below?
I wo...
Unrighteous asked 1/12, 2016 at 0:22
2
Solved
I have defined two matrices as follows:
from scipy import linalg, mat, dot
a = mat([-0.711,0.730])
b = mat([-1.099,0.124])
Now, I want to calculate the cosine similarity of these two matrice...
Electrosurgery asked 24/2, 2014 at 6:42
1
For a Recommender System, I need to compute the cosine similarity between all the columns of a whole Spark DataFrame.
In Pandas I used to do this:
import sklearn.metrics as metrics
import pandas ...
Beeck asked 11/5, 2017 at 17:2
2
Solved
I'm trying to compute the tf-idf vector cosine similarity between two columns in a Pandas dataframe. One column contains a search query, the other contains a product title. The cosine similarity va...
Efficacious asked 23/3, 2017 at 0:37
1
Solved
Suppose you have a table in a database constructed as follows:
create table data (v int, base int, w_td float);
insert into data values (99,1,4);
insert into data values (99,2,3);
insert into data...
Pique asked 18/2, 2017 at 3:9