cosine-similarity Questions

1

Solved

I noticed that both scipy and sklearn have cosine similarity/cosine distance functions. I wanted to test the speed of each on pairs of vectors: setup1 = "import numpy as np; arrs1 = [np.ran...
Roxannaroxanne asked 28/4, 2020 at 21:34

1

I have a text column in df1 and a text column in df2. The length of df2 will differ from that of df1. I want to calculate cosine similarity for every entry in df1[text] against every ent...
Wateriness asked 31/12, 2019 at 10:15

2

Solved

I would like to fine-tune BERT to calculate semantic similarity between sentences. I searched a lot of websites, but found almost no downstream material about this. I only found the STS benchmark. I won...
Adrianeadrianna asked 4/12, 2019 at 9:18

1

Solved

It looks like the scipy.spatial.distance.cdist cosine distance, 1 - u*v/(||u|| ||v||), is different from sklearn.metrics.pairwise.cosine_similarity, which is link t...
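
A minimal sketch of the relationship this excerpt touches on, assuming the formula quoted in it: 1 - u*v/(||u|| ||v||) is a distance, so it equals one minus the cosine similarity. The helper names `cosine_similarity` and `cosine_distance` are hypothetical pure-Python stand-ins written for illustration; they do not call scipy or sklearn.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their norms.

    Assumes neither vector is all zeros (the norm product would be 0).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """Cosine distance as in the quoted formula: 1 - similarity."""
    return 1.0 - cosine_similarity(u, v)

# Illustrative vectors (not taken from the question).
u = [1.0, 0.0]
v = [1.0, 1.0]
sim = cosine_similarity(u, v)
dist = cosine_distance(u, v)
```

Under this reading the two quantities always sum to 1, so a value near 0 means "very similar" for the distance and "unrelated" for the similarity.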

2

Solved

I am interested in calculating similarity between vectors, however this similarity has to be a number between 0 and 1. There are many questions concerning tf-idf and cosine similarity, all indicati...
Sensory asked 26/5, 2019 at 19:53

2

Solved

I have a table of images with sentence captions. Given a new sentence I want to find the images that best match it based on how close the new sentence is to the stored old sentences. I know that I...

4

Solved

My goal is to input 3 queries and find out which query is most similar to a set of 5 documents. So far I have calculated the tf-idf of the documents doing the following: from sklearn.feature_extr...
Isodynamic asked 14/4, 2019 at 16:6

1

Solved

I want to calculate the similarity between lists of words, for example : import math,re from collections import Counter test = ['address','ip'] list_a = ['identifiant', 'ip', 'address', 'fixe', '...
Rupee asked 28/3, 2019 at 13:35
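
One possible way to compare two word lists, sketched with `math` and `collections.Counter` as imported in the excerpt: treat each list as a bag-of-words count vector and take the cosine of the two. The function name `counter_cosine` and the `candidate` list are hypothetical; the excerpt's own `list_a` is truncated and is not reproduced here.

```python
import math
from collections import Counter

def counter_cosine(words_a, words_b):
    """Cosine similarity between the word-count vectors of two lists."""
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    # Only words present in both lists contribute to the dot product.
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty lists
    return dot / (norm_a * norm_b)

query = ["address", "ip"]
candidate = ["ip", "address", "static"]  # hypothetical example list
score = counter_cosine(query, candidate)
```

Word order is ignored by this approach, and returning 0.0 for an empty list is a design choice rather than part of the definition.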

1

According to several posts I found on Stack Overflow (for instance, "Why does word2Vec use cosine similarity?"), it's common practice to calculate the cosine similarity between two word vectors af...
Convoke asked 28/1, 2019 at 22:10

1

Solved

I am working on my first major data science project. I am attempting to match names between a large list of data from one source, to a cleansed dictionary in another. I am using this string matchin...
Kristiekristien asked 18/12, 2018 at 6:14

2

Solved

Cosine similarity between two equally-sized vectors (of reals) is defined as the dot product divided by the product of the norms. To represent vectors, I have a large table of float arrays, e.g. C...
Maurinemaurise asked 28/6, 2017 at 1:13

0

I have a pre-made database full of 512-dimensional vectors and want to implement an efficient search algorithm over them. Research Cosine similarity: The best algorithm in this case would con...
Bilow asked 28/10, 2018 at 10:56

2

Solved

Basically, given some vector v, I want to get another random vector w with some cosine similarity between v and w. Is there any way we can get this in Python? Example: for simplicity I will have 2D...
Heavyset asked 21/10, 2018 at 15:5
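
A sketch of one standard construction for this kind of problem, not necessarily the accepted answer: normalize v, draw a random direction, remove its component along v (a Gram-Schmidt step), and mix the two unit vectors so the cosine with v equals the target. The function name `random_vector_with_cosine` and its parameters are hypothetical.

```python
import math
import random

def random_vector_with_cosine(v, target, rng=random):
    """Return a random unit vector w with cos(v, w) == target.

    Assumes v is non-zero, has at least 2 components,
    and that -1 <= target <= 1.
    """
    if len(v) < 2:
        raise ValueError("need at least 2 dimensions")
    if not -1.0 <= target <= 1.0:
        raise ValueError("target must lie in [-1, 1]")
    norm_v = math.sqrt(sum(x * x for x in v))
    v_hat = [x / norm_v for x in v]
    while True:
        # Random direction, then subtract its projection onto v_hat.
        r = [rng.gauss(0.0, 1.0) for _ in v]
        proj = sum(a * b for a, b in zip(r, v_hat))
        perp = [a - proj * b for a, b in zip(r, v_hat)]
        norm_perp = math.sqrt(sum(x * x for x in perp))
        if norm_perp > 1e-12:  # retry if r was (nearly) parallel to v
            break
    u_hat = [x / norm_perp for x in perp]
    s = math.sqrt(1.0 - target * target)
    # w = target * v_hat + sqrt(1 - target^2) * u_hat has unit length.
    return [target * a + s * b for a, b in zip(v_hat, u_hat)]

w = random_vector_with_cosine([3.0, 4.0], 0.8, random.Random(0))
```

The result has unit length; scale it by any positive factor if a different magnitude is needed, since scaling does not change the cosine.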

2

Solved

I was reading the paper "Improving Distributional Similarity with Lessons Learned from Word Embeddings" by Levy et al., and while discussing their hyperparameters, they say: Vector Normalizatio...
Dichotomous asked 11/7, 2018 at 17:10

1

I gather text documents (in Node.js) where one document i is represented as a list of words. What is an efficient way to compute the similarity between these documents, taking into account that new...
Superdreadnought asked 21/12, 2012 at 8:17

3

Solved

I have a small problem performing t-SNE on my dataset using cosine similarity. I have calculated the cosine similarity of all of my vectors, so I have a square matrix which contains my cosine sim...
Wilkerson asked 11/4, 2016 at 9:58

2

Solved

I want to calculate cosine similarity between different rows of a matrix in MATLAB. I wrote the following code in MATLAB: for i = 1:n_row for j = i:n_row S2(i,j) = dot(S1(i,:), S1(j,:)) / (norm_...
Brewmaster asked 4/1, 2018 at 18:36
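
The double loop in this excerpt recomputes norms for every pair. A common alternative, sketched here in plain Python rather than MATLAB, is to normalize each row once so that every pairwise similarity reduces to a dot product. The function name `pairwise_row_cosine` is hypothetical, and rows are assumed to be non-zero.

```python
import math

def pairwise_row_cosine(rows):
    """Cosine similarity between every pair of rows.

    Each row is scaled to unit length once; the similarity of rows
    i and j is then just the dot product of the scaled rows.
    """
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        normalized.append([x / norm for x in row])
    n = len(normalized)
    sims = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # the matrix is symmetric
            s = sum(a * b for a, b in zip(normalized[i], normalized[j]))
            sims[i][j] = s
            sims[j][i] = s
    return sims

# Illustrative matrix (not taken from the question).
sims = pairwise_row_cosine([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

In a matrix language the same idea becomes a single product of the row-normalized matrix with its transpose.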

1

Solved

I have two files, A and B. A has 400,000 lines, each having 50 float values. B has 40,000 lines, each having 50 float values. For every line in B, I need to find corresponding lines in A which have >90% ...

1

Solved

I have to compute the cosine distance between each pair of rows, but I have no idea how to do it elegantly using the Spark DataFrame API. The idea is to compute similarities for each row (item) and take the top 10 ...
Polytonality asked 10/10, 2017 at 9:53

1

I'm new to Apache Spark and want to find similar text in a collection of texts. I have tried the following: I have 2 RDDs. The 1st RDD contains incomplete text as follows - [0,541 Suite 204, Redwood C...
Dorella asked 18/9, 2015 at 6:28

3

The code below causes my system to run out of memory before it completes. Can you suggest a more efficient means of computing the cosine similarity on a large matrix, such as the one below? I wo...
Unrighteous asked 1/12, 2016 at 0:22

2

Solved

I have defined two matrices as follows: from scipy import linalg, mat, dot a = mat([-0.711,0.730]) b = mat([-1.099,0.124]) Now, I want to calculate the cosine similarity of these two matrice...
Electrosurgery asked 24/2, 2014 at 6:42

1

For a Recommender System, I need to compute the cosine similarity between all the columns of a whole Spark DataFrame. In Pandas I used to do this: import sklearn.metrics as metrics import pandas ...

2

Solved

I'm trying to compute the tf-idf vector cosine similarity between two columns in a Pandas dataframe. One column contains a search query, the other contains a product title. The cosine similarity va...
Efficacious asked 23/3, 2017 at 0:37

1

Solved

Suppose you have a table in a database constructed as follows: create table data (v int, base int, w_td float); insert into data values (99,1,4); insert into data values (99,2,3); insert into data...
Pique asked 18/2, 2017 at 3:9
