similarity Questions

1

I'm trying to scrape headlines and body text from articles on a few specific sites, similar to what Google does with Google News. The problem is that across different sites, they may have articles ...
Lemay asked 5/4, 2010 at 18:52

2

Solved

Is there a string similarity measure available in Python + SQLite, for example with the sqlite3 module? Example of use case: import sqlite3 conn = sqlite3.connect(':memory:') c = conn.cursor() c....
Kizzee asked 11/4, 2018 at 15:41

10

Solved

I'm working with a large database of businesses. I'd like to be able to compare two business names for similarity to see if they possibly might be duplicates. Below is a list of business names th...

2

I want to use something like difflib.get_close_matches but instead of the most similar strings, I would like to obtain the indexes (i.e. position in the list). The indexes of the list are more fl...
Albuminuria asked 14/6, 2018 at 15:38

6

Let's say I have 2 strings which are pretty similar. I want to find another string which is close to s1 and s2 in terms of Levenshtein distance. import Levenshtein s1 = 'aaabbbccc' s2 = 'abacbbbccde' ...
Pericarditis asked 18/3, 2021 at 8:1

3

Solved

I want to compute how similar two arbitrary sentences are to each other. For example: A mathematician found a solution to the problem. The problem was solved by a young mathematician. I c...
Rang asked 21/4, 2013 at 16:4

4

Solved

How can we measure the similarity distance between categorical data? Example: Gender: Male, Female Numerical values: [0 - 100], [200 - 300] Strings: Professionals, beginners, etc,... Thanks in...
Weis asked 21/4, 2015 at 11:46

4

Solved

I have a need to match cold leads against a database of our clients. The leads come from a third party provider in bulk (thousands of records) and sales is asking us to (in their words) "filter o...
Snappish asked 29/7, 2010 at 19:57

2

I would like to use Word2Vec to check similarity of texts. I am currently using another logic: from fuzzywuzzy import fuzz def sim(name, dataset): matches = dataset.apply(lambda row: ((fuzz.ratio...
Priebe asked 22/1, 2021 at 21:0

1

Solved

I have been using Structural Similarity Index (through tensorflow) for comparing images, however it takes too long. I was wondering if there is an alternative technique that doesn't take so much ti...
Vraisemblance asked 24/12, 2020 at 15:29

7

Solved

I'm wondering if there is a built-in function in R that can find the cosine similarity (or cosine distance) between two arrays? Currently, I implemented my own function, but I can't help but think...
Jacqualinejacquard asked 29/3, 2010 at 0:44

2

Solved

I use the pg_trgm module in PostgreSQL to calculate similarity between two strings using trigrams. In particular I use: similarity(text, text) which returns a number that indicates how...
Sacaton asked 15/1, 2016 at 16:9

2

Solved

I want to set my own custom similarity in my Solr schema.xml, but I have a few problems understanding this feature. I want to completely deactivate Solr scoring (tf, idf, coord and fieldNorm). I...
Lucilius asked 6/12, 2013 at 16:21

4

Solved

I am trying to create a cosine similarity function and then display the results in an HTML element. I have written the following: function cosinesim(A,B){ var dotproduct=0; var mA=0; var ...
Sucrase asked 16/7, 2018 at 12:49

3

Solved

I know it's possible to return how similar two strings are by using the following function: from difflib import SequenceMatcher def similar(a, b): output=SequenceMatcher(None, a, b).ratio() retu...
Guessrope asked 14/3, 2016 at 6:32

3

This is an NLP problem and I was wondering how I should proceed. How difficult is the problem? Could I replace the word with synonyms and check that the grammar is correct?

2

I am trying to understand how similarity in spaCy works. I tried using Melania Trump's speech and Michelle Obama's speech to see how similar they were. This is my code. import spacy nlp = spacy...
Xenogenesis asked 23/11, 2018 at 22:35

2

Solved

I am interested in calculating similarity between vectors, however this similarity has to be a number between 0 and 1. There are many questions concerning tf-idf and cosine similarity, all indicati...
Sensory asked 26/5, 2019 at 19:53

1

I'm trying to find the similarity between 2 documents. I'm using Gensim's Doc2vec to train on around 10k documents. There are around 10 string-type tags. Each tag consists of a unique word and co...
Schargel asked 27/5, 2019 at 9:34

3

I'm struggling to find a good way to compare (measure) the similarity between two different signals. I do not want to find the time delay of one signal relative to another, but I want to see how are they ...
Hurried asked 27/8, 2015 at 8:6

0

I have 100 matrices in which each row corresponds to an individual and each column to a site. I want to sort the rows by a measure of similarity such that the most similar individuals are next to e...

4

Solved

I'm trying to compute item-to-item similarity along the lines of Amazon's "Customers who viewed/purchased X have also viewed/purchased Y and Z". All of the examples and references I've seen are for...

1

Solved

I want to calculate the similarity between lists of words, for example: import math,re from collections import Counter test = ['address','ip'] list_a = ['identifiant', 'ip', 'address', 'fixe', '...
Rupee asked 28/3, 2019 at 13:35

2

There are some surprisingly good image comparison tools which find similar images even if they are not exactly the same (e.g. change in size, wallpaper, brightness/contrast). I have some example application...
Currie asked 3/8, 2014 at 14:22

1

Solved

One-sentence backdrop: I have text data from auto-transcribed talks, and I want to compare the similarity of their content (e.g. what they are talking about) to do clustering and recommendation. ...
Archway asked 7/6, 2018 at 14:25
