record-linkage Questions
3
Solved
How can I use fuzzy matching in pandas to detect duplicate rows (efficiently)?
How to find duplicates of one column vs. all the other ones without a gigantic for loop of converting row_i toString(...
Antilogarithm asked 14/9, 2016 at 12:13
2
Solved
I'm reasonably new to machine learning and have done a few projects in Python. I'm looking for advice on how to approach the problem below, which I believe could be automated.
A user in a data quality...
Haymo asked 16/2, 2017 at 16:40
1
Solved
Let's say that I have an MDM system (Master Data Management), whose primary application is to detect and prevent duplication of records.
Every time a sales rep enters a new customer in the system...
Swelter asked 12/4, 2017 at 10:16
1
My team has been stuck running a fuzzy logic algorithm on two large datasets.
The first (subset) is about 180K rows and contains names, addresses, and emails for the people that we need to match...
Rieger asked 13/4, 2015 at 18:55
2
I have a question that is somewhat high level, so I'll try to be as specific as possible.
I'm doing a lot of research that involves combining disparate data sets with header information that refer...
Garniture asked 8/3, 2011 at 19:55
6
Solved
I have a large database (potentially in the millions of records) with relatively short strings of text (on the order of street address, names, etc).
I am looking for a strategy to remove inexact d...
Edva asked 25/8, 2011 at 19:25
2
Solved
I'm trying to use the Dedupe package to merge a small, messy dataset into a canonical table. Since the canonical table is very large (122 million rows), I can't load it all into memory.
The current appr...
Reuter asked 15/7, 2015 at 18:09
2
Solved
I have the following problem and was thinking I could use machine learning but I'm not completely certain it will work for my use case.
I have a data set of around a hundred million records contai...
Baseline asked 5/5, 2013 at 3:36
3
Solved
I'm developing an application which must be able to find & merge duplicates among hundreds of thousands of contact records stored in a SQL Server DB. I have to compare all the columns in the t...
Louise asked 4/10, 2013 at 11:54