stringdist Questions

3

Solved

I have a problem concerning very fast and efficient comparison between the substrings of two strings in my dataset, which won't run fast enough despite pretty powerful machinery. I have a data.tabl...
Tyburn asked 10/10, 2023 at 9:51

4

Solved

With the function stringdist, I can calculate the Levenshtein distance between strings: it counts the number of deletions, insertions and substitutions necessary to turn one string into another. For...
Extrude asked 30/6, 2019 at 20:17

3

I would like to calculate the Jaro-Winkler string distance in a database. If I bring the data into R (with collect) I can easily use the stringdist function from the stringdist package. But my dat...
Roberts asked 2/6, 2018 at 22:57

2

Solved

I have a 400,000 row file with manually entered addresses which need to be geocoded. There are many variations of the same addresses in the file, so it seems wasteful to be using API ca...
Musil asked 10/9, 2020 at 19:26

4

Solved

Is there a package that contains a Levenshtein distance function implemented in C or Fortran? I have many strings to compare and stringMatch from MiscPsycho is too slow for t...
Prosthesis asked 5/7, 2010 at 20:50

1

Solved

I was answering these two questions and got an adequate solution, but I had trouble passing arguments using fuzzy_join into the match_fun that I extracted from fuzzyjoin::stringdist_join. In this c...
Zasuwa asked 6/6, 2017 at 7:10

2

Solved

I have two large datasets, one with around half a million records and the other with around 70K. Both datasets contain addresses. I want to check whether any of the addresses in the smaller dataset are present in ...
Randell asked 12/3, 2017 at 15:45

2

Solved

I discovered the excellent package "stringdist" and now want to use it to compute string distances. In particular I have a set of words, and I want to print out near-matches, where "near match" is ...
Vassell asked 18/7, 2015 at 1:34

0

Inspired by the experimental fuzzy_join function from the statar package, I wrote a function myself which combines exact and fuzzy (by string distances) matching. The merging job I have to do is qui...
