Fuzzy matching with a threshold filter in C#

I need to implement something like this:

string textToSearch = "Extreme Golf: The Showdown";
string textToSearchFor = "Golf Extreme Showdown";
int fuzzyMatchScoreThreshold = 80; // On a 0 to 100 scale
bool searchSuccessful = IsFuzzyMatch(textToSearch, textToSearchFor, fuzzyMatchScoreThreshold);
if (searchSuccessful)
{
    // we have a match.
}

Here's the function stub written in C#:

public bool IsFuzzyMatch (string textToSearch, string textToSearchFor, int fuzzyMatchScoreThreshold)
{
   bool isMatch = false;
   // do fuzzy logic here and set isMatch to true if successful match.
   return isMatch;
}

But I have no idea how to implement the logic in the IsFuzzyMatch method. Any ideas? Perhaps there is a ready-made solution for this purpose?

Sweatband answered 3/11, 2010 at 11:11 Comment(2)
You could calculate the Levenshtein distance, using words as symbols instead of characters, where words are considered equal based on their Levenshtein distance. There are many SO topics on the Levenshtein distance. - Ruffo
See https://mcmap.net/q/76149/-similar-string-algorithm/… - Coracorabel
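
The word-level idea from the first comment could be sketched roughly as below. This is only an illustration, not code from any of the posters: the class name WordLevelMatcher, the separator list, and the maxWordEdits tolerance of 1 are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class WordLevelMatcher
{
    // Generic Levenshtein distance over any token sequence.
    // 'same' decides whether two tokens count as equal.
    public static int Distance<T>(IReadOnlyList<T> a, IReadOnlyList<T> b, Func<T, T, bool> same)
    {
        var previous = new int[b.Count + 1];
        var current = new int[b.Count + 1];
        for (int j = 0; j <= b.Count; j++) previous[j] = j;

        for (int i = 1; i <= a.Count; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Count; j++)
            {
                int cost = same(a[i - 1], b[j - 1]) ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1), // insert, delete
                    previous[j - 1] + cost);                       // substitute or keep
            }
            var swap = previous; previous = current; current = swap;
        }
        return previous[b.Count];
    }

    // Hypothetical tokenizer: lower-cases and splits on a few common separators.
    private static string[] Words(string text)
    {
        return text.ToLowerInvariant().Split(
            new[] { ' ', ',', '.', ':', ';', '-', '!', '?' },
            StringSplitOptions.RemoveEmptyEntries);
    }

    // Words are the symbols; two words are "equal" when their character-level
    // distance is at most maxWordEdits (an assumed tolerance).
    public static int WordDistance(string a, string b, int maxWordEdits = 1)
    {
        return Distance<string>(Words(a), Words(b),
            (x, y) => Distance<char>(x.ToCharArray(), y.ToCharArray(), (p, q) => p == q) <= maxWordEdits);
    }
}
```

Note that reordered words still cost edits with this approach, and a fixed tolerance of 1 treats very short words (such as "a" and "I") as equal, so the tolerance probably wants tuning per word length.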

I like a combination of the Dice Coefficient, Levenshtein Distance, Longest Common Subsequence, and at times the Double Metaphone. The first three each give you a numeric score that you can compare against a threshold. I prefer to combine them in some way. YMMV.

I've just posted a blog post, "Four Functions for Finding Fuzzy String Matches in C# Extensions", that has a C# implementation for each of these.
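
As a rough illustration of the first of these, here is a minimal sketch of a Dice coefficient over character bigrams. It is not the implementation from the blog post; the class name DiceSimilarity, the choice of character bigrams, the case folding, and the 0 to 100 scaling are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class DiceSimilarity
{
    // Splits text into overlapping two-character pieces, e.g. "night" -> ni, ig, gh, ht.
    private static List<string> Bigrams(string text)
    {
        var result = new List<string>();
        for (int i = 0; i < text.Length - 1; i++)
            result.Add(text.Substring(i, 2));
        return result;
    }

    // Dice coefficient: 2 * shared bigrams / (bigrams in a + bigrams in b), in [0, 1].
    public static double Coefficient(string a, string b)
    {
        string left = (a ?? string.Empty).ToLowerInvariant();
        string right = (b ?? string.Empty).ToLowerInvariant();
        List<string> first = Bigrams(left);
        List<string> second = Bigrams(right);

        // Strings shorter than two characters have no bigrams; fall back to equality.
        if (first.Count + second.Count == 0)
            return left == right ? 1.0 : 0.0;

        // Count bigrams of the second string so repeated bigrams are matched only once each.
        var remaining = new Dictionary<string, int>();
        foreach (string bigram in second)
        {
            int count;
            remaining.TryGetValue(bigram, out count);
            remaining[bigram] = count + 1;
        }

        int shared = 0;
        foreach (string bigram in first)
        {
            int count;
            if (remaining.TryGetValue(bigram, out count) && count > 0)
            {
                remaining[bigram] = count - 1;
                shared++;
            }
        }
        return 2.0 * shared / (first.Count + second.Count);
    }

    // Assumed mapping onto the question's 0 to 100 scale.
    public static int Score(string a, string b)
    {
        return (int)Math.Round(100 * Coefficient(a, b));
    }
}
```

For example, "night" and "nacht" share only the bigram "ht" out of eight in total, so Coefficient returns 0.25. Because bigrams ignore where in the string they occur, this measure is less sensitive to reordered words than an edit distance.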

Pollinize answered 28/5, 2011 at 1:37 Comment(0)

You need the Levenshtein distance algorithm, which finds how many insert, delete, and modify operations it takes to go from one string to another. In the simplest approach, the score you compare against fuzzyMatchScoreThreshold is derived from the Levenshtein distance divided by the length of the string.
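
One way to fill in the IsFuzzyMatch stub from the question along these lines is sketched below. The normalization is an assumption, not something the answer specifies: the score is taken as 100 * (1 - distance / length of the longer string), and both strings are lower-cased first. The class name FuzzyMatcher and the SimilarityScore helper are hypothetical.

```csharp
using System;

public static class FuzzyMatcher
{
    // Two-row dynamic-programming Levenshtein distance (insert, delete, substitute).
    public static int LevenshteinDistance(string a, string b)
    {
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1), // insert, delete
                    previous[j - 1] + cost);                       // substitute or keep
            }
            int[] swap = previous; previous = current; current = swap;
        }
        return previous[b.Length];
    }

    // Assumed scoring: 100 means identical, 0 means nothing in common.
    public static int SimilarityScore(string a, string b)
    {
        a = (a ?? string.Empty).ToLowerInvariant();
        b = (b ?? string.Empty).ToLowerInvariant();
        int maxLength = Math.Max(a.Length, b.Length);
        if (maxLength == 0) return 100; // two empty strings are treated as identical
        int distance = LevenshteinDistance(a, b);
        return (int)Math.Round(100.0 * (maxLength - distance) / maxLength);
    }

    public static bool IsFuzzyMatch(string textToSearch, string textToSearchFor, int fuzzyMatchScoreThreshold)
    {
        return SimilarityScore(textToSearch, textToSearchFor) >= fuzzyMatchScoreThreshold;
    }
}
```

With this scoring, "kitten" versus "sitting" has a distance of 3 over a longer length of 7, which gives a score of 57. A plain character-level distance penalizes reordered words heavily, so the two strings in the question will likely fall well below a threshold of 80 unless the words are sorted or compared token by token first.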

Bogle answered 3/11, 2010 at 11:16 Comment(0)
