fuzzystring

Approximate String Comparison in C#

Primary language: C#. License: Eclipse Public License 1.0 (EPL-1.0).

Originally hosted on CodePlex: http://fuzzystring.codeplex.com

Project Description

FuzzyString is a library I developed at my day job to reconcile naming conventions between different models of the electric grid. I have stripped out the power-system-specific code and packaged the rest as a set of string extensions for determining approximate equality between two strings. All of the algorithms were taken from online resources, translated into C#, and compiled into this library. I found several similar open-source implementations, but none for .NET/C#. Referencing the *.dll in your project gives you access to the ApproximatelyEquals() extension and to the individual extensions it uses under the hood.

Algorithms included in this project

Approximate String Comparison

Note: This sample is taken from the legacy documentation on CodePlex.

All of the algorithms are exposed and can be called individually to obtain their raw results. They have also been combined so that a chosen subset can be used to judge the approximate equality of two strings. This is done through the ApproximatelyEquals extension, by setting the desired FuzzyStringComparisonOptions and FuzzyStringComparisonTolerance.
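To give a feel for what a raw result from a single algorithm might look like, here is a minimal sketch of a longest-common-subsequence length written as a string extension. The class and method names are hypothetical and chosen only for illustration; this is a textbook dynamic-programming version, not necessarily how the library implements it.

```csharp
using System;

// Hypothetical helper for illustration only; not the library's own code.
public static class LcsSketch
{
    // Returns the length of the longest common subsequence of a and b,
    // i.e. the longest run of characters that appear in both strings
    // in the same order, though not necessarily adjacent.
    public static int LongestCommonSubsequenceLength(this string a, string b)
    {
        // dp[i, j] holds the LCS length of a[0..i) and b[0..j).
        var dp = new int[a.Length + 1, b.Length + 1];

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                dp[i, j] = a[i - 1] == b[j - 1]
                    ? dp[i - 1, j - 1] + 1                      // extend a match
                    : Math.Max(dp[i - 1, j], dp[i, j - 1]);     // skip one character
            }
        }

        return dp[a.Length, b.Length];
    }
}
```

With this sketch, "kevin".LongestCommonSubsequenceLength("kevyn") evaluates to 4, since the two strings share the subsequence "kevn".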

To compare two strings approximately and get a boolean result, proceed as follows:

using System.Collections.Generic; // for List<T>

string source = "kevin";
string target = "kevyn";

List<FuzzyStringComparisonOptions> options = new List<FuzzyStringComparisonOptions>();

// Choose which algorithms should weigh in for the comparison
options.Add(FuzzyStringComparisonOptions.UseOverlapCoefficient);
options.Add(FuzzyStringComparisonOptions.UseLongestCommonSubsequence);
options.Add(FuzzyStringComparisonOptions.UseLongestCommonSubstring);

// Choose the relative strength of the comparison - is it almost exactly equal? or is it just close?
FuzzyStringComparisonTolerance tolerance = FuzzyStringComparisonTolerance.Strong;

// Get a boolean determination of approximate equality
bool result = source.ApproximatelyEquals(target, options, tolerance);
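
The idea of letting several algorithms "weigh in" and then applying a tolerance can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: the names (SketchTolerance, CombineSketch, RoughlyEquals), the two metric definitions, the averaging step, and the cutoff values are hypothetical and are not taken from FuzzyString itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names throughout; the cutoffs and the averaging are assumptions,
// not FuzzyString's actual logic.
public enum SketchTolerance { Weak, Normal, Strong }

public static class CombineSketch
{
    // Overlap coefficient over distinct characters: |A ∩ B| / min(|A|, |B|).
    public static double OverlapCoefficient(string a, string b)
    {
        var setA = new HashSet<char>(a);
        var setB = new HashSet<char>(b);
        int smaller = Math.Min(setA.Count, setB.Count);
        if (smaller == 0) return 0.0;
        return (double)setA.Intersect(setB).Count() / smaller;
    }

    // Longest common substring length divided by the shorter string's length.
    public static double CommonSubstringRatio(string a, string b)
    {
        int shorter = Math.Min(a.Length, b.Length);
        if (shorter == 0) return 0.0;

        // run[i, j] is the length of the common substring ending at a[i-1], b[j-1].
        var run = new int[a.Length + 1, b.Length + 1];
        int best = 0;
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                if (a[i - 1] == b[j - 1])
                {
                    run[i, j] = run[i - 1, j - 1] + 1;
                    best = Math.Max(best, run[i, j]);
                }
            }
        }
        return (double)best / shorter;
    }

    // Averages the selected similarity scores (each in [0, 1]) and compares
    // the mean with an assumed, tolerance-dependent cutoff.
    public static bool RoughlyEquals(
        string source,
        string target,
        IEnumerable<Func<string, string, double>> metrics,
        SketchTolerance tolerance)
    {
        double cutoff;
        switch (tolerance)
        {
            case SketchTolerance.Strong: cutoff = 0.75; break; // assumed value
            case SketchTolerance.Normal: cutoff = 0.50; break; // assumed value
            default: cutoff = 0.25; break;                     // assumed value
        }

        var scores = metrics.Select(m => m(source, target)).ToList();
        return scores.Count > 0 && scores.Average() >= cutoff;
    }
}
```

Under these assumed definitions, "kevin" and "kevyn" score 0.8 on the overlap coefficient and 0.6 on the common substring ratio, for a mean of about 0.7. The sketch therefore treats them as equal at the Normal tolerance but not at Strong. The real library may weigh or threshold its algorithms differently.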