PercolatorMatching by Jim_Varney

<PackageReference Include="PercolatorMatching" Version="1.1.0" />

 PercolatorMatching 1.1.0

A simple dll that contains a matching class to match strings and to calculate the score of similarity between the two strings using the Ratcliff-Obershelp algorithm.

<package xmlns="http://schemas.microsoft.com/packaging/2011/08/nuspec.xsd">
  <metadata>
    <id>PercolatorMatching</id>
    <version>1.1</version>
    <title>Percolator Matching</title>
    <authors>Jim_Varney</authors>
    <owners>CoopDigity</owners>
    <licenseUrl>http://opensource.org/licenses/MIT</licenseUrl>
    <projectUrl>http://www.coopdigity.com/</projectUrl>
    <iconUrl>http://www.coopdigity.com/wp-content/uploads/2014/04/logo3.png</iconUrl>
    <requireLicenseAcceptance>true</requireLicenseAcceptance>
    <description>A simple dll that contains a matching class to match strings and to calculate the score of similarity between the two strings using the Ratcliff-Obershelp algorithm.</description>
    <summary>A simple dll that contains a matching class to match strings and to calculate the score of similarity between the two strings using the Ratcliff-Obershelp algorithm.</summary>
    <releaseNotes>I originally built this after finding out that the Fuzzy Lookup and Fuzzy Grouping components of SSIS were only available in Enterprise editions of SQL Server. I have used it to scan database tables for possible duplicate entries and write the results to another table for a user to review later, and in other applications as well.

Reference the DLL and import the "Percolator.Matching" namespace, then create a new instance of "Fuzzylator".

The "ThresholdPercentage" is the score that two strings must reach in order to be deemed similar. It can be set when creating the object, or later. If no threshold is set, it defaults to "Zero" percent.

There are several overloads of the "IsSimilar" method to accommodate a few different scenarios.
--A score is calculated during every check. The optional out parameter returns that score to the caller, so it does not have to be calculated again later.
--An optional ThresholdPercentage can be passed to a single call; that percentage is then used instead of the one set on the instance, for that call only.

The "IsUPCSimilar" method is a specialized check that is streamlined specifically for UPC strings. It does not search for longest common substrings; it just compares the digits position by position and returns the score.
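
As a rough illustration of the digit-by-digit comparison described above, here is a minimal C# sketch. It is not the library's source: the class name "PositionalScoreSketch", the method name "Score", the 0 to 1 scale, and dividing by the longer length are all assumptions made for this example.

```csharp
using System;

static class PositionalScoreSketch
{
    // Hypothetical helper, not part of PercolatorMatching.
    // Returns the fraction of positions at which both strings hold the same character.
    public static double Score(string a, string b)
    {
        int longer = Math.Max(a.Length, b.Length);
        if (longer == 0) return 1.0; // assumption: two empty strings count as identical

        int shorter = Math.Min(a.Length, b.Length);
        int same = 0;

        // Walk both strings in step; no substring search is involved.
        for (int i = 0; i != shorter; i++)
        {
            if (a[i] == b[i]) same++;
        }

        // Assumption: positions missing from the shorter string count as mismatches.
        return (double)same / longer;
    }
}
```

Under these assumptions, two 12-digit codes that differ only in the last digit score 11/12.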

"GetScore" returns the score between the two strings, using the Ratcliff/Obershelp algorithm.

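For readers unfamiliar with Ratcliff/Obershelp, the idea is: find the longest common substring, count its characters as matched, then repeat on the text to its left and to its right. The score is twice the matched count divided by the combined length of both strings. The following is a minimal, unoptimized C# sketch of that idea, not the code inside "GetScore". The names "RatcliffObershelpSketch", "Score" and "Matched", the 0 to 1 scale, and the case-sensitive comparison are assumptions made for this example.

```csharp
static class RatcliffObershelpSketch
{
    // Hypothetical helper, not part of PercolatorMatching.
    // Returns 2 * matched / (a.Length + b.Length), a value from 0 to 1.
    public static double Score(string a, string b)
    {
        int total = a.Length + b.Length;
        if (total == 0) return 1.0; // assumption: two empty strings count as identical
        return 2.0 * Matched(a, 0, a.Length, b, 0, b.Length) / total;
    }

    // Counts matched characters between a[aLo..aHi) and b[bLo..bHi).
    static int Matched(string a, int aLo, int aHi, string b, int bLo, int bHi)
    {
        int bestLen = 0, bestA = 0, bestB = 0;

        // Brute-force search for the longest common substring of the two ranges.
        // On ties the first one found is kept; other implementations may differ.
        for (int i = aLo; i != aHi; i++)
        {
            for (int j = bLo; j != bHi; j++)
            {
                int k = 0;
                while (true)
                {
                    if (i + k == aHi || j + k == bHi) break;
                    if (a[i + k] != b[j + k]) break;
                    k++;
                }
                if (k > bestLen) { bestLen = k; bestA = i; bestB = j; }
            }
        }

        if (bestLen == 0) return 0; // nothing in common in these ranges

        // Recurse on the text to the left and to the right of the match.
        return bestLen
            + Matched(a, aLo, bestA, b, bLo, bestB)
            + Matched(a, bestA + bestLen, aHi, b, bestB + bestLen, bHi);
    }
}
```

Under these assumptions, Score("Test String", "A Test String") works out to 22/24, about 0.917: the 11 characters of "Test String" are matched and the combined length is 24. How the library itself scales the score or handles case may differ.
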
"GetUPCScore" again is a streamlined algorithm specifically for a UPC string.

Examples =&gt;

Using the similarity bools:

var fuz = new Fuzzylator(ThresholdPercentage.Eighty);

string str1 = "Test String"; 
string str2 = "A Test String";

if (fuz.IsSimilar(str1, str2))
{
    //Do something
}

double score;
if (fuz.IsSimilar(str1, str2, out score))
{
    //Do something
    Console.WriteLine(score); //score now contains the score of the two strings
}

if (fuz.IsSimilar(str1, str2, ThresholdPercentage.Ninety))
{
    //Do something
    //The IsSimilar check uses a Ninety percent threshold for this one time.
}

Getting the score directly:

double directScore = fuz.GetScore(str1, str2, true); //directScore now holds the score between str1 and str2, optionally ignoring case.</releaseNotes>
    <copyright>CoopDigity</copyright>
    <tags>matching fuzzy similarity mini pattern percolator</tags>
  </metadata>
</package>