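Install F23.StringSimilarity from NuGet. The library currently implements more than ten string similarity and distance algorithms, and each has its own strengths and weaknesses. It is worth getting a rough understanding of each one so you can pick the one that fits your business needs. The class names in the library match the algorithm names, so you can search for them by name to learn more.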
The open-source repository for F23.StringSimilarity: https://github.com/feature23/StringSimilarity.NET
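The code below compares two strings using several of these algorithms. Different algorithms give different results, so choose the one that suits your needs.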
using System;
using F23.StringSimilarity;

var str1 = "我是罗分明,我的个人主页www.luofenming.com";
var str2 = "罗分明是我,我的个人主页www.luofenming.com";

// Similarity scores: the closer to 1, the more alike the two strings are.
var jaroWinkler = new JaroWinkler();
var a = jaroWinkler.Similarity(str1, str2);
Console.WriteLine("a:" + a);

var normalizedLevenshtein = new NormalizedLevenshtein();
var b = normalizedLevenshtein.Similarity(str1, str2);
Console.WriteLine("b:" + b);

var cosine = new Cosine();
var c = cosine.Similarity(str1, str2);
Console.WriteLine("c:" + c);

var jaccard = new Jaccard();
var d = jaccard.Similarity(str1, str2);
Console.WriteLine("d:" + d);

var sorensenDice = new SorensenDice();
var e = sorensenDice.Similarity(str1, str2);
Console.WriteLine("e:" + e);

var ratcliffObershelp = new RatcliffObershelp();
var f = ratcliffObershelp.Similarity(str1, str2);
Console.WriteLine("f:" + f);

// Note: this one is a distance, not a similarity, so smaller means more alike.
var longestCommonSubsequence = new LongestCommonSubsequence();
var g = longestCommonSubsequence.Distance(str1, str2);
Console.WriteLine("g:" + g);
Running the code above produces the following results:
a:0.977777779102325
b:0.866666666666667
c:0.857142857142857
d:0.75
e:0.857142857142857
f:0.933333333333333
g:4
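To make the last value easier to interpret, here is a minimal sketch of how a longest-common-subsequence distance can be computed by hand. It assumes the distance is defined as len(s1) + len(s2) - 2 × LCS length, which is consistent with the g:4 result above; the class and method names (LcsDemo, LcsLength, LcsDistance) are made up for this illustration and are not part of F23.StringSimilarity.

```csharp
using System;

// Illustrative helper only; not the library's implementation.
public static class LcsDemo
{
    // Length of the longest common subsequence, using the classic
    // dynamic-programming table over prefixes of s1 and s2.
    public static int LcsLength(string s1, string s2)
    {
        var dp = new int[s1.Length + 1, s2.Length + 1];
        for (int i = 1; i <= s1.Length; i++)
        {
            for (int j = 1; j <= s2.Length; j++)
            {
                dp[i, j] = s1[i - 1] == s2[j - 1]
                    ? dp[i - 1, j - 1] + 1                      // characters match: extend the LCS
                    : Math.Max(dp[i - 1, j], dp[i, j - 1]);     // otherwise keep the better prefix
            }
        }
        return dp[s1.Length, s2.Length];
    }

    // Assumed definition: characters of both strings that are NOT in the LCS.
    public static int LcsDistance(string s1, string s2)
    {
        return s1.Length + s2.Length - 2 * LcsLength(s1, s2);
    }
}
```

For the two strings in the example, both are 30 characters long and their longest common subsequence has 28 characters ("罗分明" plus the identical 25-character tail), so LcsDemo.LcsDistance(str1, str2) gives 30 + 30 - 2 × 28 = 4. Unlike the other six values, this is a raw count rather than a score between 0 and 1, so it should not be compared with them directly.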
This article is from www.LuoFenMing.com