An Implementation of the Jaro Distance Algorithm by Matthew A. Jaro
De-duplicate short strings such as names by computing the similarity and distance between a pair of strings using wink-jaro-distance. It is part of wink, a growing family of high quality packages for Statistical Analysis, Natural Language Processing and Machine Learning in NodeJS.
It implements the Jaro distance algorithm, which scores a pair of strings by the number of matching characters and the number of transpositions among them. Inserted or deleted characters lower the similarity because they are left unmatched.
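To make the idea concrete, here is a minimal sketch of the textbook Jaro computation. It is not the source of wink-jaro-distance: the name `jaroSketch` is hypothetical, and the handling of empty strings is an assumption made only for this illustration. Two characters count as a match when they are equal and no further apart than half the longer string's length, minus one; transpositions are half the number of matched characters that appear in a different order.

```javascript
// Illustrative sketch only; `jaroSketch` is a hypothetical name and this is
// not the code shipped in wink-jaro-distance.
function jaroSketch( s1, s2 ) {
  // Assumption for this sketch: two empty strings are identical,
  // and one empty string is completely dissimilar to a non-empty one.
  if ( s1.length === 0 && s2.length === 0 ) return { distance: 0, similarity: 1 };
  if ( s1.length === 0 || s2.length === 0 ) return { distance: 1, similarity: 0 };

  // Characters match only if they are equal and within this many positions.
  var windowSize = Math.max( 0, Math.floor( Math.max( s1.length, s2.length ) / 2 ) - 1 );
  var matched1 = new Array( s1.length ).fill( false );
  var matched2 = new Array( s2.length ).fill( false );
  var matches = 0;

  // Pass 1: find matching characters inside the window.
  for ( var i = 0; i < s1.length; i += 1 ) {
    var lo = Math.max( 0, i - windowSize );
    var hi = Math.min( s2.length - 1, i + windowSize );
    for ( var j = lo; j <= hi; j += 1 ) {
      if ( !matched2[ j ] && s1[ i ] === s2[ j ] ) {
        matched1[ i ] = true;
        matched2[ j ] = true;
        matches += 1;
        break;
      }
    }
  }
  if ( matches === 0 ) return { distance: 1, similarity: 0 };

  // Pass 2: walk the matched characters of both strings in order and
  // count the positions where they disagree.
  var outOfOrder = 0;
  var k = 0;
  for ( var p = 0; p < s1.length; p += 1 ) {
    if ( !matched1[ p ] ) continue;
    while ( !matched2[ k ] ) k += 1;
    if ( s1[ p ] !== s2[ k ] ) outOfOrder += 1;
    k += 1;
  }
  var transpositions = outOfOrder / 2;

  var similarity = ( matches / s1.length +
                     matches / s2.length +
                     ( matches - transpositions ) / matches ) / 3;
  return { distance: 1 - similarity, similarity: similarity };
}
```

For 'abcdef' and 'fedcba', only 'c' and 'd' fall inside the window and they appear in swapped order, so the sketch finds 2 matches and 1 transposition, giving a similarity of roughly 0.389.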
Use npm to install:
npm install wink-jaro-distance --save
// Load Jaro Distance Function
var jaro = require( 'wink-jaro-distance' );
console.log( jaro( 'father', 'farther') );
// -> { distance: 0.04761904761904756, similarity: 0.9523809523809524 }
console.log( jaro( 'Angelina', 'Angelica') );
// -> { distance: 0.08333333333333337, similarity: 0.9166666666666666 }
console.log( jaro( 'Flikr', 'Flicker' ) );
// -> { distance: 0.09523809523809523, similarity: 0.9047619047619048 }
console.log( jaro( 'abcdef', 'fedcba' ) );
// -> { distance: 0.6111111111111112, similarity: 0.38888888888888884 }
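The de-duplication use case mentioned at the top could be sketched as follows. The helper `dedupeNames` and its parameters are hypothetical and not part of wink-jaro-distance; it accepts any function that returns a similarity between 0 and 1, so the block runs on its own without the package installed.

```javascript
// Hypothetical helper, not part of wink-jaro-distance.
// Keeps the first occurrence of each name and drops later names whose
// similarity to an already kept name reaches the threshold.
// `similarityOf( a, b )` must return a number between 0 and 1.
function dedupeNames( names, similarityOf, threshold ) {
  var kept = [];
  names.forEach( function ( name ) {
    var isDuplicate = kept.some( function ( seen ) {
      return similarityOf( seen, name ) >= threshold;
    } );
    if ( !isDuplicate ) kept.push( name );
  } );
  return kept;
}
```

With the package loaded as above, one possible call is `dedupeNames( names, function ( a, b ) { return jaro( a, b ).similarity; }, 0.9 )`. The threshold of 0.9 is only an illustrative value and would need tuning for real data. Each name is compared against every kept name, so this simple approach suits small lists.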
Computes Jaro distance and similarity between strings s1 and s2.
Original Reference: UNIMATCH: A Record Linkage System: Users Manual pp 104.
Parameters
s1 (string): first string.
s2 (string): second string.
Examples
// returns { distance: 0.08333333333333337, similarity: 0.9166666666666666 }
jaro( 'daniel', 'danielle' );
Returns object containing distance and similarity values between 0 and 1.
If you spot a bug and the same has not yet been reported, raise a new issue or consider fixing it and sending a pull request.
wink-jaro-distance is copyright 2017 GRAYPE Systems Private Limited.
It is licensed under the terms of the GNU Affero General Public License as published by the Free Software Foundation, version 3 of the License.