java-string-similarity

A Java library that implements several algorithms that calculate similarity between strings.

Primary language: Java. License: MIT.

java-string-similarity calculates a normalized distance or similarity score between two strings. A score of 0.0 means that the two strings are completely dissimilar, and 1.0 means that they are identical. Anything in between indicates how similar the two strings are.
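To make the idea of a normalized score concrete, here is a minimal sketch that maps a Levenshtein edit distance onto the 0.0 to 1.0 range. It is an illustration only, not this library's implementation: the class name `NormalizedLevenshteinSketch` and its methods are hypothetical, and the library's own strategies may normalize differently.

```java
// Hypothetical illustration; not part of java-string-similarity.
public final class NormalizedLevenshteinSketch {

    private NormalizedLevenshteinSketch() {}

    /** Two-row dynamic-programming Levenshtein edit distance. */
    static int editDistance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // cost of building b's prefix from an empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // cost of deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** Maps the edit distance onto [0.0, 1.0], where 1.0 means equal strings. */
    static double score(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) editDistance(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(score("kitten", "sitting")); // 3 edits over 7 characters
        System.out.println(score("abc", "abc"));        // identical strings
    }
}
```

Dividing by the length of the longer string keeps the result within the stated range, because the edit distance can never exceed that length.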

Example

In this simple example, we want to calculate a similarity score between the words McDonalds and MacMahons. We are selecting the Jaro-Winkler distance algorithm.

SimilarityStrategy strategy = new JaroWinklerStrategy();
String target = "McDonalds";
String source = "MacMahons";
StringSimilarityService service = new StringSimilarityServiceImpl(strategy);
double score = service.score(source, target); // Score is 0.90
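A common next step is to rank several candidate strings against one target. The sketch below shows one way that could look with a pluggable scoring function. Everything in it is an assumption made for illustration: `BestMatchSketch`, `rank`, and the stand-in `positionalScore` are hypothetical names, and the scorer is a deliberately naive placeholder rather than one of the library's algorithms. With the library on the classpath, the scoring function could instead delegate to `service.score(source, target)` from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Hypothetical illustration; not part of java-string-similarity.
public final class BestMatchSketch {

    private BestMatchSketch() {}

    /** Naive stand-in scorer: fraction of positions that hold the same character. */
    static double positionalScore(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        int shortest = Math.min(a.length(), b.length());
        int same = 0;
        for (int i = 0; i < shortest; i++) {
            if (a.charAt(i) == b.charAt(i)) {
                same++;
            }
        }
        return (double) same / longest;
    }

    /** Returns a copy of the candidates sorted from most to least similar to the target. */
    static List<String> rank(String target, List<String> candidates,
                             ToDoubleBiFunction<String, String> scorer) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingDouble((String candidate) -> scorer.applyAsDouble(candidate, target))
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("MacMahons", "McDonald", "McDonalds");
        // Swap in any scoring function with the same (source, target) -> double shape.
        System.out.println(rank("McDonalds", candidates, BestMatchSketch::positionalScore));
    }
}
```

Passing the scorer as a parameter mirrors the strategy-based design of the example: the ranking logic stays the same while the similarity algorithm can be changed.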

Algorithms

Installation

This project currently uses Maven for build management. You can compile, test and install the component to your local repository by running:

mvn install

Then, you can add this component to your project by adding a dependency:

<dependency>
    <groupId>net.ricecode</groupId>
    <artifactId>string-similarity</artifactId>
    <version>1.0.0</version>
</dependency>

TODO