JDNormalizer is a Java library for normalizing job titles. Given a list of ideal (normalized) job titles, it finds the best match for an input job title using the Levenshtein distance algorithm as implemented in Apache Commons Text.
- Normalizes input job titles to a list of predefined job titles.
- Uses the Levenshtein Distance algorithm to find the closest match.
- Includes unit tests to verify functionality.
- Java 21 or later
- Maven (for building and managing dependencies)
git clone https://github.com/cpereiramt/JDNormalizer.git
cd JDNormalizer
Run the command below to build the project and generate the JaCoCo coverage report. The HTML report is written to the target/site/jacoco directory.
mvn clean package
Run the unit tests with:
mvn test
You can use the generated JAR file as a dependency in your own projects.
- Copy the generated JAR file from the target directory to a libs directory in your project.
- Add the JAR file as a dependency in your pom.xml (if using Maven):
<dependency>
<groupId>com.claytonpereira</groupId>
<artifactId>JDNormalizer</artifactId>
<version><project-version></version>
<scope>system</scope>
<systemPath>${project.basedir}/libs/JDNormalizer-<project-version>.jar</systemPath>
</dependency>
public static void main(String[] args) {
    Normalizer normalizer = new Normalizer();
    String[] jobTitles = {"Java engineer", "C# engineer", "Chief Accountant"};
    for (String jt : jobTitles) {
        System.out.println("Input: " + jt + " => Normalized: " + normalizer.normalize(jt));
    }
}
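
To give a feel for what closest-match lookup by Levenshtein distance involves, here is a minimal, self-contained sketch in plain Java. It is not the library's code: JDNormalizer relies on Apache Commons Text for the distance calculation, while this sketch hand-rolls the classic dynamic-programming version so it runs without dependencies. The class name `LevenshteinSketch`, the helper `closest`, the lowercasing step, and the sample title list are all assumptions made for illustration.

```java
import java.util.List;

public class LevenshteinSketch {

    // Edit distance between a and b (insertions, deletions, substitutions),
    // computed with two rolling rows of the dynamic-programming table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into an empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: returns the candidate with the smallest distance
    // to the input. Comparing in lower case is an assumption of this sketch.
    static String closest(String input, List<String> candidates) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        String needle = input.toLowerCase();
        for (String candidate : candidates) {
            int d = distance(needle, candidate.toLowerCase());
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Illustrative list only; the real library defines its own titles.
        List<String> ideal = List.of("Architect", "Accountant");
        System.out.println(distance("kitten", "sitting"));
        System.out.println(closest("Acountant", ideal));
    }
}
```

A pure edit-distance match has no notion of meaning, so two unrelated titles of similar spelling can score close together. A real normalizer may therefore want a distance threshold or a similarity score on top of the raw distance.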