
🪼 A Python library for approximate and phonetic matching of strings.

Primary language: Jupyter Notebook. License: MIT.


jellyfish is a library for approximate & phonetic matching of strings.

Source: https://github.com/jamesturk/jellyfish

Documentation: https://jamesturk.github.io/jellyfish/

Issues: https://github.com/jamesturk/jellyfish/issues

Included Algorithms

String comparison:

  • Levenshtein Distance
  • Damerau-Levenshtein Distance
  • Jaccard Index
  • Jaro Distance
  • Jaro-Winkler Distance
  • Match Rating Approach Comparison
  • Hamming Distance
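
For intuition about what the edit-distance functions in this list measure, here is a minimal pure-Python sketch of Levenshtein distance. It is an illustration only, not jellyfish's implementation, and the name `levenshtein` is a stand-in chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`.

    Illustrative sketch; not the code jellyfish ships.
    """
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the first i-1 characters of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("jellyfish", "smellyfish"))  # 2
print(levenshtein("kitten", "sitting"))        # 3
```

Only two rows of the dynamic-programming table are kept at a time, so memory use grows with the shorter string rather than with the product of both lengths.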

Phonetic encoding:

  • American Soundex
  • Metaphone
  • NYSIIS (New York State Identification and Intelligence System)
  • Match Rating Codex
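
To show what a phonetic encoding produces, here is a minimal pure-Python sketch of American Soundex following the commonly published rules: keep the first letter, map the remaining consonants to digits, collapse adjacent equal codes, and pad to four characters. It is an illustration under those assumptions, not jellyfish's implementation, and edge-case behavior (empty input, non-ASCII letters) may differ from the library's. The names `soundex` and `_CODES` are stand-ins for this example.

```python
# Letter groups and their digits in American Soundex.
_CODES = {}
for _digit, _letters in {
    "1": "BFPV",
    "2": "CGJKQSXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return a four-character Soundex code (illustrative sketch)."""
    # Assumption: only ASCII letters are considered; everything else is dropped.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    last_code = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal codes.
            continue
        code = _CODES.get(ch, "")  # vowels and Y have no code
        if code and code != last_code:
            result += code
            if len(result) == 4:
                break
        # A vowel resets last_code, so a repeated consonant after it counts again.
        last_code = code
    return result.ljust(4, "0")


print(soundex("Jellyfish"))  # J412
print(soundex("Robert"))     # R163
print(soundex("Rupert"))     # R163
```

Names that sound alike, such as "Robert" and "Rupert", map to the same code, which is what makes phonetic encodings useful as lookup keys.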

Example Usage

>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
>>> jellyfish.jaro_similarity('jellyfish', 'smellyfish')
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')

>>> jellyfish.metaphone('Jellyfish')
>>> jellyfish.soundex('Jellyfish')
>>> jellyfish.nysiis('Jellyfish')
>>> jellyfish.match_rating_codex('Jellyfish')
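
A common use for the comparison functions is picking the closest candidate from a list. The sketch below is a hypothetical helper, not part of jellyfish: `closest_match` accepts any two-argument distance function, and `mismatch_count` is only a crude stand-in so the example runs without third-party packages.

```python
from typing import Callable, Iterable, Optional


def closest_match(
    query: str,
    candidates: Iterable[str],
    distance: Callable[[str, str], float],
    max_distance: Optional[float] = None,
) -> Optional[str]:
    """Return the candidate with the smallest distance to `query`.

    Hypothetical helper for illustration. Returns None when there are no
    candidates or the best one is farther away than `max_distance`.
    Ties go to the candidate that appears first.
    """
    best = None
    best_distance = None
    for candidate in candidates:
        d = distance(query, candidate)
        if best_distance is None or d < best_distance:
            best, best_distance = candidate, d
    if best is None:
        return None
    if max_distance is not None and best_distance > max_distance:
        return None
    return best


def mismatch_count(a: str, b: str) -> int:
    """Stand-in distance: differing positions plus the length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


names = ["jellyfish", "starfish", "smellyfish"]
print(closest_match("jellyfihs", names, mismatch_count))  # jellyfish
```

With jellyfish installed, `jellyfish.levenshtein_distance` or `jellyfish.damerau_levenshtein_distance` could be passed as `distance` in place of the stand-in, since both take two strings and return a number.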