Alternate Algorithms from TextDistance
BradKML commented
TextDistance implements several other algorithms that may be worth considering.
Edit based
- MLIPNS http://www.sial.iias.spb.su/files/386-386-1-PB.pdf
- Strcmp95 http://cpansearch.perl.org/src/SCW/Text-JaroWinkler-0.1/strcmp95.c
- Needleman-Wunsch https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm
- Gotoh http://bioinfo.ict.ac.cn/~dbu/AlgorithmCourses/Lectures/LOA/Lec6-Sequence-Alignment-Affine-Gaps-Gotoh1982.pdf
- Smith-Waterman https://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm
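
For reference, here is a rough sketch of how two of the alignment scores above could look in plain Python. It is not TextDistance's implementation. The function names and the default `match`/`mismatch`/`gap` values are assumptions for illustration, and it uses a linear gap penalty (Gotoh's affine gaps are not covered).

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score (Needleman-Wunsch) with a linear gap penalty."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment score (Smith-Waterman): cells are floored at 0
    and the result is the best cell anywhere in the matrix."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

Both return raw scores (higher means more similar), so they would need normalizing before being compared with distance-style metrics.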
Token based
- Tversky index https://en.wikipedia.org/wiki/Tversky_index
- Tanimoto distance https://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_similarity_and_distance
- Monge-Elkan https://www.academia.edu/200314/Generalized_Monge-Elkan_Method_for_Approximate_Text_String_Comparison
- Bag distance https://github.com/Yomguithereal/talisman/blob/master/src/metrics/bag.js
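
A minimal sketch of two of the token-based measures, assuming tokens are compared as plain sets (Tversky) or multisets (bag distance). The function names, the default `alpha`/`beta` values, and the choice to return 1.0 for two empty inputs are my assumptions, not taken from TextDistance.

```python
from collections import Counter


def tversky_index(x, y, alpha=1.0, beta=1.0):
    """Tversky index: |X & Y| / (|X & Y| + alpha*|X - Y| + beta*|Y - X|).

    alpha = beta = 1 gives the Jaccard (Tanimoto) coefficient,
    alpha = beta = 0.5 gives the Sorensen-Dice coefficient.
    """
    x, y = set(x), set(y)
    common = len(x & y)
    denom = common + alpha * len(x - y) + beta * len(y - x)
    # Assumption: two empty inputs are treated as identical.
    return common / denom if denom else 1.0


def bag_distance(a, b):
    """Bag distance: the larger of the two multiset differences."""
    ca, cb = Counter(a), Counter(b)
    # Counter subtraction keeps only positive counts.
    return max(sum((ca - cb).values()), sum((cb - ca).values()))
```

Both accept any iterable of hashable tokens, so passing a string compares characters and passing `text.split()` compares words.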
Alternate methods
- Ratcliff-Obershelp similarity https://en.wikipedia.org/wiki/Gestalt_Pattern_Matching
- Normalized compression distance (requires compression) https://en.wikipedia.org/wiki/Normalized_compression_distance#Normalized_compression_distance
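
Both of these can be approximated with the Python standard library alone, which may help when judging how much work a port would be. The sketch below assumes `difflib.SequenceMatcher` as a stand-in for Ratcliff-Obershelp (the Python docs describe it as a somewhat fancier relative of that algorithm) and `zlib` as the compressor for NCD. The function names are hypothetical.

```python
import zlib
from difflib import SequenceMatcher


def ratcliff_obershelp(a, b):
    """Gestalt-style similarity in [0, 1]: 2 * matched characters / total length."""
    # autojunk=False disables difflib's heuristic that ignores very frequent items.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
    (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), with C = zlib-compressed size."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

NCD depends heavily on the compressor and is unreliable for very short strings, where the compressor's fixed overhead dominates the result.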
Main source: https://github.com/life4/textdistance