Incorrect levenshtein distance for completely edited strings
Closed this issue · 1 comments
When I calculate the ratio of "abcdef" - "fedcba" , it results in 17, even though I expected 0.
The ratio calculation is as I understand it: r = ( 1 - d/L)*100 ,
with d being the Levenshtein distance and L the sum of the two compared strings.
In this library the levenshtein distance is valued with 1 for each insert/delete and 2 for each replace.
The levenshtein distance in this library, for these two strings should be 12 (2 for each replace), resulting in a ratio = (1 - 12/12)*100 = 0
However, in your library, the ratio results in 17, instead of 0. This is because the distance it calculates is 10 instead of 12, resulting in (1-10/12)*100=17 .
This seems to be the case for string of any length, whith 100% replacements, as if 1 replacement is missed.
Never mind, it naturally calculates the minimum distance:
a b c d e f
f e d c b a
1 2 2 0 2 2 1 = 10
It does calculate a distance of 12 in abcdef -> uvwxyz . .
Can close this 'issue'