seatgeek/fuzzywuzzy

process.extract broken in fuzzywuzzy=0.13

spirit1317 opened this issue · 3 comments

From version 0.13 onward, there's a mismatch between the scores returned by process.extract(scorer=fuzz.ratio) and those returned by calling fuzz.ratio directly.

# fuzzywuzzy==0.12
from fuzzywuzzy import process, fuzz

print(process.extract('OdCeny', ['producent'], scorer=fuzz.ratio))
print(fuzz.ratio('producent', 'OdCeny'))

prints:

[('producent', 40)]
40

But

# fuzzywuzzy==0.13
from fuzzywuzzy import process, fuzz

print(process.extract('OdCeny', ['producent'], scorer=fuzz.ratio))
print(fuzz.ratio('producent', 'OdCeny'))

prints:

[('producent', 67)]
40

Please let me know if this is a feature or bug.

Also, swapping the query and the choice gives a different score:

process.extract('OdCeny', ['producent'], scorer=fuzz.ratio)
#[('OdCeny', 40)]

process.extract('producent', ['OdCeny'], scorer=fuzz.ratio)
#[('OdCeny', 67)]

@spirit1317 see this comment: #288 (comment)

Use:

process.extract('producent', ['OdCeny'], scorer=fuzz.ratio, processor=None)
#[('OdCeny', 40)]

to get the same result as calling fuzz.ratio directly.
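A minimal sketch of why processor=None changes the score, using only the standard library. It assumes that the default processor in process.extract lowercases and strips each string before scoring, and it uses difflib.SequenceMatcher as a stand-in for fuzz.ratio. The helper names ratio and default_like_processor are hypothetical and not part of fuzzywuzzy; the real default processor may do more normalization than shown here.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    # Stand-in for fuzz.ratio: similarity as an integer from 0 to 100.
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def default_like_processor(s):
    # Assumption: the default processor lowercases and strips the string.
    return s.lower().strip()


# Case-sensitive comparison, as with processor=None:
# only 'd', 'e', 'n' match.
raw_score = ratio('producent', 'OdCeny')

# Comparison after lowercasing both strings:
# 'o', 'd', 'c', 'e', 'n' all match.
processed_score = ratio(default_like_processor('producent'),
                        default_like_processor('OdCeny'))

print(raw_score)        # 40
print(processed_score)  # 67
```

Under these assumptions, the 67 reported for 0.13 matches the score of the lowercased strings, and the 40 matches the case-sensitive score, which is what passing processor=None preserves.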