seatgeek/fuzzywuzzy

How to decrease False positive matches? (process.extract / WRatio)

Pranav082001 opened this issue · 3 comments

I am using the process.extract method, and I know it uses WRatio under the hood to calculate the score. In the following case I get a very high score of 90 even though the strings are hardly equal. Is there any way to fix this in WRatio?

inp_name = "america"
name_list = ["american Futures and Options Exchange"]
process.extractOne(inp_name, name_list)

Output--> ('american Futures and Options Exchange', 90.0, 0)

PS: I know about the alternatives such as fuzz.ratio, partial_ratio, and token_sort_ratio, but WRatio works pretty well for my use case, so any workaround would be appreciated. Thanks!

Maybe write your own version of WRatio that does not fall back to the partial versions of the algorithms.
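To see where the 90 comes from: as far as I can tell from the fuzzywuzzy source, WRatio switches to the partial scorers once one string is more than 1.5 times longer than the other, and weights their result by 0.9 (0.6 if the length ratio exceeds 8). The sketch below reproduces that arithmetic with only the standard library. `simple_partial_ratio` is a hypothetical, simplified stand-in for `fuzz.partial_ratio`, not the library's implementation.

```python
from difflib import SequenceMatcher


def simple_partial_ratio(s1, s2):
    """Best similarity (0-100) of the shorter string against any
    equally long window of the longer string (simplified stand-in)."""
    shorter, longer = sorted((s1, s2), key=len)
    if not shorter:
        return 0.0
    n = len(shorter)
    best = 0.0
    for start in range(len(longer) - n + 1):
        window = longer[start:start + n]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return 100.0 * best


query = "america"
choice = "american futures and options exchange"  # lowercased, as after preprocessing

# Assumed from the fuzzywuzzy source: weight for partial scores when the
# length ratio is between 1.5 and 8.
PARTIAL_SCALE = 0.9

length_ratio = len(choice) / len(query)          # well above 1.5, below 8
partial = simple_partial_ratio(query, choice)    # "america" occurs verbatim
print(round(length_ratio, 2), partial, round(partial * PARTIAL_SCALE))
```

Because "america" appears verbatim inside "american", the partial score is a perfect 100, and the 0.9 weight turns it into the 90 reported above.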

Could you please help me? Do I need to set try_partial to False inside def WRatio, i.e. change this line?

try_partial = True

Yes, that's what I would try.
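For anyone landing here later, a minimal sketch of what such a custom scorer could look like. It keeps only the non-partial branch (plain ratio plus token sort and token set ratios weighted by 0.95, the weight I believe WRatio uses). It is written against the standard library's difflib so it runs on its own. All names here (`wratio_no_partial`, `_clean`, and the other helpers) are hypothetical, and the scores will not match fuzzywuzzy's exactly.

```python
import re
from difflib import SequenceMatcher

UNBASE_SCALE = 0.95  # assumed weight for token-based scores, as in WRatio


def _clean(s):
    # Rough stand-in for fuzzywuzzy's preprocessing: lowercase,
    # turn runs of non-alphanumerics into single spaces, trim.
    return re.sub(r"[^a-z0-9]+", " ", s.lower()).strip()


def _ratio(a, b):
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def _token_sort_ratio(a, b):
    # Compare the strings with their tokens in alphabetical order.
    return _ratio(" ".join(sorted(a.split())), " ".join(sorted(b.split())))


def _token_set_ratio(a, b):
    # Compare the shared tokens against each string's shared + unique tokens.
    t1, t2 = set(a.split()), set(b.split())
    common = " ".join(sorted(t1 & t2))
    c1 = (common + " " + " ".join(sorted(t1 - t2))).strip()
    c2 = (common + " " + " ".join(sorted(t2 - t1))).strip()
    return max(_ratio(common, c1), _ratio(common, c2), _ratio(c1, c2))


def wratio_no_partial(s1, s2):
    """Weighted ratio that never falls back to partial matching."""
    p1, p2 = _clean(s1), _clean(s2)
    if not p1 or not p2:
        return 0
    base = _ratio(p1, p2)
    tsor = _token_sort_ratio(p1, p2) * UNBASE_SCALE
    tser = _token_set_ratio(p1, p2) * UNBASE_SCALE
    return round(max(base, tsor, tser))


print(wratio_no_partial("america", "american Futures and Options Exchange"))
print(wratio_no_partial("America", "america"))
```

With fuzzywuzzy installed, a function like this can be passed through the `scorer` argument, e.g. `process.extractOne(inp_name, name_list, scorer=wratio_no_partial)`, optionally together with `score_cutoff` to drop weak matches. The trade-off is that real substring matches (a short query that is a genuine abbreviation of a long name) will also score low.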