Christopher-Thornton/hmni

fuzzymerge throws error if match cannot be found


Modified example from the readme.md:

import hmni
import pandas as pd
matcher = hmni.Matcher(model='latin')  # matcher setup as in the readme
df1 = pd.DataFrame({'name': ['Al', 'Mark', 'James', 'Harold', 'Leon']})  # added name 'Leon'
df2 = pd.DataFrame({'name': ['Mark', 'Alan', 'James', 'Harold']})
merged = matcher.fuzzymerge(df1, df2, how='left', on='name')  # raises an error

This throws an error. The root cause seems to be that no match is found for 'Leon': setting threshold=0.4 runs without error, since 'Leon' is then matched to 'Alan' with similarity 0.43.
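For reference, here is a minimal sketch of the behavior I would expect from a left fuzzy merge: rows without a match are kept and paired with None instead of raising. This is not hmni's implementation. `best_match` and `fuzzy_left_join` are hypothetical helpers, and the similarity comes from the standard library's `difflib.SequenceMatcher` rather than hmni's model, so the scores (and the 0.6 threshold used here) differ from the 0.43 reported above.

```python
from difflib import SequenceMatcher


def best_match(name, candidates, threshold=0.6):
    """Return the candidate most similar to `name`, or None if no
    candidate reaches `threshold`. Hypothetical helper; difflib is a
    stand-in for hmni's similarity model."""
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, name.lower(), cand.lower()).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None


def fuzzy_left_join(left, right, threshold=0.6):
    """Pair every left name with its best right match. Unmatched names
    are kept and paired with None, as a left join would do."""
    return [(name, best_match(name, right, threshold)) for name in left]


left = ['Al', 'Mark', 'James', 'Harold', 'Leon']
right = ['Mark', 'Alan', 'James', 'Harold']

pairs = fuzzy_left_join(left, right)
for name, match in pairs:
    print(name, '->', match)
```

With this stand-in similarity, 'Leon' has no match at 0.6 and is kept with None. Lowering the threshold to 0.5 pairs it with 'Alan', which mirrors what happens with threshold=0.4 in hmni.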