dhicks/comp-HOPOS

initialed names

Closed this issue · 2 comments

In production, 04_ produces false negatives with initialed names. Consider this chunk of 04_names_verif.csv:

Campbell,A. H.,Campbell,A H
Campbell,Andrew,Campbell,Andrew
Campbell,C. A.,Campbell,Charles A
Campbell,Charles A.,Campbell,Charles A
Campbell,D'Ann,Campbell,Dann
Campbell,D.,Campbell,A H
Campbell,Debra,Campbell,Debra
Campbell,Donald,Campbell,Donald T
Campbell,Donald T.,Campbell,Donald T
Campbell,Douglas,Campbell,Douglas I
Campbell,Douglas I.,Campbell,Douglas I

However, on a test set with these original names, 04_ does not produce these false negatives.
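
For anyone trying to reproduce this, here is a rough sketch of a screen for rows in a verification file where the initials on the two sides disagree. It is not the logic in 04_. It assumes the four columns are original family name, original given name, matched family name and matched given name, which the issue does not state. The helper names `initials` and `suspicious` are made up for illustration. The rows are copied from the chunk above.

```python
import csv
import io

# Rows copied from the 04_names_verif.csv chunk in this issue.
# ASSUMPTION: columns are original family, original given,
# matched family, matched given. The issue does not say so explicitly.
SAMPLE = """\
Campbell,A. H.,Campbell,A H
Campbell,Andrew,Campbell,Andrew
Campbell,C. A.,Campbell,Charles A
Campbell,Charles A.,Campbell,Charles A
Campbell,D'Ann,Campbell,Dann
Campbell,D.,Campbell,A H
Campbell,Debra,Campbell,Debra
Campbell,Donald,Campbell,Donald T
Campbell,Donald T.,Campbell,Donald T
Campbell,Douglas,Campbell,Douglas I
Campbell,Douglas I.,Campbell,Douglas I
"""


def initials(given):
    """Return the uppercase first letter of each token in a given name."""
    # Treat periods as separators so "C.A." and "C. A." behave the same.
    cleaned = given.replace(".", " ")
    return [token[0].upper() for token in cleaned.split()]


def suspicious(row):
    """True if the original initials are not a prefix of the matched initials."""
    _, orig_given, _, matched_given = row
    orig = initials(orig_given)
    matched = initials(matched_given)
    return orig != matched[:len(orig)]


rows = list(csv.reader(io.StringIO(SAMPLE)))
flagged = [row for row in rows if suspicious(row)]

for row in flagged:
    print(",".join(row))
```

On this chunk the only row flagged is `Campbell,D.,Campbell,A H`. Rows such as `Campbell,Donald` matched to `Campbell,Donald T` pass, because the check only requires the original initials to be a prefix of the matched ones. Whether the flagged row is the failure this issue is about is a guess from the data shown here.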

Test input and corresponding output
test.zip

Resolved in 31603f2