Fuzzy detection / preprocessing of institution names
vmenger opened this issue · 0 comments
vmenger commented
From #115: Deduce includes a long list of healthcare institutions. However, the names on the list often do not match the names as they are actually written in text. For example, an institution listed as 'De Binnentuin, zorgboerderij en dagbesteding' would almost certainly be written as 'De Binnentuin'. For hospitals, some optimizations have already been done, but for non-hospitals (healthcare_institutions.txt) there are probably more optimizations to be made, without impacting performance and false positives too much.
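
One possible preprocessing step, as a rough sketch only: generate a shortened variant of each list entry by cutting it at the first comma or opening parenthesis, and keep the variant only if it is long enough to limit false positives. The function name `name_variants` and the `min_length` threshold are hypothetical and not part of Deduce; the actual rules would need to be tuned against healthcare_institutions.txt.

```python
import re
from typing import Set


def name_variants(name: str, min_length: int = 4) -> Set[str]:
    """Return the full institution name plus a shortened variant, if any.

    Hypothetical helper, not part of Deduce. The shortened variant is the
    text before the first comma or opening parenthesis.
    """
    # Normalize whitespace so list entries compare consistently.
    full = " ".join(name.split())
    if not full:
        return set()

    variants = {full}

    # Keep only the part before the first ',' or '('.
    short = re.split(r"\s*[,(]", full, maxsplit=1)[0].strip()

    # Very short fragments are likely to cause false positives, so skip
    # them (min_length is an assumed threshold, to be tuned).
    if len(short) >= min_length:
        variants.add(short)

    return variants


if __name__ == "__main__":
    print(name_variants("De Binnentuin, zorgboerderij en dagbesteding"))
```

For the example above this would yield both the full list entry and 'De Binnentuin', so either spelling could be matched. Whether the extra variants hurt precision or lookup speed would have to be measured.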