tokenmill/dictionary-annotator

Accent insensitive matching

Implement accent-insensitive matching.

Example from the UIMA mailing list:

My point is : I have lists containing elements like « événement » and I would like text like « EVENEMENT » or even « évènement » to match that list. Lowercasing texts is not a solution, as « é » is mapped to uppercase « É » in French locale, which has nothing to do with « e ».
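
One possible approach to the case in the quote, shown here as a minimal Python sketch and not as this project's implementation: decompose each string with Unicode NFD, drop the combining marks, and case-fold the result, then compare the folded keys. The names `fold_accents`, `index`, and `lookup` are hypothetical and exist only for illustration.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return a key that ignores case and combining diacritics."""
    # NFD splits a precomposed letter such as "é" into "e" + U+0301.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the combining accents.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # casefold() is a locale-independent, aggressive lowercasing.
    return stripped.casefold()


# Hypothetical dictionary: folded key -> original entry.
dictionary = ["événement"]
index = {fold_accents(entry): entry for entry in dictionary}


def lookup(token: str):
    """Return the dictionary entry matching token, or None."""
    return index.get(fold_accents(token))
```

With this folding, `lookup("EVENEMENT")` and `lookup("évènement")` both resolve to the entry « événement ». Characters that have no canonical decomposition, such as « œ » or « ø », are left unchanged by this sketch and would need an explicit mapping.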