How to avoid unwanted matches inside hyphenated words?
jackmen opened this issue · 1 comment
jackmen commented
Hi,
First of all, I really like flashtext! I use it in a project, and I ran into the following behavior, which I have not found a way to avoid:
from flashtext import KeywordProcessor
keyword_processor = KeywordProcessor(case_sensitive=False)
keyword_processor.add_keyword('Art')
keywords_found = keyword_processor.extract_keywords('state-of-the-art')
keywords_found
['Art']
I don't want flashtext to match cases like this one.
I appreciate your answer,
jackmen commented
OK,
I found out how to avoid this behavior by adding the characters to the set of non-word boundaries:
keyword_processor.non_word_boundaries.update(["-", "/"])
I must have overlooked it in the docs!
Thanks anyway!