vi3k6i5/flashtext

How to avoid unwanted matches in hyphenated words?

jackmen opened this issue · 1 comment

Hi,

First of all, I really like flashtext! I use it in a project and came across the following behavior, which I have not been able to avoid:

from flashtext import KeywordProcessor

keyword_processor = KeywordProcessor(case_sensitive=False)
keyword_processor.add_keyword('Art')
# 'art' at the end of the hyphenated compound is reported as a match
keywords_found = keyword_processor.extract_keywords('state-of-the-art')
print(keywords_found)
# ['Art']

I don't want flashtext to match cases like this one.

I'd appreciate any help.

OK,

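I found out how to avoid this behavior: add the offending characters to the set of non-word boundaries, so they are treated as part of a word: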

keyword_processor.non_word_boundaries.update(["-", "/"])

I must have overlooked it in the docs!

Thanks anyway!