vi3k6i5/flashtext

Provide replacement info to replace_keyword API

gabyx opened this issue · 1 comment

gabyx commented

It's desirable to know what was replaced and where, and also whether any replacement happened at all.

Hello @gabyx,

I can't think of a direct solution to your problem, but here is a workaround that comes to mind and should help with your issue:

  1. While adding your keywords, construct a dictionary with the replacing keyword as the key and the keyword to be replaced as the value.
  2. Instead of calling replace_keywords directly, first call extract_keywords to check whether any of the keywords you want to replace are found. This tells you whether anything in the sentence would be replaced.
  3. With extract_keywords you can also pass span_info=True to get the position of each match.

>>> from flashtext import KeywordProcessor
>>> keyword_processor = KeywordProcessor()
>>> # Maps each replacing keyword (clean name) back to the keyword it replaces
>>> keyword_dict = {'New York' : 'Big Apple', 'Trump' : 'Bay Area'}
>>> keyword_processor.add_keyword('Big Apple', 'New York')
>>> keyword_processor.add_keyword('Bay Area', 'Trump')
>>> # span_info=True returns (clean_name, start, end) for every match
>>> keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.', span_info=True)
>>> # Look up the keyword that would be replaced, keeping the match positions
>>> [(keyword_dict.get(key[0]), key[1], key[2]) for key in keywords_found]
[('Big Apple', 7, 16), ('Bay Area', 21, 29)]
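
If you also want the replaced sentence together with a record of each change, the spans can be applied by hand. The following is only a rough sketch: `replace_with_info` is a hypothetical helper, not part of flashtext, and it assumes the spans have the `(clean_name, start, end)` shape that `extract_keywords(..., span_info=True)` returns and do not overlap. I have not checked that its output matches `replace_keywords` in every edge case.

```python
from typing import List, Tuple

# (replacement, start, end): the shape assumed for span_info=True results
Span = Tuple[str, int, int]


def replace_with_info(text: str, spans: List[Span]):
    """Hypothetical helper: apply the spans to `text` and report each change.

    Returns (new_text, changes). Each change is
    (original_substring, replacement, start, end), where start and end
    index into the ORIGINAL text. Spans are assumed not to overlap.
    """
    pieces = []
    changes = []
    cursor = 0
    for replacement, start, end in sorted(spans, key=lambda s: s[1]):
        pieces.append(text[cursor:start])  # untouched text before the match
        pieces.append(replacement)         # the replacing keyword
        changes.append((text[start:end], replacement, start, end))
        cursor = end
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces), changes


# Spans written out by hand here; in practice they would come from
# keyword_processor.extract_keywords(sentence, span_info=True)
sentence = "I love big Apple and Bay Area."
spans = [("New York", 7, 16), ("Trump", 21, 29)]
new_text, changes = replace_with_info(sentence, spans)
```

An empty `changes` list means nothing was replaced, which covers the "did any replacement happen at all" part of the question.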

Kind Regards,
Nandan Thakur