negate_sequence() function buggy.
josmithua opened this issue · 4 comments
The problems that I see are in this check:
    if any(neg in word for neg in ["not", "n't", "no"]):
        negation = not negation
1st problem: negate_sequence() toggles negation after any word that merely contains the substring "no", "not", or "n't", not only after actual negation words.
For example,
negate_sequence("I know it's going to be nice today")
returns:
['i', 'know', 'i know', "not_it's", "know not_it's", "i know not_it's", 'not_nice', "not_it's not_nice", "know not_it's not_nice", 'not_today', 'not_nice not_today', "not_it's not_nice not_today"]
because "know" contains the substring "no". Matching "n't" is not really a problem because it usually comes at the end of a word, but matching on "not" causes the same issue (for example, in "nothing" or "another").
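One possible way to avoid the substring problem is to compare whole tokens instead of searching inside them. This is only a sketch: `substring_check` reproduces the check quoted above, while `token_check` and `NEGATION_WORDS` are hypothetical names that are not part of the project.

```python
# Assumed set of standalone negation words; extend as needed.
NEGATION_WORDS = {"no", "not"}

def substring_check(word):
    # The check as quoted in this issue: matches anywhere inside the word.
    return any(neg in word for neg in ["not", "n't", "no"])

def token_check(word):
    # Hypothetical replacement: whole-word match, plus the "n't" suffix.
    return word in NEGATION_WORDS or word.endswith("n't")

for w in ["know", "another", "not", "don't"]:
    print(w, substring_check(w), token_check(w))
# know True False
# another True False
# not True True
# don't True True
```

With whole-token matching, "know" and "another" no longer flip the negation flag, while "not" and "don't" still do.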
2nd problem: The check should compare against the lowercased version of the word (word.lower()).
For example,
negate_sequence("I DON'T like this movie")
will not negate anything because the check only looks for "n't", not "N'T" or any other case variation. The same goes for a text like "No one with half a brain would watch this movie more than once", because "No" doesn't match "no".
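A case-insensitive check could look roughly like the sketch below. The helper name `is_negation` and the `DELIMS` punctuation set are assumptions made for illustration, not code from the project; the idea is simply to normalize the word before testing it.

```python
# Assumed punctuation to strip from the ends of a word before comparing.
DELIMS = "?.,!:;"

def is_negation(word):
    # Normalize first: drop surrounding punctuation and lowercase the word.
    token = word.strip(DELIMS).lower()
    # Whole-word match for "no"/"not", suffix match for contractions.
    return token in {"no", "not"} or token.endswith("n't")

print(is_negation("DON'T"))  # True
print(is_negation("No"))     # True
print(is_negation("know"))   # False
```

Inside negate_sequence(), the existing `if any(...)` condition could then be swapped for a call to a helper like this, assuming the rest of the function stays unchanged.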