vivekn/sentiment

negate_sequence() function buggy.

josmithua opened this issue · 4 comments

The problems that I see are in:

if any(neg in word for neg in ["not", "n't", "no"]):
    negation = not negation
1st problem: negate_sequence() negates the words that follow any word containing the substring "no", "not", or "n't", even when that word is not a negation.

For example,

negate_sequence("I know it's going to be nice today")

returns:

['i', 'know', 'i know', "not_it's", "know not_it's", "i know not_it's", 'not_nice', "not_it's not_nice", "know not_it's not_nice", 'not_today', 'not_nice not_today', "not_it's not_nice not_today"]

because "know" contains the substring "no". Matching on "n't" is not really a problem because it usually comes at the end of a word, but substring matching on "not" has the same issue (for example, "nothing" or "another").
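One possible way around the substring problem is to compare whole tokens instead. The snippet below is only a sketch: `is_negation` and `NEGATION_WORDS` are hypothetical names that do not exist in the repository, and it assumes the input has already been lowercased and split on whitespace.

```python
# Hypothetical helper, not part of the repository.
NEGATION_WORDS = {"no", "not"}

def is_negation(word):
    """Return True only for a whole-word negation or a word ending in n't."""
    return word in NEGATION_WORDS or word.endswith("n't")

words = "i know it's going to be nice today".split()
# Collect the tokens that would flip the negation flag.
flagged = [w for w in words if is_negation(w)]
print(flagged)
```

With whole-token matching, "know" no longer counts as a negation, while "not" and "don't" still do.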

2nd problem: You should be comparing with the .lower() version of the word.

For example,

negate_sequence("I DON'T like this movie")

will not negate anything because it's only checking for "n't", not "N'T" or any other case variation. Same thing for a text like "No one with half a brain would watch this movie more than once", because the "No" doesn't match "no".
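The case problem could be handled by lowercasing each word before the comparison. This is a simplified sketch, not the project's code: `negation_flags` is a hypothetical name, it matches whole tokens rather than substrings, and it leaves out the bigram/trigram generation and everything else the real function does.

```python
# Hypothetical sketch, not the repository's negate_sequence().
def negation_flags(text):
    """Return (word, negated) pairs; the flag toggles after each negation word."""
    negated = False
    result = []
    for word in text.split():
        lowered = word.lower()  # normalize case before any comparison
        result.append((lowered, negated))
        if lowered in ("no", "not") or lowered.endswith("n't"):
            negated = not negated
    return result

print(negation_flags("I DON'T like this movie"))
```

Here "DON'T" is lowercased to "don't" first, so the words after it are flagged as negated.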

@sm1th Can you submit a pull request for these problems? Thanks for the heads up on problem #2; for now I'll lowercase my input to the service.

@bfdill @vivekn Any chance this pull request will get merged?

@sm1th Any idea how to generate the trained pickle files, countdata and reduceddata.pickle?

Please help in running this code. 🙏

Kindly guide.

@mitend I'm sorry I can't help you. Consider opening another issue.