davmixcool/php-sentiment-analyzer

Negated positive words returning positive sentiments

Closed this issue · 2 comments

Hi.

I'm working on a project that relies on your library. I expect to receive negated input such as "Not good", "Not too good", "Not the best", and "Not bad". I'm not sure whether I'm doing something wrong, but these inputs return unexpected scores.

For instance, "Not good" returns the following score:

{"neg":0,"neu":0.256,"pos":0.744,"compound":0.4404}

"not good" returns this one:

{"neg":0.609,"neu":0.391,"pos":0,"compound":-0.1423}

"Not great":

{"neg":0,"neu":0.196,"pos":0.804,"compound":0.6249}

And "not great":

{"neg":0.656,"neu":0.344,"pos":0,"compound":-0.2283}

Clearly this has something to do with the capitalization of the word "not". However, I expect these negated positive words to return a negative sentiment regardless of capitalization, since it is grammatically correct to start a sentence with a capital letter.

Is this configurable somewhere, or is it part of the engine? I.e., can I, as a user of the library, do something about it?

Thanks in advance!

@davmixcool I dug into the source code a bit and discovered that Config::NEGATE contains only lowercase negators. This is why "Not good" does not behave the same way as "not good": "Not" is not detected as a negator, whereas "not" is. Is this by design, or is there room for improvement here? The simplest fix I can think of, which is also an isolated change, is to convert the word to lowercase inside the Analyzer::isNegated method.
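
To make the idea concrete, here is a minimal standalone sketch of a case-insensitive negator check. It is not the library's actual code: the NEGATE constant below is a short hypothetical stand-in for Config::NEGATE, and the free function isNegated only mirrors the name of Analyzer::isNegated. The real method may take other parameters and do more than a list lookup.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for Config::NEGATE (assumption: the real list is longer).
const NEGATE = ['not', 'never', 'no', 'cannot', "isn't", "don't"];

/**
 * Returns true when $word is a known negator, ignoring letter case.
 * The token is lowercased first, so "Not" and "NOT" match "not".
 */
function isNegated(string $word, array $negators = NEGATE): bool
{
    // strtolower() is sufficient here because the negators are plain ASCII.
    $normalized = strtolower($word);

    return in_array($normalized, $negators, true);
}

var_dump(isNegated('not'));  // bool(true)
var_dump(isNegated('Not'));  // bool(true)
var_dump(isNegated('good')); // bool(false)
```

Lowercasing only inside the check keeps the change isolated: the original token is left untouched, so any other logic that depends on capitalization elsewhere in the analyzer would still see the original text.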

There's room for improvement here. You can send a pull request for this fix. Thanks for pointing this out.