vivekn/sentiment

"oh my god i love brazil" is considered negative with high confidence

Opened this issue · 6 comments

I entered "oh my god i love brazil" into the box on your site, and the result was unexpected:
Result: Negative
Confidence Level: 99.8203

Where did you put the text "oh my god i love brazil" in order to test it?

Thanks

I think that in the training dataset, "oh my god" occurred mostly in negative contexts, which leads to this result. Also, it's surprising to see higher accuracy for trigrams than for unigrams.

That's correct: "oh my god" has a much greater negative weight than the positive weight of "love" in the training set. That said, this model works better on longer sequences of text and doesn't do that well on short phrases.
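
To make the weighting argument concrete, here is a minimal sketch of how summed n-gram log-odds can let one trigram outvote a positive word. It assumes a Naive Bayes style scorer over unigrams, bigrams and trigrams; the `LOG_ODDS` table, the function names and every number in it are invented for illustration and are not taken from this project's trained model.

```python
import math

# Hypothetical per-feature log-odds: log P(f | positive) - log P(f | negative).
# These values are made up for illustration only.
LOG_ODDS = {
    "love": 2.0,
    "brazil": 0.1,
    "oh my god": -8.5,
}


def extract_features(text):
    """Return all unigrams, bigrams and trigrams of the lowercased text."""
    tokens = text.lower().split()
    features = []
    for n in (1, 2, 3):
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features


def classify(text, prior_log_odds=0.0):
    """Sum the log-odds of known features and convert the total to a confidence."""
    score = prior_log_odds + sum(
        LOG_ODDS.get(feature, 0.0) for feature in extract_features(text)
    )
    p_positive = 1.0 / (1.0 + math.exp(-score))
    label = "Positive" if score >= 0 else "Negative"
    confidence = max(p_positive, 1.0 - p_positive) * 100
    return label, confidence


print(classify("oh my god i love brazil"))
print(classify("i love brazil"))
```

With these made-up weights the short input has only three known features, so the single large negative trigram dominates the sum. A longer text would add more features and dilute it, which matches the remark that the model does better on longer sequences.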

Thanks for your replies. To my mind, "oh my god" might only have a negative meaning when used by itself. In conjunction with other phrases it works like a sentiment amplifier.
I mostly deal with short comments in my work, so could you recommend any APIs that work better on short pieces of text?
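
To illustrate the "sentiment amplifier" idea, here is a rough sketch: remove the interjection, score the rest of the comment, then scale that score. Everything here is an assumption made for illustration (the `INTENSIFIERS` list, the `WORD_SCORES` values, the `BOOST` factor and the function name); it is not how this project works.

```python
# Hypothetical phrases treated as amplifiers rather than as sentiment carriers.
INTENSIFIERS = ("oh my god",)

# Invented word-level sentiment scores, for illustration only.
WORD_SCORES = {"love": 2.0, "hate": -2.0}

# Assumed multiplier applied once per amplifier phrase found.
BOOST = 1.5


def score_with_amplifier(text):
    """Score the text without its amplifier phrases, then scale the result."""
    remainder = text.lower()
    boost = 1.0
    for phrase in INTENSIFIERS:
        if phrase in remainder:
            remainder = remainder.replace(phrase, " ")
            boost *= BOOST
    base = sum(WORD_SCORES.get(token, 0.0) for token in remainder.split())
    return base * boost


print(score_with_amplifier("oh my god i love brazil"))   # positive score
print(score_with_amplifier("oh my god i hate mondays"))  # negative score
```

When the phrase appears by itself, this sketch returns 0.0 because nothing is left to amplify; whether that case should fall back to a negative score is left open.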

If I enter ordinary text, it also says it's positive with a 100% confidence level,
so this is bullshit for NEUTRAL statements.