ChatSentimentAnalysis

Sentiment analysis of chat data including text, smileys, emojis and images (gifs), with the added difficulty of sarcasm.

Emoji sentiment using Emoji Sentiment Ranking.

Image sentiment using C3D, a 3D-CNN.

Text sentiment using DeepMoji finetuned on SS-YouTube and SS-Twitter.

Table of Contents

- Installation
- SentimentAnalysis.py
- Emoji
- Image
- Text
- Performance
- FAQ
- Support
- Additional Notes

Installation

Download DeepMoji SS-Twitter Keras Model and DeepMoji SS-YouTube Keras Model and place them in SentimentAnalysis/Text/sentiment/finetuned.
Download DeepMoji Weights and place them in SentimentAnalysis/Text/model.
Download C3D Sentiment Model and place it in SentimentAnalysis/Image.

SentimentAnalysis.py

See testSentimentAnalysis.py for an example.
Main implementation of sentiment analysis.

Emoji

See testEmojiSentiment.py for an example.
EmojiSentiment.py: Extract sentiment from emojis.
config.py: Contains emoji to sentiment mappings.
build: Build files used to generate emoji sentiment mappings.

Image

See testImageSentiment.py for an example.
ImageSentiment.py: Extract sentiment from images (gifs).
training: Files related to training the classification model.

Text

See testTextSentiment.py for an example.
Contains a modified version of the DeepMoji Python 3 repo.
sentiment/TextSentiment.py: Extract sentiment from text and smileys.
sentiment/build: Example of manually entered training data for finetuning.
sentiment/finetuned: Finetuned Keras models used for classification.

Performance

Performance of the model was tested on 100 tweets containing emojis from this dataset.
This paper showed that emoticon blocking (using the emoji sentiment as the overall sentiment indicator for a sentence) is an effective method of sentiment detection.
This was tested on the same 100 tweets and the results can be observed below.

Emoticon Blocking

Text Only

Emoticon blocking appears to perform better on this small dataset. This suggests, but does not confirm, that it would also perform better on a larger dataset.
Emoticon blocking also handles sarcasm. For example, I hate it when you do that 😉 is considered positive overall, whereas it would be classified as negative if only the text was considered.
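The idea of emoticon blocking can be illustrated with a minimal sketch. It is not the code in SentimentAnalysis.py: the function name, the score range of -1 to 1, and the averaging of several emoji scores are all assumptions made for illustration, and the example scores are invented.

```python
def emoticon_blocking(text_score, emoji_scores):
    """Return the overall score, letting emojis override the text.

    Assumes scores lie in [-1, 1] (negative to positive). Averaging the
    emoji scores is an assumption, not the project's documented rule.
    """
    if emoji_scores:
        # Emojis present: their sentiment "blocks" the text sentiment.
        return sum(emoji_scores) / len(emoji_scores)
    # No scored emojis: fall back to the text sentiment.
    return text_score


# Hypothetical scores for "I hate it when you do that 😉":
# the text alone looks negative, the emoji looks positive.
overall = emoticon_blocking(text_score=-0.8, emoji_scores=[0.5])
print("positive" if overall > 0 else "negative")
```

With these invented scores the sentence is labelled positive, matching the sarcasm example above.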

C3D Image Sentiment Model

5000 images were used, with 2500 of each class.
The model was trained and evaluated on this balanced data set using a training/validation split of 70/30.
Below are examples of the top negative and top positive images from the validation data.
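A 70/30 split of a balanced data set could look roughly like the sketch below. The actual split code lives in the training files and may differ. Splitting each class separately (so both sets stay balanced), the function name and the fixed seed are assumptions.

```python
import random


def balanced_split(negative, positive, train_fraction=0.7, seed=0):
    """Split two equally sized classes into train and validation sets.

    Each class is shuffled and split on its own, so the class balance
    is preserved in both sets (an assumption about the original setup).
    """
    rng = random.Random(seed)
    train, validation = [], []
    for label, items in (("negative", negative), ("positive", positive)):
        items = list(items)
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train += [(item, label) for item in items[:cut]]
        validation += [(item, label) for item in items[cut:]]
    return train, validation


# Stand-in IDs instead of real image files: 2500 per class.
train, validation = balanced_split(range(2500), range(2500, 5000))
print(len(train), len(validation))
```

For 2500 images per class this gives 3500 training and 1500 validation images.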

Top Negative

Top Positive

FAQ

How is sentiment calculated?
- Text, emojis and images are extracted from a chat sentence like the example given below.
- we lost 😒 😅 😛 <img>https://media.giphy.com/media/2rtQMJvhzOnRe/giphy.gif</img>
- Sentiment for each is calculated and a score is returned based on the rules in SentimentAnalysis.calculate_scores().
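The extraction step can be sketched as follows. This is an illustration only, not the project's parser: the function name and the regular expressions are assumptions, and the emoji pattern covers only some common Unicode ranges.

```python
import re

# Image links are wrapped in <img>...</img>, as in the example above.
IMG_TAG = re.compile(r"<img>(.*?)</img>")
# Rough emoji ranges (an assumption; not a complete emoji definition).
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def split_chat_sentence(sentence):
    """Return (text, emojis, image_urls) for one chat sentence."""
    images = IMG_TAG.findall(sentence)
    without_images = IMG_TAG.sub(" ", sentence)
    emojis = EMOJI.findall(without_images)
    # Whatever is left, including smileys such as :), is treated as text.
    text = " ".join(EMOJI.sub(" ", without_images).split())
    return text, emojis, images


text, emojis, images = split_chat_sentence(
    "we lost 😒 😅 😛 "
    "<img>https://media.giphy.com/media/2rtQMJvhzOnRe/giphy.gif</img>"
)
print(text, emojis, images)
```

Each of the three parts would then be passed to its own sentiment model before the scores are combined.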
How was the image model trained?
- The C3D Model was finetuned using images from GIPHY.
- Labelled GIPHY images were obtained from GIFGIF.
- An R script to extract the links from JSON has been included in this project.
- Text files are generated containing links to the images which can be downloaded from terminal using cat file_name.txt | parallel --gnu "wget {}".
- Or the exact files used for training and validation can be downloaded here.
Why is there no sentiment score for some emojis?
- In my opinion, not all emojis are good indicators of sentiment.
- Only emojis with obvious indicators of sentiment such as facial expressions, popular symbols and hand signs were used.
But what about sarcasm detection?
- DeepMoji has learned to understand emotions and sarcasm based on millions of emojis.
- Whether the text contains sarcasm or not is irrelevant; the features extracted using DeepMoji still accurately represent the emotions in the text. These features are used when finetuning new models.
- Using emoticon blocking also helps to calculate the actual sentiment in cases of sarcasm e.g. I hate it when you do that 😉 is actually positive and contains a positive emoji but negative text. Emoticon blocking considers the emoji as the overall sentiment which would be correct in this case.
Why was the model only evaluated on combinations of emojis and text?
- The sentiment of images is only used if there is no sentiment available for emojis or text in a sentence. The accuracy of this is the same as when the image model was evaluated individually.
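The fallback order described in the last answer can be written as a short sketch. The actual rules are in SentimentAnalysis.calculate_scores() and are not reproduced here. The function name, the use of None for "no sentiment available", and preferring emojis over text when both exist (emoticon blocking) are assumptions.

```python
def overall_sentiment(emoji_score=None, text_score=None, image_score=None):
    """Pick one score per sentence; None means no sentiment available."""
    if emoji_score is not None:
        return emoji_score  # emoticon blocking: emojis take priority
    if text_score is not None:
        return text_score
    # Images are consulted only when neither emojis nor text gave a score.
    return image_score


# Hypothetical scores: only the gif carries sentiment in this sentence.
print(overall_sentiment(image_score=-0.4))
```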

Support

Email: oisin097@hotmail.com

Additional Notes

Sarcasm could be tackled directly using DeepMoji. The DeepMoji paper reports that the model was finetuned on SCv2-GEN to perform sarcasm detection at 75% accuracy. Text could first be passed through this model to detect sarcasm, and the sarcastic sentences could then be passed to another model finetuned on sentiment-labelled sentences containing sarcasm.
In my opinion, this approach would not be very accurate, because the errors of the two classification steps compound (is this sarcastic? -> is this positive/negative?) (75% x ??%), which would most likely give a poor result. There is also an increase in computational overhead, which reinforces the view that this trade-off would most likely not be worthwhile.
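The compounding effect can be made concrete with a small calculation. It assumes the two classifiers make independent errors and that a sentence is handled correctly only if both steps are correct. The second-stage accuracies below are hypothetical placeholders for the unknown ??% and are not measured results.

```python
def two_stage_accuracy(sarcasm_accuracy, sentiment_accuracy):
    """Chance that both steps are right, assuming independent errors."""
    return sarcasm_accuracy * sentiment_accuracy


SARCASM_ACCURACY = 0.75  # figure quoted above for SCv2-GEN

# Hypothetical second-stage accuracies, for illustration only.
for sentiment_accuracy in (0.70, 0.80, 0.90):
    combined = two_stage_accuracy(SARCASM_ACCURACY, sentiment_accuracy)
    print(f"{sentiment_accuracy:.0%} second stage -> {combined:.1%} combined")
```

Under these assumptions the combined accuracy can never exceed the 75% of the first step, whatever the second model achieves.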
