This is a CS4248 Project done by Team 15.
Every second, an average of 6,000 tweets are posted on Twitter, many of which express some form of emotion. We aim to accurately determine the emotions embedded in these tweets using natural language processing (NLP) techniques and to generalise the emotional sentiments attached to different topics. Using this model, we aim to design a tool that generates sentiment reports that could be useful to researchers. To that end, we conceptualised and implemented our own preprocessing methods and compared Support Vector Machine, Multinomial Naive Bayes, Random Forest and k-Nearest Neighbours classifiers to build, train and tune our model.
Please refer to our project report for the results of our analysis.
To run our model, the following prerequisites are needed:
First, clone our repository by running this command:
git clone https://github.com/grrrrnt/notion-emotion-twitter.git
Second, download the respective libraries by running this command in the root directory:
pip3 install -r requirement.txt
Lastly, with the data file in text_emotion.csv, run our models with this command:
python3 notion_emotion_twitter.py
- To compare the performance between models, uncomment line 384 of notion_emotion_twitter.py and comment out line 385 instead.
- To modify the model that is being run, refer to line 234 of notion_emotion_twitter.py and change it accordingly.
- To select different features, refer to line 238 of notion_emotion_twitter.py and change it accordingly.
- To test the RF model on an unseen dataset, uncomment line 385 of notion_emotion_twitter.py and comment out line 384 instead.
- To change the unseen dataset, refer to line 355 of notion_emotion_twitter.py and change it accordingly.
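For readers who want a feel for what comparing the four classifiers involves before opening the script, here is a minimal, self-contained sketch using scikit-learn. It is not the code in notion_emotion_twitter.py: the names `TWEETS`, `LABELS`, `CLASSIFIERS` and `compare_classifiers` are hypothetical, the toy tweets stand in for text_emotion.csv, plain TF-IDF features are assumed in place of our own preprocessing, and `LinearSVC` is assumed as the Support Vector Machine variant.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in data (hypothetical); the real script reads text_emotion.csv.
TWEETS = [
    "i love this so much", "what a wonderful happy day",
    "feeling great and happy", "so happy with my friends",
    "best day ever love it", "this is wonderful news",
    "i hate this so much", "what a terrible sad day",
    "feeling awful and sad", "so sad without my friends",
    "worst day ever hate it", "this is terrible news",
]
LABELS = ["happiness"] * 6 + ["sadness"] * 6

# Hypothetical registry of the four classifier families named in this README.
CLASSIFIERS = {
    "svm": LinearSVC(),
    "mnb": MultinomialNB(),
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=3),
}


def compare_classifiers(texts, labels, folds=3):
    """Return the mean cross-validated accuracy for each classifier."""
    scores = {}
    for name, clf in CLASSIFIERS.items():
        # Assumption: TF-IDF features; the project uses its own preprocessing.
        pipeline = make_pipeline(TfidfVectorizer(), clf)
        scores[name] = cross_val_score(pipeline, texts, labels, cv=folds).mean()
    return scores


if __name__ == "__main__":
    for name, accuracy in sorted(compare_classifiers(TWEETS, LABELS).items()):
        print(f"{name}: {accuracy:.2f}")
```

Wrapping the vectoriser and the classifier in one pipeline means the TF-IDF vocabulary is refitted on each training fold, so the held-out fold does not leak into the features. Accuracies on a toy set this small say nothing about the real dataset; please refer to the project report for actual results.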
The datasets that we have used have been obtained from Kaggle: