Model Selection for Tweet Sentiment Analysis

This Repository Contains

A Jupyter notebook, student.ipynb, showing our tweet sentiment analysis
The dataset itself, obtained from brands-and-product-emotions
A PowerPoint presentation of the data

Questions

Which model will perform best?
How well will a model perform?
What can Android or iPhone improve in their products to reduce negative feedback?

Using the OSEMN Process

Obtain the data
Scrub the data
Explore the data
Model the data
Interpret the data
Reference
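
As an illustration of the "Scrub the data" step, the following is a minimal sketch of one way tweets could be cleaned before vectorizing. The function name `scrub_tweet` and the specific rules (lowercasing, dropping URLs, mentions, and punctuation while keeping hashtags) are assumptions for illustration, not necessarily what student.ipynb does.

```python
import re

# Patterns for the pieces of a tweet we assume carry no sentiment signal.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Applied after lowercasing: keep letters, digits, '#' and whitespace only.
NON_WORD_RE = re.compile(r"[^a-z0-9#\s]")


def scrub_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)  # note: also splits contractions like "don't"
    return " ".join(text.split())  # collapse repeated whitespace


cleaned = scrub_tweet("@mention Loving the new #iPad app!! http://example.com/x")
print(cleaned)  # loving the new #ipad app
```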

Results

Gradient Boosting Classifier

From the model:

Slightly overfitting the training data
Odd that the percentages for True Label (1) in the confusion matrix don't add up to 100%
The best model overall at 79% accuracy
Successfully predicting 64% and 82% of Negatives and Positives respectively
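
The per-class percentages above are the diagonal of a row-normalized confusion matrix, and each row of such a matrix should sum to 100%. The sketch below uses made-up labels (not the project's data) to show the calculation and a row-sum sanity check. If a plotted row does not sum to 100%, the normalization axis or display rounding are worth checking.

```python
from collections import Counter


def confusion_matrix_rows(y_true, y_pred, labels):
    """Count matrix with one row per true label, one column per predicted label."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def normalize_rows(matrix):
    """Divide each row by its total so every row sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Illustrative labels only: 0 = negative, 1 = positive.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
labels = [0, 1]

matrix = confusion_matrix_rows(y_true, y_pred, labels)
rates = normalize_rows(matrix)

# Per-class recall is the diagonal of the row-normalized matrix.
recall = {label: rates[i][i] for i, label in enumerate(labels)}

for label, row in zip(labels, rates):
    print(label, [f"{v:.0%}" for v in row], "row sum:", round(sum(row), 6))
```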

Stacking Classifier

From the model:

Overfitting the training data at 98% training accuracy
Predicting 56% and 91% of negatives and positives respectively
86% accuracy on the testing data
Still struggling to predict the negative emotion; only slightly better than flipping a coin in that respect
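
The train/test gap and per-class recall described above can be measured as in the following sketch. It assumes scikit-learn's `StackingClassifier`. The base estimators, the logistic-regression final estimator, and the synthetic imbalanced data are all placeholders chosen for illustration, since this README does not state which models were stacked.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced two-class data standing in for vectorized tweets.
X, y = make_classification(
    n_samples=600, n_features=20, weights=[0.3, 0.7], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed base models; the final estimator learns from their
# cross-validated predictions.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)
stack.fit(X_train, y_train)

test_pred = stack.predict(X_test)
train_acc = accuracy_score(y_train, stack.predict(X_train))
test_acc = accuracy_score(y_test, test_pred)
# average=None returns one recall value per class, in sorted label order.
per_class_recall = recall_score(y_test, test_pred, average=None)

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
print(f"recall (class 0, class 1): {per_class_recall}")
```

A large difference between the two accuracy figures is the overfitting signal, and a low recall on the minority class mirrors the difficulty with negative tweets.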

Recommendations

Which model will perform best?

Gradient Boosting for balanced recall, and stacking for the highest overall accuracy
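
The trade-off can be made concrete with the per-class recall figures reported under Results. The short calculation below takes only those four numbers and computes the unweighted mean recall and the gap between classes. The helper name `summarize` is introduced here for illustration.

```python
# Per-class recall (in percent) as reported in the Results section.
recalls = {
    "Gradient Boosting": {"negative": 64, "positive": 82},
    "Stacking": {"negative": 56, "positive": 91},
}


def summarize(per_class):
    """Return the unweighted mean recall and the spread between classes."""
    values = list(per_class.values())
    return {
        "macro_recall": sum(values) / len(values),
        "gap": max(values) - min(values),
    }


summary = {name: summarize(r) for name, r in recalls.items()}
for name, s in summary.items():
    print(f"{name}: macro recall {s['macro_recall']:.1f}%, gap {s['gap']} points")
# Gradient Boosting: macro recall 73.0%, gap 18 points
# Stacking: macro recall 73.5%, gap 35 points
```

The two models have nearly the same mean recall. Gradient Boosting spreads it more evenly across classes, while Stacking trades negative-class recall for positive-class recall.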

How well will a model perform?

Overall Testing accuracy

Gradient Boosting: 76%

Stacking Classifier: 86%

What can Android or iPhone improve in their products to reduce negative feedback?

Android

Contests

Events

iPhone

Interface

Ease of use

Next Steps

Build an sklearn pipeline and run a grid search over the tokenizer and vectorizer parameters along with the parameters for each of the classifiers.
Build neural networks for the data, using Oscar for hyperparameter tuning
More tuning on the above models
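
The first next step could look roughly like the sketch below, which assumes scikit-learn's `Pipeline` and `GridSearchCV`. The toy tweets, the choice of `TfidfVectorizer` with `LogisticRegression`, and the parameter grid are all illustrative assumptions, not the project's actual configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus: 1 = positive, 0 = negative.
tweets = [
    "love the new iphone interface",
    "great android event today",
    "the app is so easy to use",
    "awesome contest from android",
    "iphone battery is terrible",
    "hate the confusing interface",
    "android app keeps crashing",
    "worst event ever so boring",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

pipe = Pipeline(
    [
        ("tfidf", TfidfVectorizer()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)

# Keys use the "<step name>__<parameter>" convention, so vectorizer and
# classifier settings are tuned together in one search.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "tfidf__sublinear_tf": [True, False],
    "clf__C": [0.1, 1.0, 10.0],
}

# balanced_accuracy averages recall over classes, which suits imbalanced labels.
search = GridSearchCV(pipe, param_grid, cv=2, scoring="balanced_accuracy")
search.fit(tweets, labels)

print(search.best_params_)
print(f"best cross-validated balanced accuracy: {search.best_score_:.2f}")
```

Putting the vectorizer inside the pipeline means it is refit on each training fold, so vocabulary statistics from the validation fold do not leak into the search.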

Repository Structure

|   presentation.pdf
|   README.md
|   student.ipynb
|   readme.ipynb
|
+--- data
|       tweet.csv
|
+--- images
|       phone.jpg
|
+--- styles
|       custom.css
|
\--- md
        student.md
        .png files of all graphs

skelouse/tweet-sentiment-analysis
