- A Jupyter notebook, `student.ipynb`, showing our tweet sentiment analysis
- The dataset itself, obtained from brands-and-product-emotions
- A PowerPoint presentation of the data
- Which model will perform best?
- How well will a model perform?
- What can Android and iPhone products improve to reduce negative feedback?
- Obtain the data
- Scrub the data
- Explore the data
- Model the data
- Interpret the data
- Reference
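The obtain and scrub steps above might look like the following sketch; the column names (`tweet_text`, `emotion`), label strings, and inline rows are assumptions for illustration, not the dataset's actual schema:

```python
import pandas as pd

def scrub(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing text, keep only positive/negative tweets,
    and map the emotion label to a binary target."""
    df = df.dropna(subset=["tweet_text"])
    df = df[df["emotion"].isin(["Positive emotion", "Negative emotion"])]
    return df.assign(target=(df["emotion"] == "Positive emotion").astype(int))

# Tiny inline example standing in for pd.read_csv("data/tweet.csv")
raw = pd.DataFrame({
    "tweet_text": ["Love the new iPhone!", "Battery life is awful", None],
    "emotion": ["Positive emotion", "Negative emotion", "No emotion"],
})
clean = scrub(raw)
```

In the real notebook the same `scrub` step would run on the full `data/tweet.csv` file.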
- From the model:
- Slightly overfitting the training data
- The True Label (1) row of the confusion matrix does not sum to exactly 100%, likely a rounding artifact
- The best model overall at 79% accuracy
- Successfully predicting 64% of negatives and 82% of positives
- From the model:
- Overfitting the training data (98% training accuracy)
- Predicting 56% of negatives and 91% of positives
- 86% accuracy on the testing data
- Still struggling to predict the negative emotion, only slightly better than a coin flip on that class
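The per-class percentages quoted above are recall values, typically read off a row-normalized confusion matrix; a small sketch with made-up labels shows how they are computed, and why a displayed row can miss 100% once the entries are rounded:

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration: 0 = negative, 1 = positive
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 1, 1, 0, 1, 1, 1, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
# Dividing each row by its true-label count yields per-class recall;
# each row sums to 1 exactly, but rounded percentages may not hit 100%.
recall = cm.diagonal() / cm.sum(axis=1)
```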
- Which model will perform best?
- Gradient Boosting for balanced recall, and Stacking for the highest overall accuracy
- How well will a model perform?
- Overall Testing accuracy
- Gradient Boosting: 76%
- Stacking Classifier: 86%
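A stacking classifier along these lines can be sketched with sklearn's `StackingClassifier`; the toy texts, choice of base estimators, and `cv=2` here are illustrative assumptions, not the notebook's actual setup:

```python
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the tweet texts and binary sentiment labels
texts = ["love the camera", "great battery life", "so easy to use",
         "hate the update", "terrible interface", "battery is awful"]
labels = [1, 1, 1, 0, 0, 0]

# Base learners feed their cross-validated predictions to a final estimator
stack = StackingClassifier(
    estimators=[("gb", GradientBoostingClassifier(random_state=0)),
                ("rf", RandomForestClassifier(random_state=0))],
    final_estimator=LogisticRegression(),
    cv=2,  # tiny toy data; the real notebook would use more folds
)
model = make_pipeline(TfidfVectorizer(), stack)
model.fit(texts, labels)
preds = model.predict(texts)
```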
- What can Android and iPhone products improve to reduce negative feedback?
- Android
- Contests
- Events
- iPhone
- Interface
- Ease of use
- Build an sklearn pipeline and grid search over the tokenizer and vectorizer parameters along with the parameters for each classifier
- Build neural networks for the data, using Oscar for hyperparameter tuning
- More tuning on the above models
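The planned pipeline-plus-grid-search could take a shape like this sketch; the parameter grid, step names, and toy data are assumptions for illustration:

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy stand-ins for the tweet texts and binary sentiment labels
texts = ["love the camera", "great battery life", "so easy to use",
         "really fun app", "hate the update", "terrible interface",
         "battery is awful", "worst phone ever"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", GradientBoostingClassifier(random_state=0)),
])

# Step-name prefixes let one grid tune the vectorizer and classifier together
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__n_estimators": [25, 50],
}
search = GridSearchCV(pipe, param_grid, cv=2)
search.fit(texts, labels)
```

The same pattern extends to tokenizer choices (via `TfidfVectorizer(tokenizer=...)`) and to each classifier's own hyperparameters.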
```
├── data
│   └── tweet.csv
├── images
│   └── phone.jpg
├── presentation.pdf
├── README.md
├── student.ipynb
├── readme.ipynb
├── styles
│   └── custom.css
└── md
    ├── student.md
    └── .png files of all graphs
```