This repository contains the code and documentation for a data wrangling group project on the Kaggle dataset Netflix Movies and TV Shows (https://www.kaggle.com/datasets/shivamb/netflix-shows).
Team Members
- Ansh Gupta
- Bo Qin
- Chinaza Nmam
- Digvijay Yadav
- Era Wu
Our Approaches
- Cleaning the dataset using Microsoft Excel
- Generating word clouds
- Sentiment analysis
- Developing a model for genre prediction
Libraries Used
- tidytext
- stringr
- readxl
- wordcloud
- wordcloud2
- RColorBrewer
- dplyr
- tidyr
- ggplot2
- TensorFlow
- scikit-learn (sklearn)
- math
- re
- NLTK
- seaborn, pandas, NumPy, tqdm
- syuzhet (for the get_nrc_sentiment function)
Languages Used
- R - Used for sentiment analysis and word cloud generation.
- Python 3 - Used for genre prediction.
- Microsoft Excel - Primary tool for cleaning the Netflix data.
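
As a rough illustration of the data preparation that genre prediction in Python typically needs, the sketch below cleans description text with re and turns the comma-separated genre string into multi-hot labels. It is a minimal sketch under assumptions, not the project's actual pipeline: the column names `description` and `listed_in` follow the Kaggle CSV, while the sample rows and the helper names `clean_text`, `split_genres`, and `build_label_matrix` are hypothetical.

```python
import re

# Hypothetical sample rows; the column names follow the Kaggle CSV
# (`description`, `listed_in`), the values are made up for illustration.
rows = [
    {"description": "A detective hunts a killer in Berlin.",
     "listed_in": "Crime TV Shows, International TV Shows"},
    {"description": "Two friends open a bakery & fall in love!",
     "listed_in": "Comedies, Romantic Movies"},
]

def clean_text(text):
    """Lowercase the text and keep only letters separated by single spaces."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def split_genres(listed_in):
    """Split the comma-separated genre string into a list of labels."""
    return [g.strip() for g in listed_in.split(",") if g.strip()]

def build_label_matrix(rows):
    """Return (sorted genre vocabulary, one multi-hot row per title)."""
    genres = sorted({g for r in rows for g in split_genres(r["listed_in"])})
    index = {g: i for i, g in enumerate(genres)}
    matrix = []
    for r in rows:
        vec = [0] * len(genres)
        for g in split_genres(r["listed_in"]):
            vec[index[g]] = 1
        matrix.append(vec)
    return genres, matrix

texts = [clean_text(r["description"]) for r in rows]
genres, labels = build_label_matrix(rows)
```

The cleaned texts and the label matrix could then be passed to a text vectorizer and a multi-label classifier; which model the project actually used is not stated here.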