
This repository contains the code and documentation for a data wrangling group project on the Kaggle dataset Netflix Movies and TV Shows.

Team Members

  1. Ansh Gupta
  2. Bo Qin
  3. Chinaza Nmam
  4. Digvijay Yadav
  5. Era Wu

Our Approaches

  1. Cleaning the dataset using Microsoft Excel
  2. Generating word clouds
  3. Sentiment analysis
  4. Developing a model for genre prediction

Libraries Used

  1. tidytext
  2. stringr
  3. readxl
  4. wordcloud
  5. wordcloud2
  6. RColorBrewer
  7. dplyr
  8. tidyr
  9. ggplot2
  10. TensorFlow
  11. scikit-learn (sklearn)
  12. math
  13. re
  14. NLTK
  15. seaborn, pandas, NumPy, tqdm
  16. syuzhet (for the get_nrc_sentiment function)

Languages and Tools Used

  1. R - Used for sentiment analysis and word cloud generation.
  2. Python 3 - Used for genre prediction.
  3. Microsoft Excel - Primary tool for cleaning the Netflix data.
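
Word clouds and text-based genre prediction both start from word counts over the title descriptions. The sketch below is a minimal illustration of that preprocessing step in Python, using only the standard library (`re` and `collections`). It is not the project's actual code: the project built its word clouds in R, and the function names, stop-word list, and sample descriptions here are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only;
# a real pipeline would more likely use a library list such as NLTK's.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "his", "her", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


def word_frequencies(descriptions):
    """Count how often each token appears across all descriptions."""
    counts = Counter()
    for description in descriptions:
        counts.update(tokenize(description))
    return counts


# Made-up example descriptions, not taken from the dataset.
sample_descriptions = [
    "A detective hunts a killer in the city.",
    "The detective and his partner chase a thief.",
]

frequencies = word_frequencies(sample_descriptions)
print(frequencies.most_common(3))
```

The resulting counts could feed a word cloud (word size proportional to frequency) or serve as simple bag-of-words features for a genre classifier.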