Data exploration and preprocessing on Netflix Dataset

Primary language: Jupyter Notebook

Netflix EDA

Libraries needed

  • Pandas
  • Numpy
  • Matplotlib
  • Seaborn

This notebook uses the 'Netflix titles' dataset, available at this link
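
As a rough illustration of the first step, the sketch below reads the CSV into a pandas DataFrame and prints its shape and missing-value counts. The helper `load_titles` and the in-memory demo rows are assumptions made for this example; in the notebook the source would be the path of the downloaded file.

```python
import io

import pandas as pd


def load_titles(source) -> pd.DataFrame:
    """Read the titles CSV from a file path or a file-like object (hypothetical helper)."""
    return pd.read_csv(source)


# Tiny in-memory stand-in for the real file, so the sketch runs without a download.
demo_csv = io.StringIO(
    "show_id,type,title,release_year\n"
    "s1,Movie,Title A,2020\n"
    "s2,TV Show,Title B,2021\n"
)

titles = load_titles(demo_csv)
print(titles.shape)         # number of rows and columns
print(titles.isna().sum())  # missing values per column
```

Passing a file path instead of `demo_csv` should work the same way, since `pd.read_csv` accepts both.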

About Dataset

This dataset contains unlabelled text data on around 9000 Netflix shows and movies, with details such as cast, release year, rating, and description.

Columns of dataset:

  • show_id: Unique ID for every movie / TV show
  • type: Identifier of the entry, either Movie or TV Show
  • title: Title of the movie / TV show
  • director: Director of the movie / TV show
  • cast: Actors involved in the movie / TV show
  • country: Country where the movie / TV show was produced
  • date_added: Date it was added to Netflix
  • release_year: Actual release year of the movie / TV show
  • rating: TV rating of the movie / TV show
  • duration: Total duration, in minutes or number of seasons
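
Because `duration` mixes minutes and seasons in one text column, a typical preprocessing step is to split it into a number and a unit. The sketch below shows one possible way to do that and to parse `date_added`. The sample rows, the helper `preprocess`, the new column names `duration_value` and `duration_unit`, and the date format `"%B %d, %Y"` are all assumptions made for illustration, not taken from the notebook.

```python
import pandas as pd

# Hypothetical sample rows that mimic some of the columns listed above.
sample = pd.DataFrame(
    {
        "show_id": ["s1", "s2", "s3"],
        "type": ["Movie", "TV Show", "Movie"],
        "title": ["Title A", "Title B", "Title C"],
        "date_added": ["September 25, 2021", " September 24, 2021", None],
        "duration": ["90 min", "2 Seasons", "125 min"],
    }
)


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with parsed dates and a numeric duration (hypothetical helper)."""
    out = df.copy()
    # Assumed date format "Month day, year"; anything else becomes NaT.
    out["date_added"] = pd.to_datetime(
        out["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
    )
    # Split strings such as "90 min" or "2 Seasons" into a number and a unit.
    parts = out["duration"].str.extract(r"(?P<value>\d+)\s*(?P<unit>\w+)")
    out["duration_value"] = pd.to_numeric(parts["value"])
    # Normalise the unit: lower case, and "seasons" becomes "season".
    out["duration_unit"] = parts["unit"].str.lower().str.rstrip("s")
    return out


clean = preprocess(sample)
print(clean[["title", "date_added", "duration_value", "duration_unit"]])
```

Keeping the unit in its own column makes it possible to compute duration statistics separately for movies (minutes) and TV shows (seasons).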

Example queries on the dataset

  • Extract min, max, mean, median, std of a column
  • Plot the top 10 genres and top 10 actors by number of appearances
  • Rating comparison
  • Duration statistics
  • ...
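
The queries listed above could be sketched with pandas roughly as follows. The sample rows and the helpers `column_summary`, `top_values` and `rating_by_type` are hypothetical. The genre column name `listed_in` is also an assumption, since no genre column appears in the column list above.

```python
import pandas as pd

# Hypothetical sample rows; "listed_in" is an assumed name for the genre column.
sample = pd.DataFrame(
    {
        "type": ["Movie", "TV Show", "Movie", "Movie"],
        "release_year": [2000, 2010, 2020, 2020],
        "rating": ["PG-13", "TV-MA", "TV-MA", "PG-13"],
        "cast": ["Actor A, Actor B", "Actor B", None, "Actor A, Actor C, Actor B"],
        "listed_in": ["Dramas, Comedies", "TV Dramas", "Dramas", "Comedies, Dramas"],
    }
)


def column_summary(df: pd.DataFrame, column: str) -> pd.Series:
    """Min, max, mean, median and sample standard deviation of a numeric column."""
    return df[column].agg(["min", "max", "mean", "median", "std"])


def top_values(df: pd.DataFrame, column: str, n: int = 10) -> pd.Series:
    """Count the comma-separated entries of a text column and keep the n most frequent."""
    values = df[column].dropna().str.split(",").explode().str.strip()
    return values.value_counts().head(n)


def rating_by_type(df: pd.DataFrame) -> pd.DataFrame:
    """Cross-tabulate ratings against type (Movie / TV Show)."""
    return pd.crosstab(df["rating"], df["type"])


year_stats = column_summary(sample, "release_year")
top_actors = top_values(sample, "cast")
top_genres = top_values(sample, "listed_in")
ratings = rating_by_type(sample)

print(year_stats)
print(top_actors)
print(top_genres)
print(ratings)
```

Each `top_values` result is a Series, so with Matplotlib installed it can be drawn as a bar chart with, for example, `top_actors.plot.barh()`.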