AlmaBetter Capstone Project 4 - Unsupervised: Netflix Movies and TV Shows Clustering

This project is part of the AlmaBetter Pro Program curriculum (Full Stack Data Science).

Project Status: [Completed]

Score: - [ 100 / 100 ]

Project Summary :

Problem Statement :

This dataset consists of TV shows and movies available on Netflix as of 2019. It was collected from Flixable, a third-party Netflix search engine.
In 2018, Flixable released a report showing that the number of TV shows on Netflix has nearly tripled since 2010, while the number of movies has decreased by more than 2,000 titles over the same period. It will be interesting to explore what other insights can be obtained from the same dataset.
Integrating this dataset with external datasets such as IMDb ratings and Rotten Tomatoes scores could also provide many interesting findings.

About the Data :

The data contains details of each title available on Netflix, such as its ID, type, director, cast, country of production, release year, rating, duration, genres and description.

Dataset info

  1. Number of records: 7787
  2. Number of features: 12

Features information:

The dataset contains features like:

  • show_id : Unique ID for every movie / TV show
  • type : A movie or TV show
  • title : Title of the movie / TV show
  • director : Director of the movie / show
  • cast : Actors involved in the movie / show
  • country : Country where the movie / show was produced
  • date_added : Date it was added on Netflix
  • release_year : Actual Release year of the movie / show
  • rating : TV Rating of the movie / show
  • duration : Total Duration - in minutes or number of seasons
  • listed_in : Genres
  • description : The summary description
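
Because duration holds either minutes or a number of seasons, it usually needs to be split into a number and a unit before analysis. The following is a minimal sketch, assuming the raw values look like "90 min" or "2 Seasons"; that string format and the helper name parse_duration are assumptions for illustration, not something this README specifies.

```python
import re


def parse_duration(raw):
    """Split a duration string into (value, unit).

    Assumes values look like "90 min" or "2 Seasons" (an assumed format).
    """
    match = re.fullmatch(r"\s*(\d+)\s*(min|seasons?)\s*", raw, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised duration: {raw!r}")
    value = int(match.group(1))
    # "min" marks a movie length; "Season"/"Seasons" marks a TV show length.
    unit = "minutes" if match.group(2).lower() == "min" else "seasons"
    return value, unit


print(parse_duration("90 min"))     # (90, 'minutes')
print(parse_duration("2 Seasons"))  # (2, 'seasons')
```

Keeping the unit alongside the number avoids mixing minutes and seasons in a single numeric column.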

Project Workflow

  1. Importing Libraries
  2. Loading the dataset
  3. Data Summary
  4. Data Cleaning & Data Analysis
  5. Feature selection
  6. Implementing different clustering methods
  7. Conclusion
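
Steps 5 and 6 of the workflow above can be sketched as follows. This README does not state which features or clustering methods were used, so TF-IDF on the description text and K-means from scikit-learn are assumptions here, and the sample descriptions are hypothetical stand-ins for the real column.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in rows; the real project would read the `description` column.
descriptions = [
    "A detective hunts a serial killer in a dark city.",
    "Two detectives investigate a murder in a small town.",
    "A family of animals goes on a funny adventure.",
    "Talking animals set off on a funny holiday adventure.",
]


def cluster_descriptions(texts, n_clusters=2, seed=0):
    """Vectorise texts with TF-IDF and group them with K-means."""
    vectorizer = TfidfVectorizer(stop_words="english")
    features = vectorizer.fit_transform(texts)  # sparse matrix, one row per text
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(features)  # one cluster label per text


labels = cluster_descriptions(descriptions)
for label, text in zip(labels, descriptions):
    print(label, text)
```

On the full dataset the number of clusters would need to be chosen with a diagnostic such as the elbow method or silhouette score rather than fixed at 2.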

Future Work

The clusters from this analysis could be used to build a recommendation system for Netflix movies and TV shows. Topic modelling is another possible extension.
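
One simple way to turn cluster labels into recommendations is to suggest other titles from the same cluster. The sketch below assumes a list of titles and a parallel list of cluster labels already exist; the function name recommend_from_cluster and the sample data are hypothetical and not part of the project.

```python
def recommend_from_cluster(title, titles, labels, top_n=3):
    """Return up to top_n other titles that share the given title's cluster."""
    if title not in titles:
        raise KeyError(f"unknown title: {title!r}")
    target = labels[titles.index(title)]
    same_cluster = [
        other for other, label in zip(titles, labels)
        if label == target and other != title
    ]
    return same_cluster[:top_n]


# Hypothetical titles and cluster labels, for illustration only.
titles = ["Show A", "Show B", "Movie C", "Movie D", "Show E"]
labels = [0, 1, 0, 1, 0]
print(recommend_from_cluster("Show A", titles, labels))  # ['Movie C', 'Show E']
```

A fuller system would likely rank titles within a cluster, for example by similarity to the chosen title, instead of returning them in dataset order.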

Miscellaneous :

  • Google Colab tools