Unsupervised machine learning

Primary language: Jupyter Notebook

Capstone_project_4

Project Name - NETFLIX MOVIES AND TV SHOWS CLUSTERING

Problem Statement -

Netflix is a popular streaming platform that offers a vast collection of movies and TV shows. With so many options available, it can be challenging for users to find content that matches their preferences. To address this issue, we aim to develop a recommendation system that suggests movies and TV shows to users based on their viewing history, ratings, and other attributes. The goal of this project is to design and implement a recommendation algorithm that accurately predicts users' content preferences and provides personalized recommendations. The effectiveness of the recommendation system will be evaluated using user engagement and satisfaction metrics. The results of this project can help Netflix improve user retention and increase customer satisfaction.

Data Description -

  • show_id : Unique ID for every Movie / TV Show

  • type : Identifier - A Movie or TV Show

  • title : Title of the Movie / TV Show

  • director : Director of the Movie

  • cast : Actors involved in the movie / show

  • country : Country where the movie / show was produced

  • date_added : Date it was added on Netflix

  • release_year : Actual release year of the movie / show

  • rating : TV Rating of the movie / show

  • duration : Total Duration - in minutes or number of seasons

  • listed_in : Genre
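
The columns above mix text, numbers, and combined values, so some parsing is usually needed before clustering. The snippet below is a minimal sketch of that step using only the Python standard library. The sample rows are made up, and the value formats (`93 min`, `2 Seasons`, comma-separated genres) are assumptions about how the data is stored, not taken from the project notebook. The helper names `parse_duration` and `parse_row` are hypothetical.

```python
import csv
import io

# Hypothetical two-row sample in the column layout listed above (values are made up).
SAMPLE_CSV = """show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in
s1,Movie,Example Movie,Jane Doe,"Actor A, Actor B",United States,"January 1, 2020",2019,TV-MA,93 min,"Dramas, International Movies"
s2,TV Show,Example Show,,"Actor C",India,"March 5, 2021",2021,TV-14,2 Seasons,"Kids' TV"
"""


def parse_duration(raw):
    """Split a duration such as '93 min' or '2 Seasons' into (value, unit).

    Assumes the format '<integer> <unit>'; anything that is not minutes
    is treated as a number of seasons.
    """
    value, unit = raw.strip().split(maxsplit=1)
    unit = "min" if unit.lower().startswith("min") else "seasons"
    return int(value), unit


def parse_row(row):
    """Return a copy of the row with a typed release year, duration, and genre list."""
    parsed = dict(row)
    parsed["release_year"] = int(row["release_year"])
    parsed["duration_value"], parsed["duration_unit"] = parse_duration(row["duration"])
    parsed["genres"] = [genre.strip() for genre in row["listed_in"].split(",")]
    return parsed


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for r in rows:
    print(r["title"], r["type"], r["duration_value"], r["duration_unit"], r["genres"])
```

Keeping the duration unit separate from its value matters because minutes and seasons are not comparable, so movies and TV shows would normally be analysed separately on this column.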

Conclusion

  • Based on the elbow method and silhouette scores, 26 clusters were found to be optimal, and K-Means performed better than hierarchical clustering at identifying clusters.

  • Cluster 3 had the highest number of data points, and the remaining data points were evenly distributed across the other clusters.

  • Regarding the content on Netflix, there are more movies than TV shows, with a total of 5372 movies and 2398 TV shows.

  • TV-MA is the most common rating for TV shows, indicating that they are mostly targeted at adult audiences.

  • The highest number of movies were released in 2017, 2018, and 2020.

  • The number of movies on Netflix is growing at a faster rate than the number of TV shows, with a significant increase in both movies and TV shows added after 2015. There was a drop in the number of movies and TV shows produced after 2020. Overall, the trend suggests that Netflix has focused more on increasing movie content than TV shows.

  • Content is added to Netflix mostly between October and January. Documentaries are the most popular genre, followed by stand-up comedy, dramas, and international movies. For TV shows, kids' shows are the most popular genre. Most movies on Netflix have a duration of between 50 and 150 minutes, and most TV shows consist of a single season.

  • Movies rated NC-17 have the longest average duration, while movies rated TV-Y have the shortest. The United States has the largest amount of content on Netflix, followed by India, which also has the highest number of movies on the platform.

  • Finally, it was found that 30% of the movies on Netflix were released directly on the platform, while 70% were added after being released elsewhere.
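
The model selection described in the first conclusion (elbow method, silhouette scores, K-Means versus hierarchical clustering) can be sketched roughly as follows. This is an illustration with scikit-learn on synthetic data, not the notebook's actual code: the feature matrix `X`, the range of candidate cluster counts, and the Ward linkage are all assumptions, and the best k found here has no relation to the 26 clusters reported above.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the project's feature matrix (assumption: the real
# notebook clusters vectorised title attributes, which are not shown here).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=42)

candidate_ks = list(range(2, 9))
inertias, kmeans_scores, hier_scores = [], [], []

for k in candidate_ks:
    # K-Means: record inertia for the elbow curve and the silhouette score.
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(km.inertia_)
    kmeans_scores.append(silhouette_score(X, km.labels_))

    # Hierarchical (agglomerative) clustering with the same k, for comparison.
    hier_labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X)
    hier_scores.append(silhouette_score(X, hier_labels))

# Pick the k with the highest K-Means silhouette score.
best_k = candidate_ks[int(np.argmax(kmeans_scores))]

# Refit with the chosen k and count the points in each cluster.
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=42).fit(X)
cluster_sizes = np.bincount(final_model.labels_, minlength=best_k)

print("best k by silhouette:", best_k)
print("cluster sizes:", cluster_sizes.tolist())
print("largest cluster:", int(np.argmax(cluster_sizes)))
```

Plotting `inertias` against `candidate_ks` gives the elbow curve, and comparing `kmeans_scores` with `hier_scores` at the same k is one way to judge which algorithm separates the data better.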