This project analyzes Netflix data to gain insights into the TV shows and movies available on the platform. The dataset contains information about each title, including its release year, description, genres, and ratings from sources such as IMDb and TMDB. The project covers exploratory data analysis (EDA), feature extraction, and predictive modeling.
The dataset is sourced from Kaggle and consists of two files:
- titles.csv: Contains information about the titles available on Netflix, including their ID, title, show type (TV show or movie), description, release year, age certification, genres, production countries, seasons (for TV shows), and ratings.
- credits.csv: Contains information about the cast and crew of the titles, including their IDs, names, character names, and roles.
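As a rough illustration of how the two files relate, the sketch below reads both CSVs with Python's standard library and attaches each title's cast and crew rows to it. It assumes both files share a title identifier column named `id`; that column name, and the helper names `load_rows` and `attach_credits`, are hypothetical and not taken from the project's notebooks.

```python
import csv
from collections import defaultdict


def load_rows(source):
    """Read a CSV (a file path or an open text file) into a list of dicts."""
    if hasattr(source, "read"):
        return list(csv.DictReader(source))
    with open(source, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def attach_credits(titles, credits, key="id"):
    """Group credit rows by title ID and attach them to each title row.

    `key` is the column assumed to be shared by both files (hypothetical
    default: "id"). Titles with no credits get an empty list.
    """
    by_title = defaultdict(list)
    for credit in credits:
        by_title[credit[key]].append(credit)
    return [{**title, "credits": by_title.get(title[key], [])} for title in titles]
```

With the files downloaded, usage would look like `attach_credits(load_rows("titles.csv"), load_rows("credits.csv"))`. In a notebook, loading both files with pandas and joining them with `merge` is the more common route; the standard-library version is used here only to keep the sketch dependency-free.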
The project is organized into the following directories and files:
To replicate the analysis and run the project:
- Clone the repository to your local machine:
  git clone https://github.com/gladyswambura/The-Netflix-Oracle-Predicting-Your-Next-Favorite-movie.git
- Explore the Jupyter notebooks in the notebooks/ directory for EDA, feature extraction, and predictive modeling tasks.
- Gladyswambura (@gladyswambura): Project lead.
- Code3 camp students
This project is licensed under the MIT License. See the LICENSE file for details.