FLIX-HUB is a movie recommendation system utilizing the Netflix dataset. It features comprehensive data preprocessing and analysis, generating personalized movie and TV show suggestions based on TF-IDF vectorization and cosine similarity. The project includes interactive visualizations for insights into content trends and distributions.

FLIX-HUB: An Advanced Movie Recommendation System Leveraging Netflix Data

Built with: Python, Pandas, NumPy, Matplotlib, Plotly, scikit-learn, Jupyter Notebook

Project Overview

This project performs a comprehensive analysis of Netflix data and implements a content-based recommendation system called FLIX-HUB. It includes data preprocessing, exploratory data analysis, advanced feature engineering, and a movie/TV show recommendation engine.

Features

  • Data cleaning and preprocessing
  • Exploratory data analysis with interactive visualizations
  • Text processing and feature engineering
  • Content-based recommendation system
  • Support for both movies and TV shows

Installation

To run this project, you need to have Python installed on your system. Then, follow these steps:

  1. Clone the repository:

    git clone https://github.com/yourusername/Recommendation-System.git
    
  2. Navigate to the project directory:

    cd Recommendation-System
    
  3. Install the required packages:

    pip install -r requirements.txt
    

Usage

To use the FLIX-HUB recommendation system:

  1. Run the Jupyter notebook (or Python script) so that `final_data` and `cosine_sim`, which the next step uses, are defined.
  2. Use the FlixHub class to get recommendations:

    flix_hub = FlixHub(final_data, cosine_sim)
    movies, tv_shows = flix_hub.recommendation('Movie Title', total_result=10, threshold=0.5)

    print('Similar Movie(s) list:')
    for movie in movies:
        print(movie)

    print('\nSimilar TV_show(s) list:')
    for tv_show in tv_shows:
        print(tv_show)

Data Preprocessing

The data preprocessing steps include:

  • Loading the Netflix dataset
  • Handling missing values
  • Cleaning text data (titles, descriptions, etc.)
  • Creating a bag of words representation

Exploratory Data Analysis

The EDA process includes various visualizations:

  • Distribution of content types (movies vs. TV shows)
  • Number of movies released each year
  • Top countries producing Netflix content
  • Movie ratings distribution
  • Word clouds for titles, descriptions, and genres
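
The aggregations behind the first three charts might look like the sketch below. The column names (`type`, `release_year`, `country`) and the sample rows are assumptions for illustration; each resulting Series can then be handed to Plotly or Matplotlib for plotting.

```python
import pandas as pd

# Hypothetical sample; the notebook would use the full Netflix dataset instead.
df = pd.DataFrame(
    {
        "type": ["Movie", "TV Show", "Movie", "Movie"],
        "release_year": [2019, 2020, 2020, 2020],
        "country": ["United States, India", "India", None, "United States"],
    }
)

# Movies vs. TV shows.
type_counts = df["type"].value_counts()

# Movies released per year, in chronological order.
movies_per_year = (
    df.loc[df["type"] == "Movie", "release_year"].value_counts().sort_index()
)

# A title can list several countries, so split and explode before counting.
top_countries = (
    df["country"].dropna().str.split(",").explode().str.strip().value_counts().head(10)
)

print(type_counts, movies_per_year, top_countries, sep="\n\n")
```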

Feature Engineering

Feature engineering turns the cleaned text into vectors that can be compared:

  • Text cleaning and normalization
  • TF-IDF vectorization
  • Cosine similarity calculation
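
As a rough sketch of the last two steps, scikit-learn's `TfidfVectorizer` and `cosine_similarity` can be combined as shown here. Cosine similarity is the dot product of two vectors divided by the product of their norms, so titles that share distinctive words score closer to 1 and titles with no words in common score 0. The sample strings and the `stop_words="english"` setting are assumptions, not necessarily what the notebook uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical cleaned strings, one per title.
documents = [
    "heist crew robbery thriller",
    "bank robbery crew thriller",
    "amateur bakers compete prize",
]

# Each row of tfidf_matrix is a sparse, L2-normalized TF-IDF vector.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf_matrix = vectorizer.fit_transform(documents)

# Pairwise cosine similarity: entry [i, j] compares title i with title j.
cosine_sim = cosine_similarity(tfidf_matrix)

print(cosine_sim.round(2))
```

The result is a square matrix with one row and one column per title, which is the shape a title-lookup recommender needs.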

Recommendation System

The FLIX-HUB recommendation system uses:

  • Content-based filtering
  • Cosine similarity for finding similar content
  • Separate recommendations for movies and TV shows
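
To make the idea concrete, here is one way such a recommender could be written. `FlixHubSketch` is a hypothetical stand-in, not the project's `FlixHub` class: it only mirrors the `recommendation(title, total_result, threshold)` call from the Usage section, and it assumes `title` and `type` columns, a case-insensitive title match, and that `total_result` caps each list separately. The similarity matrix is hand-written for illustration.

```python
import numpy as np
import pandas as pd


class FlixHubSketch:
    """Hypothetical stand-in for the project's FlixHub class."""

    def __init__(self, data: pd.DataFrame, similarity: np.ndarray):
        self.data = data.reset_index(drop=True)
        self.similarity = similarity

    def recommendation(self, title, total_result=5, threshold=0.5):
        matches = self.data.index[self.data["title"].str.lower() == title.lower()]
        if len(matches) == 0:
            return [], []  # Unknown title: nothing to recommend.
        idx = matches[0]
        # Similarity of every other title to the query title.
        scores = pd.Series(self.similarity[idx], index=self.data.index).drop(idx)
        # Keep titles at or above the threshold, most similar first.
        scores = scores[scores >= threshold].sort_values(ascending=False)
        ranked = self.data.loc[scores.index]
        movies = ranked.loc[ranked["type"] == "Movie", "title"]
        tv_shows = ranked.loc[ranked["type"] == "TV Show", "title"]
        return movies.head(total_result).tolist(), tv_shows.head(total_result).tolist()


final_data = pd.DataFrame(
    {
        "title": ["Heist Night", "Bank Job", "Crime Files", "Baking Wars"],
        "type": ["Movie", "Movie", "TV Show", "TV Show"],
    }
)
# Hand-written symmetric similarity matrix, for illustration only.
cosine_sim = np.array(
    [
        [1.0, 0.8, 0.6, 0.1],
        [0.8, 1.0, 0.4, 0.0],
        [0.6, 0.4, 1.0, 0.2],
        [0.1, 0.0, 0.2, 1.0],
    ]
)

flix_hub = FlixHubSketch(final_data, cosine_sim)
movies, tv_shows = flix_hub.recommendation("Heist Night", total_result=10, threshold=0.5)
print(movies, tv_shows)
```

Dropping the query title's own row before ranking prevents it from appearing as its own top match, since every title has similarity 1 with itself.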

Results

The project provides insights into Netflix's content library and recommends similar movies and TV shows for a title supplied by the user. The analysis covers:

  • Distribution of movies vs. TV shows
  • Trends in content production over the years
  • Popular genres and themes

Contributing

Contributions to this project are welcome. Please fork the repository and submit a pull request with your changes.

License

This project is licensed under the MIT License - see the LICENSE file for details.