Movie Recommendation System

Overview

This repository contains the code and data for a movie recommendation system. I designed the system to recommend movies based on user preferences and movie attributes. This README gives an overview of the data, the preprocessing steps, and the structure of the code.

Data

The data used in this project consists of two main datasets: credits.csv and movies.csv. Here is some basic information about these datasets:

credits.csv: Contains information about the cast and crew of each movie.
- Shape: (4803, 4)
- Columns: 'movie_id', 'title', 'cast', 'crew'
movies.csv: Contains information about movies, including titles, overviews, genres, keywords, and original language.
- Shape: (4803, 20)
- Columns: 'movie_id', 'title', 'overview', 'genres', 'keywords', 'cast', 'crew', and more.
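
As a quick sanity check, the shape and column names listed above can be confirmed before any preprocessing. The sketch below is illustrative and not the repository's actual code: `describe_csv` is a hypothetical helper, and the in-memory sample uses made-up rows in place of the real credits.csv.

```python
import csv
import io


def describe_csv(handle):
    """Return ((n_rows, n_cols), column_names) for an open CSV file."""
    reader = csv.reader(handle)
    header = next(reader)              # first line holds the column names
    n_rows = sum(1 for _ in reader)    # count the remaining data rows
    return (n_rows, len(header)), header


# Tiny in-memory stand-in for credits.csv (made-up rows, not real data).
sample = io.StringIO(
    "movie_id,title,cast,crew\n"
    "1,Example A,[],[]\n"
    "2,Example B,[],[]\n"
)
shape, columns = describe_csv(sample)
print(shape, columns)
```

For the real files, the same helper could be called on a handle opened with `open("credits.csv", newline="", encoding="utf-8")`; the result should match the shapes listed above if the files are unchanged.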

Data Preprocessing

I applied the following preprocessing steps to prepare the data for the recommendation system:

Handling Missing Values: I removed rows with missing values in the 'overview' column.
Data Cleaning: I cleaned the text data in the 'overview' column by removing punctuation and converting text to lowercase.
Feature Engineering: I extracted relevant features from the data, such as genres, keywords, cast, and crew, and transformed them into tags.
Tag Generation: I generated the tags for each movie by combining its overview, genres, keywords, cast, and crew.
Tag Normalization: I converted all tags to lowercase for consistency.
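
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: `clean_overview` and `build_tags` are hypothetical names, each movie is assumed to be a dict whose 'genres', 'keywords', 'cast', and 'crew' values have already been parsed into lists of names, and only ASCII punctuation is stripped.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_overview(text):
    """Lowercase the overview and remove ASCII punctuation."""
    return text.lower().translate(_PUNCT_TABLE)


def build_tags(row):
    """Combine overview, genres, keywords, cast, and crew into one tag string.

    Returns None when the overview is missing, so the caller can drop the row.
    Assumes the list-valued columns are already lists of plain names.
    """
    overview = row.get("overview")
    if not overview or not overview.strip():
        return None  # missing overview: row is removed
    parts = clean_overview(overview).split()
    for column in ("genres", "keywords", "cast", "crew"):
        parts.extend(row.get(column, []))
    # Normalize the combined tags to lowercase.
    return " ".join(parts).lower()


# Made-up example row (not taken from the dataset).
row = {
    "overview": "A Robot learns to paint!",
    "genres": ["Science Fiction"],
    "keywords": ["art"],
    "cast": ["Jane Doe"],
    "crew": ["John Roe"],
}
tags = build_tags(row)
print(tags)
```

Rows for which `build_tags` returns None correspond to the movies dropped in the missing-value step.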

