This repository contains the code and resources for the Unsupervised Learning Sprint challenge, where the goal is to construct a recommendation algorithm based on content and/or collaborative filtering, capable of accurately predicting how a user will rate a movie they have not yet viewed based on their historical preferences. The challenge can be found here, and the accompanying notebook serves as our entry.
Welcome to our repository, where we have embarked on a mission to create an advanced movie recommender system.
In the rapidly evolving world of technology, the significance of recommender systems cannot be overstated. These systems play a crucial role in helping individuals navigate the vast ocean of content available to them, making personalized recommendations that align with their preferences. One area where this becomes especially pertinent is in the domain of movie content recommendations, where viewers are presented with an overwhelming selection of titles from various streaming platforms.
Consider streaming giants like Netflix, Amazon Prime, Showmax, and Disney. Have you ever wondered how they manage to accurately suggest content that appeals to your tastes and interests?
The answer lies in the intelligent algorithms working behind the scenes. These algorithms employ techniques like content-based or collaborative filtering to analyze historical user preferences and predict how users would rate movies they have not yet seen.
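As a rough illustration of the collaborative filtering idea described above, the sketch below predicts a rating as a similarity-weighted average of other users' ratings. It is a minimal toy example, not the method used in the notebook: the `ratings` data, the user and movie names, and the helper functions `cosine_similarity` and `predict_rating` are all hypothetical and exist only for this example.

```python
from math import sqrt

# Toy ratings: user -> {movie: rating}. Hypothetical data for illustration only.
ratings = {
    "alice": {"Heat": 5.0, "Alien": 4.0, "Up": 1.0},
    "bob": {"Heat": 4.0, "Alien": 5.0, "Up": 2.0, "Jaws": 5.0},
    "carol": {"Heat": 1.0, "Alien": 2.0, "Up": 5.0, "Jaws": 2.0},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, movie, ratings):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when no other user with a non-zero similarity rated it.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[movie]
        weight_total += sim
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


# alice has not rated "Jaws"; her tastes resemble bob's more than carol's,
# so the prediction leans toward bob's rating.
prediction = predict_rating("alice", "Jaws", ratings)
print(f"Predicted rating for alice on Jaws: {prediction:.2f}")
```

Real systems typically also correct for each user's average rating and use far larger, sparser data, which is what dedicated libraries such as scikit-surprise are designed for.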
We invite you to join us in this challenge to investigate and construct our very own recommendation algorithms based on both content and collaborative filtering. The goal is to accurately predict how a user will rate a movie they have not yet seen, solely based on their historical viewing preferences.
The value derived from solving this challenge is immense. A successful recommender system has the potential to transform the user experience, allowing viewers to effortlessly discover movies that resonate with their preferences. By providing personalized and engaging recommendations, we can strengthen platform affinity, increase content consumption, and ultimately generate revenue for the streaming service.
Join us in this journey to shape the future of content discovery and revolutionize the way users connect with movies through innovative recommender systems.
The repository contains the following resources:
- Resources: A folder containing supporting images used for explanation in the notebook.
- FinalNotebook_Team_NM4.ipynb: The notebook in which the preprocessing and modelling take place. It is also used to produce the files for submission to the competition.
The following libraries need to be installed. This can be done by following the steps in the notebook:
- wordcloud
- comet_ml
- currencyconverter
- nltk
- scikit-surprise
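Before running the notebook, it can help to check which of these libraries are already available. The sketch below is one way to do that with the standard library only. The mapping from package name to import name is an assumption (in particular `currency_converter` for currencyconverter and `surprise` for scikit-surprise), and `REQUIRED` and `missing_packages` are names made up for this example.

```python
from importlib.util import find_spec

# Assumed mapping: package name as listed above -> name used in `import`.
REQUIRED = {
    "wordcloud": "wordcloud",
    "comet_ml": "comet_ml",
    "currencyconverter": "currency_converter",
    "nltk": "nltk",
    "scikit-surprise": "surprise",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any package reported as missing can then be installed by following the installation steps in the notebook.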
To reproduce the results or use the provided code, follow these steps:
- Clone the repository: https://github.com/Kabous0017/Unsupervised-Learning-Team-NM4-VersionControl.git
- Download the dataset used to produce our models here. Save these files in the same directory as the FinalNotebook_Team_NM4.ipynb file.
- Download the supplementary data files titled title.crew.tsv.gz and name.basics.tsv.gz from here.
- Unzip the title.crew.tsv.gz and name.basics.tsv.gz files, and make sure they are copied to the Resources folder accompanying the notebook.
- Explore the Jupyter notebook FinalNotebook_Team_NM4.ipynb to understand the data pre-processing, feature engineering, and model development steps.
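The unzip step above can also be scripted. The following is a minimal sketch that assumes the two .gz files have been downloaded into the current working directory and that the Resources folder sits next to the notebook. The helper name `gunzip_to` is hypothetical and not part of the repository.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_to(src, dest_dir):
    """Decompress a .gz file into dest_dir, dropping the .gz suffix."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # "title.crew.tsv.gz" -> "title.crew.tsv"
    dest = dest_dir / src.with_suffix("").name
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


if __name__ == "__main__":
    # Assumes the downloads are in the current directory; missing files are skipped.
    for name in ("title.crew.tsv.gz", "name.basics.tsv.gz"):
        if Path(name).exists():
            print("Extracted", gunzip_to(name, "Resources"))
        else:
            print("Not found, skipping:", name)
```

Streaming the data with `shutil.copyfileobj` avoids loading a whole file into memory, which is useful if the supplementary files are large.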
The notebook sometimes fails to render on GitHub because of its size and the images it contains. If this happens, a fully rendered version is available [here](https://nbviewer.org/github/Kabous0017/Unsupervised-Learning-Team-NM4-VersionControl/blob/06aa14a0fa0f6972afb953cb32029e1d99b3f829/FinalNotebook_Team_NM4.ipynb).