recommender-system-capstone

Primary language: Jupyter Notebook. License: MIT.

Escape the bubble!

A trade-off between personalization and diversity in movie recommendations


This capstone project is the graduation work of four students of the Neue-Fische Bootcamp in Data Science, completed between September and December 2021.


Project description:

Recommender systems are both a blessing and a curse. On the one hand, they provide users with recommendations tailored to their personal profiles. On the other hand, users risk getting stuck in a personalized filter bubble, which gradually isolates them by decreasing the diversity of recommendations.

Two common approaches are content-based filtering and collaborative filtering. While content-based filtering suggests items based only on the similarity of items, collaborative filtering takes user ratings into account. Depending on how much information a user has provided so far, both approaches may still yield recommendations with little variation.

In this project, we developed a recommender engine that produces personalized movie recommendations. In addition, we implemented an algorithm that lets users choose how far they want to step outside the filter bubble. With this increased diversity, we aim to provide a better user experience.
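The trade-off between personalization and diversity can be illustrated with a small re-ranking sketch. This is not the project's actual algorithm (which lives in the notebooks); it is a minimal greedy re-ranker in the style of maximal marginal relevance, assuming predicted ratings scaled to [0, 1] and genre sets as the only content feature. The names `rerank`, `jaccard`, and `diversity`, as well as the sample scores and genres, are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Content-based similarity of two movies from their genre sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rerank(scores: Dict[str, float], genres: Dict[str, Set[str]],
           diversity: float, k: int) -> List[str]:
    """Greedily pick k movies, trading predicted rating against redundancy.

    diversity = 0.0 keeps the pure relevance ranking; diversity = 1.0
    ignores relevance and only avoids similarity to movies already picked.
    """
    picked: List[str] = []
    candidates = set(scores)
    while candidates and len(picked) < k:
        def value(movie: str) -> float:
            # Highest similarity to any movie already in the list.
            redundancy = max(
                (jaccard(genres[movie], genres[p]) for p in picked),
                default=0.0,
            )
            return (1.0 - diversity) * scores[movie] - diversity * redundancy

        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=value)
        picked.append(best)
        candidates.remove(best)
    return picked


# Hypothetical predicted ratings (scaled to [0, 1]) and genre tags.
scores = {"Alien": 0.95, "Aliens": 0.93, "Predator": 0.90, "Amelie": 0.60}
genres = {
    "Alien": {"Horror", "Sci-Fi"},
    "Aliens": {"Action", "Horror", "Sci-Fi"},
    "Predator": {"Action", "Sci-Fi"},
    "Amelie": {"Comedy", "Romance"},
}

inside_bubble = rerank(scores, genres, diversity=0.0, k=3)
outside_bubble = rerank(scores, genres, diversity=0.7, k=3)
print(inside_bubble)   # ['Alien', 'Aliens', 'Predator']
print(outside_bubble)  # ['Alien', 'Amelie', 'Predator']
```

With `diversity=0.0` the list stays inside the sci-fi bubble; raising the value lets a lower-rated but dissimilar movie displace a near-duplicate. A single parameter like this is one simple way to expose the choice to the user, for example as a slider.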

Project team:

  • Alexandra Zimmermann-Rösner, Dipl.-Biologist, GitHub
  • Hassan Mohamed, MSc., Physicist, GitHub
  • Konstanze Braun, MSc., Psychologist, GitHub
  • Jana Conradi, MSc., Bio Scientist, GitHub


Graduation presentation: PDF, YouTube video, Dashboard


Data

To build the movie recommendations, the following datasets were used:

  • Dataset from MovieLens
    Small: 100,000 ratings and 3,600 tag applications applied to 9,000 movies by 600 users. Last updated 9/2018. Provided by the University of Minnesota
    Link
  • TMDB movie metadata, uploaded on Kaggle
    Link
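To get a feel for how sparse the rating data is, the rounded figures above can be turned into a density estimate. The sketch below also parses a few made-up rows in the `ratings.csv` column layout of the MovieLens small dataset (`userId`, `movieId`, `rating`, `timestamp`); the sample rows are invented for illustration and the nested-dictionary structure is just one possible in-memory representation, not necessarily the one used in the notebooks.

```python
import csv
import io
from collections import defaultdict

# Rounded figures quoted for the MovieLens small dataset.
n_ratings, n_movies, n_users = 100_000, 9_000, 600

# Share of all possible (user, movie) pairs that actually carry a rating.
density = n_ratings / (n_movies * n_users)
print(f"{density:.2%} of the user-movie matrix is filled")  # 1.85%

# Made-up sample rows in the ratings.csv layout (assumption: illustrative only).
sample = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,10,4.0,964982703\n"
    "1,20,3.5,964981247\n"
    "2,10,5.0,964982224\n"
)

# Nested mapping: user id -> {movie id -> rating}.
ratings = defaultdict(dict)
for row in csv.DictReader(sample):
    ratings[int(row["userId"])][int(row["movieId"])] = float(row["rating"])

print(dict(ratings))
```

With fewer than 2% of the user-movie pairs rated, most of the matrix has to be predicted, which is one reason recommendations can show little variation for users with few ratings.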

The main project notebooks:


Requirements

pyenv with Python 3.9.4

Installation with pip

Surprise Library

pip install scikit-surprise

NLTK

pip install nltk

Pandas-Profiling

pip install -U "pandas-profiling[notebook]"
jupyter nbextension enable --py widgetsnbextension

Environment

make setup

# or

pyenv local 3.9.4
python -m venv .venv
# activate the virtual environment first so the packages are installed into it
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

The general workflow of the project is illustrated in this flowchart:


[Figure: general flowchart of the project]