/movies_recommendation

A movie recommendation engine from written reviews using collaborative filtering based on low-rank matrix factorization implemented in Python. The raw data is available at http://www.cs.cornell.edu/people/pabo/movie-review-data/


A Movies Recommendation System

This repository develops a simple movie recommendation engine that uses collaborative filtering based on low-rank matrix factorization, implemented in Python. The project is based on the Cornell movie review data available at http://www.cs.cornell.edu/people/pabo/movie-review-data/.
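As a rough illustration of the technique named above, the following sketch fits a low-rank factorization to a small, made-up ratings matrix by gradient descent on the observed entries only. It is not the code from this repository: the function name `factorize`, the hyperparameters, and the toy ratings are all assumptions chosen for the example.

```python
import numpy as np

def factorize(ratings, mask, rank=2, lam=0.1, lr=0.01, steps=5000, seed=0):
    """Fit ratings ~ U @ V.T on observed entries (mask == 1) by gradient descent.

    Hypothetical helper: rank, lam (L2 penalty), lr and steps are illustrative
    values, not settings taken from this project.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = 0.1 * rng.standard_normal((n_users, rank))  # user factors
    V = 0.1 * rng.standard_normal((n_items, rank))  # movie factors
    for _ in range(steps):
        err = mask * (U @ V.T - ratings)  # error only where a rating exists
        grad_U = err @ V + lam * U
        grad_V = err.T @ U + lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V

# Toy data (made up): rows are users, columns are movies, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 5.0, 0.0],
    [0.0, 1.0, 4.0, 5.0],
])
mask = (ratings > 0).astype(float)

U, V = factorize(ratings, mask)
predicted = U @ V.T  # dense matrix: includes scores for unrated movies

# Root-mean-square error on the entries that were actually rated.
rmse = np.sqrt(np.sum(mask * (predicted - ratings) ** 2) / mask.sum())
```

Recommendations for a user would then come from sorting that user's row of `predicted` over the movies they have not rated yet.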

Extracting Information from the Data Set

Ratings of many movies given by four reviewers (users) are available in /home/dell/movies_recommendation/cornell-data/scale_data. However, the ratings do not reference the title of the movie being rated; instead, they reference natural-language review texts in movies_recommendation/cornell-data/scale_whole_review/, with the title embedded in the text. The review text must therefore be analyzed to recover the movie title for each review.
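One possible way to recover a title from a review is sketched below. It rests on a loud assumption that is not confirmed by this README: that the title sits on the first non-empty line of the review, optionally followed by a year in parentheses. The function names, the position-based pairing of reviews with ratings, and the sample text are all hypothetical.

```python
import re

def extract_title(review_text):
    """Return the first non-empty line of a review as the movie title.

    Assumption (hypothetical): the title is the first non-empty line,
    optionally followed by a "(1997)"-style year, which is stripped.
    """
    for line in review_text.splitlines():
        candidate = line.strip()
        if candidate:
            candidate = re.sub(r"\s*\(\d{4}\)\s*$", "", candidate)
            return re.sub(r"\s+", " ", candidate)  # collapse repeated spaces
    return None  # empty review: no title found

def pair_titles_with_ratings(review_texts, ratings):
    """Map each extracted title to its rating.

    Assumption (hypothetical): reviews and ratings are aligned by position.
    """
    paired = {}
    for text, rating in zip(review_texts, ratings):
        title = extract_title(text)
        if title is not None:
            paired[title] = rating
    return paired

# Made-up review text, used only to exercise the functions.
sample_review = "\n  THE  FULL MONTY (1997)\nA film review ...\nThe plot follows ...\n"
title = extract_title(sample_review)
paired = pair_titles_with_ratings([sample_review], [0.8])
```

If the real review files follow a different layout, only `extract_title` would need to change.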

Development guide

Read this guide to get details about how this system was developed.

How to execute it

This movie recommendation system is implemented as a Python script, so a complete Python 3 installation is required (an alternative using Docker is explained below). The following command runs the recommendation engine for the user identified as Steve+Rhodes, making 5 recommendations and showing the top 8 historical ratings.

python recommender.py '{"user id":"Steve+Rhodes", "Recommendations limit": 5, "Historical limit":8}'
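The single command-line argument is a JSON object with three keys. The sketch below shows one way a script could parse and validate such an argument; it is not taken from recommender.py, and the function name `parse_request` and the validation rules are assumptions made for illustration.

```python
import json

# Keys as they appear in the example command above.
REQUIRED_KEYS = ("user id", "Recommendations limit", "Historical limit")

def parse_request(raw):
    """Parse the JSON argument and check its three keys (hypothetical helper)."""
    request = json.loads(raw)
    for key in REQUIRED_KEYS:
        if key not in request:
            raise ValueError(f"missing required key: {key!r}")
    for key in ("Recommendations limit", "Historical limit"):
        value = request[key]
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{key!r} must be a positive integer")
    return request

# The same JSON string passed on the command line above.
request = parse_request(
    '{"user id":"Steve+Rhodes", "Recommendations limit": 5, "Historical limit":8}'
)
```

In a script, the string would typically come from `sys.argv[1]`. Note that the single quotes around the JSON in the shell command keep the inner double quotes intact.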

Running it in Docker

If you don't have a full Python installation (e.g. one with Anaconda 3), install Docker and then run the following lines.

sudo docker build -t movies_recomender .
sudo docker run -ti -v $(pwd)/:/tmp movies_recomender python recommender.py '{"user id":"Steve+Rhodes", "Recommendations limit": 5, "Historical limit":8}'
