Profile-based retrieval project

Profile Based Retrieval

Building a recommendation system based on ratings given by the users

The dataset used for this project comes from MovieLens and is available here. The goal of the project is to combine knowledge of natural language processing, statistics and programming to build a system that recommends movies to users based on their profile. The project is written in Python with the help of Pandas, Scikit-learn, NLTK and NumPy, and the code is available in this GitHub repository.

The GitHub repo contains the following:

  • PBR.ipynb: the Jupyter notebook with the development of the movie recommender in Python.
  • report.pdf: a summary of the work performed and the conclusions that can be drawn from it.

How to run

To run the project, first download the MovieLens dataset from the link provided above and rename the downloaded directory to "dataset". This directory should contain the files genome-scores.csv, genome-tags.csv, links.csv, movies.csv, ratings.csv and tags.csv. Then download this GitHub repository, open "PBR.ipynb" as a Jupyter notebook and run it. Make sure the environment is opened in the right working directory and that "dataset" is present in that directory. During execution, remember to enter the ID of the user when prompted.
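Before opening the notebook, it can help to confirm that the "dataset" directory is in place. The snippet below is a minimal sketch and not part of the repository: the helper name missing_dataset_files is hypothetical, and it only checks that the six files listed above exist in the given directory.

```python
from pathlib import Path

# File names taken from the instructions above.
EXPECTED_FILES = [
    "genome-scores.csv",
    "genome-tags.csv",
    "links.csv",
    "movies.csv",
    "ratings.csv",
    "tags.csv",
]


def missing_dataset_files(dataset_dir="dataset"):
    """Return the expected CSV files that are not present in dataset_dir.

    Hypothetical helper for illustration; the notebook itself does not
    define it.
    """
    root = Path(dataset_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing from ./dataset:", ", ".join(missing))
    else:
        print("All expected dataset files found.")
```

Run it from the same working directory in which the notebook will be opened, so that the relative path "dataset" resolves the same way.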

Libraries to import

  • numpy
  • pandas
  • sklearn
  • nltk
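
A quick way to check that these libraries are importable in the current environment is sketched below. The helper name missing_modules is hypothetical and not part of the repository. Note that sklearn is the import name; the package is installed under the name scikit-learn.

```python
import importlib.util

# Import names from the list above.
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "nltk"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment.

    Hypothetical helper for illustration only.
    """
    return [m for m in modules if importlib.util.find_spec(m) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are importable.")
```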