Recommender Systems and Collaborative Filtering

Introduction

Recommender systems are one of the most widely deployed applications of data science in industry. Examples include product recommendations on Amazon, songs on Pandora, and movies and shows on Netflix. There are two general approaches to recommender systems:

  1. Collaborative filtering

  2. Content-based filtering

Collaborative filtering recommends products to customers by combining their own past behaviors or ratings with similar decisions made by other customers to predict which items might appeal to them. Content-based filtering instead uses the characteristics of an item to recommend additional items with similar properties. I'll only be touching on collaborative filtering in this blog post, since it is very popular and can accurately recommend complex items without needing to understand the items themselves. Collaborative filtering is also much more popular for web-based recommendations where the data is sparse, i.e., where each user or product has only a limited number of reviews.
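To make the idea concrete, here is a minimal sketch of user-based collaborative filtering with NumPy. It is only an illustration, not necessarily the method used in the notebook: the rating matrix is made-up toy data, the function names `cosine_similarity_matrix` and `predict_rating` are hypothetical, and treating unrated items as 0 when computing cosine similarity is a simplifying assumption.

```python
import numpy as np

# Toy user-item rating matrix (rows: users, columns: items); 0 means "not rated".
# These values are invented for illustration only.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)


def cosine_similarity_matrix(matrix):
    """Pairwise cosine similarity between the rows of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for users with no ratings
    normalized = matrix / norms
    return normalized @ normalized.T


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    similarity = cosine_similarity_matrix(ratings)[user]
    rated = ratings[:, item] > 0  # users who have rated this item
    rated[user] = False           # never use the target user's own entry
    weights = similarity[rated]
    if weights.sum() == 0:
        return 0.0                # no similar user has rated the item
    return float(weights @ ratings[rated, item] / weights.sum())


# Predict how user 0 would rate item 2, which they have not rated yet.
prediction = predict_rating(ratings, user=0, item=2)
print(f"Predicted rating: {prediction:.2f}")
```

User 0's tastes are closest to user 1's, who rated item 2 poorly, so the prediction is pulled toward the low end of the scale even though users 2 and 3 rated the item highly.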

Data

The data we will use comes from Amazon and can be found here. I chose the Amazon Instant Video 5-core file. "5-core" means that each video/item has at least 5 ratings and each user has rated at least 5 videos/items.
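The 5-core property can be illustrated with a short pandas sketch. The downloaded file is already filtered, so this is only to show what the property means. The helper `k_core` is hypothetical, the column names `reviewerID` and `asin` are assumptions about the review data, and the toy frame uses `k=2` so that the effect is visible on a few rows.

```python
import pandas as pd


def k_core(df, k=5, user_col="reviewerID", item_col="asin"):
    """Repeatedly drop users and items with fewer than k ratings until stable.

    Assumes one row per (user, item) rating and no missing IDs.
    """
    while True:
        # Number of ratings per user and per item, aligned to each row.
        user_counts = df.groupby(user_col)[item_col].transform("count")
        item_counts = df.groupby(item_col)[user_col].transform("count")
        keep = (user_counts >= k) & (item_counts >= k)
        if keep.all():
            return df
        # Dropping rows can push other users/items below k, so loop again.
        df = df[keep]


# Toy data, invented for illustration: item "c" has a single rating.
toy = pd.DataFrame({
    "reviewerID": ["u1", "u1", "u2", "u2", "u3", "u3"],
    "asin":       ["a",  "b",  "a",  "b",  "a",  "c"],
    "overall":    [5,    4,    4,    5,    3,    2],
})

core = k_core(toy, k=2)
print(core)
```

Removing item "c" leaves user "u3" with only one rating, so "u3" is dropped on the next pass. This is why the filter has to be applied repeatedly rather than once.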

Requirements

  1. Python (3.X)
  2. Jupyter Notebook
  3. NumPy
  4. SciPy
  5. matplotlib
  6. Pandas
  7. scikit-learn
  8. Seaborn

To install the requirements with pip (everything except Python and Jupyter Notebook), run the following in the main directory:

pip install -r requirements.txt 

Alternatively, you can install the dependencies and access the notebook using Docker. First, build the Docker image:

docker build -t recommend .

Then run the container:

docker run -p 8888:8888 -t recommend