Creating a recommender system for articles

In this notebook, we will walk through how to create a simple algorithm that will recommend the top N posts based on our interests. The steps required to do this look like the following:

Fetch N posts from the Hacker News (HN) API and aggregate them into an in-memory list
Cleanse the data by vectorizing the list and removing stop words
Create a matrix that follows the structure of TF-IDF (term frequency-inverse document frequency)
From here, we can vectorize our query and compute the cosine similarity of our input against our model. We will sort and rank the most similar titles to find links of interest.
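
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration using only the standard library, not the notebook's actual code: the tiny stop-word list, the smoothed IDF formula, the sample titles, and every function name here (`tokenize`, `fit_idf`, `tfidf_vector`, `cosine_similarity`, `recommend`) are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def fit_idf(token_lists):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector (term -> weight); terms unseen in the corpus are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(query, titles, top_n=3):
    """Return up to top_n (score, title) pairs, most similar first."""
    token_lists = [tokenize(t) for t in titles]
    idf = fit_idf(token_lists)
    doc_vectors = [tfidf_vector(tokens, idf) for tokens in token_lists]
    query_vector = tfidf_vector(tokenize(query), idf)
    scored = [(cosine_similarity(query_vector, vec), title)
              for vec, title in zip(doc_vectors, titles)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Made-up sample titles standing in for posts fetched from the API.
titles = [
    "Show HN: A Rust library for parsing JSON",
    "Why Python is great for machine learning",
    "Machine learning on the edge with tiny models",
    "The history of the Unix shell",
]
for score, title in recommend("python machine learning", titles, top_n=2):
    print(f"{score:.3f}  {title}")
```

With these sample titles, the Python title ranks first because it shares all three query terms, and the other machine learning title ranks second. Titles that share no terms with the query score 0.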

Simple UI

Setup

Create a virtual environment

virtualenv ml

Install requirements

pip install -r requirements.txt

Run the notebook

jupyter notebook

Run the cells in order; the last cell renders the UI. The first cell may take a while to pull all the results from the API, since it is not optimized.

We only need to run that cell once. It caches the results, so later queries return almost immediately.

Schachte/Hackernews-Recommender-System
