

Last edit : 29 July 2019

Recommender system based on KPRN (original GitHub), trained on a custom MovieLens-20M dataset.

This implementation differs slightly from the original: no pooling layer is added after the LSTM.

Domain of problems

Given a path between a user and an item, predict how likely the user is to interact with the item.


  • /cache : temporary files used in training

  • /data : contains the dataset used in training (a custom ml-20m dataset that keeps only the movies that appear in Ripple-Net's knowledge graph).

  • /log : contains training results, each stored in a single folder named after the training timestamp.

  • /test : contains the Jupyter notebooks used to test the trained models

  • KPRN-LSTM.ipynb : notebook to train the model

  • main.py : python3 version of KPRN-LSTM.ipynb


*italic* means the folder is omitted from git, but is necessary if you need to run experiments
**bold** means the folder has its own README, check it for detailed information :)


Installing dependencies

pip3 install -r requirements.txt

How to run

  1. Unzip ratings_re.zip and ratings_re.z01 in /data

  2. To preprocess, run the Preprocess.ipynb notebook or preprocess.py

    python3 data/preprocess.py

  3. To train, run the KPRN-LSTM.ipynb notebook or main.py

    python3 main.py

! Caching warning !

Before using a new dataset, or if you want to generate a new dataset, delete everything inside /cache.
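
If you prefer to clear the cache from Python rather than by hand, a minimal sketch could look like the one below. It assumes the cache folder is `cache` relative to the repository root; `clear_cache` is a hypothetical helper, not part of this repo.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir="cache"):
    """Delete everything inside cache_dir but keep the folder itself.

    Returns the number of top-level entries removed.
    `cache_dir="cache"` is an assumed location, adjust it to your checkout.
    """
    cache_path = Path(cache_dir)
    if not cache_path.is_dir():
        return 0  # nothing to clear

    removed = 0
    for entry in cache_path.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a sub-folder recursively
        else:
            entry.unlink()  # remove a regular file or a symlink
        removed += 1
    return removed


# Usage (run from the repository root before preprocessing a new dataset):
# clear_cache("cache")
```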


How to change hyperparameters

Open KPRN-LSTM.ipynb or main.py and change the model parameters.

Testing / Evaluation

How to check training result

  1. Find the latest training result folder inside /log and copy its name.
  2. Create a copy of the latest Jupyter notebook inside the /test folder.
  3. Rename the copy to match the folder name in /log (for traceability).
  4. Replace the TESTING_CODE value at the top of the notebook.
  5. Run the notebook.
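
Step 1 can also be done programmatically. The sketch below picks the most recently modified sub-folder of `log`. It relies on modification time because the exact timestamp format of the folder names is not documented here; `latest_log_folder` is a hypothetical helper.

```python
from pathlib import Path


def latest_log_folder(log_dir="log"):
    """Return the name of the most recently modified sub-folder of log_dir."""
    folders = [p for p in Path(log_dir).iterdir() if p.is_dir()]
    if not folders:
        raise FileNotFoundError(f"no training result folders in {log_dir!r}")
    # Assumption: the newest training run is the folder modified last.
    newest = max(folders, key=lambda p: p.stat().st_mtime)
    return newest.name


# Usage inside a test notebook, assuming TESTING_CODE holds the folder name:
# TESTING_CODE = latest_log_folder("../log")
```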

Final result

KPRN - pool_size = 1 (no pooling)

Evaluation size      Prec@10   Distinct@10   Unique items
Eval on 10 users     0.12028   0.70000       70
Eval on 30 users     0.16667   0.60667       182
Eval on 100 users    0.17471   0.38600       386

KPRN - pool_size = 3

Evaluation size      Prec@10   Distinct@10   Unique items
Eval on 10 users     0.20000   0.32000       32
Eval on 30 users     0.24333   0.21000       63
Eval on 100 users    0.25864   0.13400       134

KPRN - pool_size = 5

Evaluation size      Prec@10   Distinct@10   Unique items
Eval on 10 users     0.18667   0.31000       31
Eval on 30 users     0.18000   0.14667       44
Eval on 100 users    0.23453   0.08400       84
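
The Distinct@10 values above are consistent with unique items divided by (number of evaluated users × 10), for example 70 / (10 × 10) = 0.70 and 182 / (30 × 10) ≈ 0.60667. A minimal sketch of both metrics follows. It assumes Prec@k is the usual fraction of top-k suggestions the user actually interacted with; the notebooks in /test may compute it differently, and both function names are hypothetical.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that are in the relevant set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


def distinct_at_k(recommendations_per_user, k=10):
    """Unique items across all users' top-k lists, divided by (users * k)."""
    unique_items = set()
    for recommended in recommendations_per_user:
        unique_items.update(recommended[:k])
    return len(unique_items) / (len(recommendations_per_user) * k)


# Toy example with two users and k = 3 (made-up item ids):
recs = [["a", "b", "c"], ["a", "d", "e"]]
print(precision_at_k(recs[0], relevant={"a", "c"}, k=3))  # 2 hits out of 3
print(distinct_at_k(recs, k=3))                           # 5 unique items out of 6 slots
```

A low Distinct@k means many users receive the same items, which is why the tables report it next to Prec@k.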


KPRN relies heavily on paths, and those paths are handcrafted using the knowledge graph. The paths are also sampled from hundreds of millions of possible paths.

  • To find paths

    (user -> seed item (e.g. Castle on the Hill) -> entity (e.g. Ed Sheeran) -> suggestion (e.g. Perfect))

    From each seed we can extract millions of paths if we do not sample. Even after sampling only one relation per seed (e.g. same artist, same album), a seed still generates around 8k-10k paths.

  • Each user has multiple items that work as seeds (typically 20+), so the seeds need to be sampled again to reduce the number of generated paths and the computational cost.

  • We make sure each suggested item has about 4-7 paths.

  • At this point, we only evaluate around 75-150 paths per user, out of hundreds of millions of possible paths.

  • That is a large potential source of sampling bias, but at the same time it is practically impossible to search through all paths.

  • Judging from the KPRN results, using a KG looks quite promising, especially for improving the diversity of suggestions.

  • The downside of KPRN is that the results rely heavily on 'handcrafted' paths, which go through many downsampling steps.

  • Summary compared to non-KG RecSys: big improvement in terms of Prec@k and distinct rate.
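
To make the path search above concrete, here is a small sketch of enumerating user -> seed item -> entity -> suggestion paths with downsampling. It is not the repo's preprocessing code: the toy triples, the function name `sample_paths`, and the parameters `max_seeds` and `max_paths_per_item` are all illustrative assumptions, and the cap only enforces an upper bound on paths per suggested item.

```python
import random
from collections import defaultdict

# Toy knowledge graph as (item, relation, entity) triples. Illustrative only.
TOY_TRIPLES = [
    ("Castle on the Hill", "sung_by", "Ed Sheeran"),
    ("Perfect", "sung_by", "Ed Sheeran"),
    ("Shape of You", "sung_by", "Ed Sheeran"),
    ("Castle on the Hill", "in_album", "Divide"),
    ("Perfect", "in_album", "Divide"),
]


def sample_paths(user, history, triples, max_seeds=20, max_paths_per_item=7, rng_seed=0):
    """Return {suggested item: [(user, seed, relation, entity, suggestion), ...]}."""
    rng = random.Random(rng_seed)

    # Index the graph in both directions: item -> entities and entity -> items.
    entities_of = defaultdict(list)
    items_of = defaultdict(list)
    for item, relation, entity in triples:
        entities_of[item].append((relation, entity))
        items_of[entity].append(item)

    # Downsample the seed items taken from the user's history.
    seeds = sorted(history)
    if len(seeds) > max_seeds:
        seeds = rng.sample(seeds, max_seeds)

    # Walk seed -> entity -> candidate and skip items the user already has.
    paths_by_item = defaultdict(list)
    for seed_item in seeds:
        for relation, entity in entities_of.get(seed_item, []):
            for candidate in items_of.get(entity, []):
                if candidate not in history:
                    path = (user, seed_item, relation, entity, candidate)
                    paths_by_item[candidate].append(path)

    # Cap the number of paths kept per suggested item.
    sampled = {}
    for candidate, paths in paths_by_item.items():
        if len(paths) > max_paths_per_item:
            paths = rng.sample(paths, max_paths_per_item)
        sampled[candidate] = paths
    return sampled


paths = sample_paths("user_1", {"Castle on the Hill"}, TOY_TRIPLES)
for item, item_paths in paths.items():
    print(item, len(item_paths))
```

Every sampling step in this sketch discards paths, which is exactly where the sampling bias discussed above can enter.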


Pros

  • Able to incorporate a knowledge graph as another source of information
  • Able to explain why a user is given a suggestion (based on path scores)
  • Able to adjust between 'exploration and optimization' by applying result pooling (the model does not need to be re-trained)
  • During training, the model converges quickly (< 10 epochs)


Cons

  • No usable original implementation
  • Relies heavily on paths, which are 'handcrafted' using the knowledge graph and sampled from hundreds of millions of possible paths.
  • Potentially large sampling bias introduced by the preprocessing and path generation steps.
  • The model memorizes users, so it needs to be re-trained whenever a new user or item is added.
  • Requires relatively slow preprocessing
  • Very slow training and prediction
  • The loss function and metric used in training are not based on Prec@K; training uses accuracy instead.

Experiment notes

  • At the cost of a slightly different implementation, it is easier to implement the model with a high-level library such as Keras than to use the original version.
  • Path generation and prediction speed: about 4k paths per second
  • Different sampling methods and sampling parameters have an insignificant effect
  • Using more items as path generation 'seeds' (when predicting suggestions) should lead to more personalized suggestions, i.e. the suggestions take more of the user's history into account.
  • Pooling the path prediction scores for the same item should let the model give better suggestions, since it considers multiple reasons instead of a single one.
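
The pooling idea in the last note can be sketched as a post-processing step over path scores, which is also why no re-training is needed. The exact meaning of `pool_size` in this repo is not documented here. The sketch assumes it means "average the top `pool_size` path scores per item", so `pool_size = 1` keeps only the best path (no pooling). `pool_scores` is a hypothetical helper.

```python
from collections import defaultdict


def pool_scores(path_scores, pool_size=1):
    """Rank items by the mean of their top `pool_size` path scores.

    path_scores: iterable of (item, score) pairs, one per scored path.
    Returns a list of (item, pooled_score), best first.
    """
    scores_by_item = defaultdict(list)
    for item, score in path_scores:
        scores_by_item[item].append(score)

    pooled = {}
    for item, scores in scores_by_item.items():
        top = sorted(scores, reverse=True)[:pool_size]
        pooled[item] = sum(top) / len(top)  # assumed pooling rule: mean of top scores

    return sorted(pooled.items(), key=lambda pair: pair[1], reverse=True)


# Made-up scores: item A has one strong and one weak path, item B has two decent paths.
scores = [("A", 0.9), ("A", 0.1), ("B", 0.6), ("B", 0.6)]
print(pool_scores(scores, pool_size=1))  # A ranks first on its single best path
print(pool_scores(scores, pool_size=2))  # B ranks first once two paths are pooled
```

With `pool_size = 1` a single high-scoring path decides the ranking; larger values reward items supported by several paths.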

