This is the official implementation of the paper "KGTORe: Tailored recommendations through knowledge-aware GNN models", accepted at RecSys 2023.
- Description
- Requirements
- Datasets
- Elliot Configuration Files
- Recommendation Lists
- KGTORe Parameters
- Usage
The code in this repository allows replicating the experimental setting described within the paper.
The training and evaluation procedures for the recommenders have been developed on the reproducibility framework Elliot, so we suggest you refer to its official GitHub page and documentation.
The graph-based recommendation models are implemented in PyTorch Geometric, using PyTorch 1.10.2 with CUDA 10.2 and cuDNN 8.0.
To guarantee the same environment across machines, all the experiments have been executed in the same Docker container. If you would like to use it, please see the corresponding section in Requirements.
This software has been executed on the Ubuntu 18.04 operating system.
Please make sure to have the following installed on your system:
- Python 3.8.0 or later
- PyTorch Geometric with PyTorch 1.10.2 or later
- CUDA 10.2
If you can install CUDA on your workstation (i.e., version 10.2), you may create the virtual environment with the requirements files included in the repository, as follows:
# PYTORCH ENVIRONMENT (CUDA 10.2, cuDNN 8.0)
$ python3.8 -m venv venv
$ source venv/bin/activate
$ pip install --upgrade pip
$ pip install -r requirements.txt
$ pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.10.0+cu102.html
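After installation, it may help to confirm which package versions actually ended up in the virtual environment. The following is a minimal sketch, not part of the repository: the helper name installed_versions is hypothetical, and the distribution names are assumptions taken from the pip commands above. It only relies on the standard library, so it runs even if some packages are missing.

```python
from importlib import metadata

# Distribution names assumed from the pip install commands above.
PACKAGES = ["torch", "torch-geometric", "torch-scatter", "torch-sparse"]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed versions against the requirements listed above (PyTorch 1.10.2 or later) is a quick way to spot a mismatched install before launching the experiments.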
A more convenient way of running experiments is to instantiate a Docker container with CUDA 10.2 already installed.
Make sure you have Docker and NVIDIA Container Toolkit installed on your machine (you may refer to this guide).
Then, you may use the following Docker image to instantiate the container equipped with CUDA 10.2 and cuDNN 8.0 (the environment for PyTorch): link
After setting up your Docker container, you may follow the exact same guidelines as scenario #1 (the virtual environment setup).
At ./data/ you may find all the files related to the datasets, the knowledge graphs, and the related item-entity linking.
The datasets can be found within the directory ./data/[DATASET]/data.
For Movielens 1M only, the dataset is within the directory ./data/movielens/grouplens.
For the knowledge graphs and links, please look at ./data/[DATASET]/dbpedia.
At ./config_files/ you may find the Elliot configuration files used for setting up the experiments.
The configuration files for training the models are named [DATASET]_[MODEL].yml.
The hyperparameters of the best models are reported in the files named [DATASET]_best_[MODEL].yml.
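The naming convention above can be turned into a small lookup helper. This is an illustrative sketch only: the function config_path and the example dataset and model names ("movielens", "kgtore") are assumptions, so check the actual file names in ./config_files/ before relying on them.

```python
from pathlib import Path

# Directory holding the Elliot configuration files, as described above.
CONFIG_DIR = Path("config_files")


def config_path(dataset: str, model: str, best: bool = False) -> Path:
    """Build the path of a configuration file from the naming convention.

    best=False -> [DATASET]_[MODEL].yml       (training / exploration)
    best=True  -> [DATASET]_best_[MODEL].yml  (best hyperparameters)
    """
    if best:
        name = f"{dataset}_best_{model}.yml"
    else:
        name = f"{dataset}_{model}.yml"
    return CONFIG_DIR / name


# Hypothetical dataset and model names, used only for illustration.
print(config_path("movielens", "kgtore"))
print(config_path("movielens", "kgtore", best=True))
```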
The recommendation lists of the best models can be found at ./results/recs.
You may use them for computing the recommendation metrics, as described at Evaluate Recommendation.
The following are the parameters required by KGTORe:
- batch size: training batch size;
- lr: learning rate;
- elr: features embedding learning rate;
- l_w: embedding regularization;
- l_ind: independence loss weight;
- alpha: alpha parameter;
- beta: beta parameter;
- factors: embeddings dimension;
- ind_edges: fraction of edges selected for computing the independence loss in each training batch;
- n_layers: graph convolutional network layers;
- npr: negative-positive ratio when building the decision tree;
- epochs: training epochs
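To keep the parameters above together and catch obviously invalid settings early, one option is a small container class. The sketch below is not taken from the repository: the class KGTOReParams, the field name batch_size, the types, the validation rules, and every example value are assumptions chosen for illustration, not the settings used in the paper.

```python
from dataclasses import dataclass


@dataclass
class KGTOReParams:
    batch_size: int    # training batch size
    lr: float          # learning rate
    elr: float         # features embedding learning rate
    l_w: float         # embedding regularization
    l_ind: float       # independence loss weight
    alpha: float       # alpha parameter
    beta: float        # beta parameter
    factors: int       # embeddings dimension
    ind_edges: float   # fraction of edges used for the independence loss
    n_layers: int      # graph convolutional network layers
    npr: float         # negative-positive ratio for the decision tree
    epochs: int        # training epochs

    def __post_init__(self):
        # Assumed sanity checks; the repository may accept other ranges.
        if not 0.0 < self.ind_edges <= 1.0:
            raise ValueError("ind_edges is a fraction and must be in (0, 1]")
        for name in ("batch_size", "factors", "n_layers", "epochs"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")


# Placeholder values for illustration only.
params = KGTOReParams(
    batch_size=64, lr=1e-3, elr=1e-3, l_w=1e-2, l_ind=1e-2,
    alpha=0.5, beta=0.5, factors=64, ind_edges=0.1,
    n_layers=2, npr=10, epochs=5,
)
print(params)
```

The values actually used in the experiments are those in the configuration files under ./config_files/.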
Here we describe the steps to reproduce the results presented in the paper. Furthermore, we provide a description of how the experiments have been configured.
Here you can find a ready-to-run Python file with all the pre-configured experiments cited in our paper. You can easily run them with the following command:
python run.py
It runs the pre-processing procedure and then trains our KGTORe model and all the baselines on three different datasets.
The results will be stored in the folder results/DATASET/.
If you are interested in running just the data preprocessing step, please run:
python preprocessing.py
To compute the recommendation metrics on the recommendation lists, run the following command:
python compute_metrics_on_recs.py
Within the file, it is possible to specify the directory containing the recommendation lists.
The default directory is results/recs/[DATASET].
Along with the evaluation metrics, Student's paired t-test is also computed.
The statistical significance of the results has been evaluated with Student's paired t-test. The precomputed results for the best models can be found at results/student_paired_t_test.
The paired t-test is computed during the recommendation metrics computation. Please see the evaluation section to compute them.
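For readers unfamiliar with the test, the statistic behind a paired t-test can be sketched with the standard library alone. This is a minimal illustration and not the code in compute_metrics_on_recs.py: the function name paired_t_statistic and the per-user metric values are hypothetical. Given paired differences d_i between two models on the same users, the statistic is t = mean(d) / (stdev(d) / sqrt(n)) with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for two paired samples."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    if len(scores_a) < 2:
        raise ValueError("at least two pairs are required")
    # Per-user differences between the two models.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical per-user metric values for two models on the same four users.
model_a = [0.30, 0.25, 0.40, 0.35]
model_b = [0.28, 0.20, 0.37, 0.30]
t, df = paired_t_statistic(model_a, model_b)
print(f"t = {t:.3f}, df = {df}")
```

Turning t into a p-value requires the Student t distribution; if SciPy is available, scipy.stats.ttest_rel computes both the statistic and the p-value directly.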