This is the official repository for the paper "Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis", accepted at RecSys 2023 (Reproducibility Track).
This repository is heavily dependent on the framework Elliot, so we suggest you refer to the official GitHub page and documentation.
We implemented and tested our models in PyTorch==1.12.0
, with CUDA 10.2
and cuDNN 8.0
. Additionally, some of graph-based models require PyTorch Geometric
, which is compatible with the versions of CUDA and PyTorch
we indicated above.
If you have the possibility to install CUDA on your workstation (i.e., 10.2
), you may create the virtual environment with the requirement files we included in the repository, as follows:
# PYTORCH ENVIRONMENT (CUDA 10.2, cuDNN 8.0)
$ python3.8 -m venv venv
$ source venv/bin/activate
$ pip install --upgrade pip
$ pip install -r requirements.txt
$ pip install -r requirements_torch_geometric.txt
A more convenient way of running experiments is to instantiate a docker container having CUDA 10.2
already installed. Make sure you have Docker and NVIDIA Container Toolkit installed on your machine (you may refer to this guide). Then, you may use the following Docker image to instantiate the container equipped with CUDA 10.2
and cuDNN 8.0
: link.
We used Gowalla, Yelp 2018, and Amazon Book datasets. The original links may be found here, where the train/test splitting has already been provided:
- Gowalla: https://github.com/xiangwang1223/neural_graph_collaborative_filtering/tree/master/Data/gowalla
- Yelp 2018: https://github.com/kuandeng/LightGCN/tree/master/Data/yelp2018
- Amazon Book: https://github.com/xiangwang1223/neural_graph_collaborative_filtering/tree/master/Data/amazon-book
After downloading, create three folders ./data/{dataset_name}
, one for each dataset. Then, run the script ./map_dataset.py
, by changing the name of the dataset within the script itself. It will generate the train/test files for each dataset in a format compatible for Elliot (i.e., tsv file with three columns referring to user/item).
In case, we also provide the final tsv files for all the datasets in this repo.
We directly provide the train/validation/test splittings for Allrecipes and BookCrossing in this repo. As already stated for Gowalla, Yelp 2018, and Amazon Book, create one folder for each dataset in ./data/{dataset_name}
.
To reproduce the results reported in Table 3, run the following:
$ CUBLAS_WORKSPACE_CONFIG=:4096:8 python3.8 -u start_experiments.py \
$ --dataset {dataset_name} \
$ --model {model_name}
Note that CUBLAS_WORKSPACE_CONFIG=:4096:8
(which may change depending on your configuration) is needed to ensure the complete reproducibility of the experiments (otherwise, PyTorch may run some operations in their non-deterministic version).
The following table provides links to the specific configuration of hyper-parameters we adopted for each graph-based model (derived from the original papers and/or the codes):
Gowalla | Yelp 2018 | Amazon Book | |
---|---|---|---|
NGCF | link | link | link |
DGCF | link | link | link |
LightGCN | link | link | link |
SGL | --- | link | link |
UltraGCN | link | link | link |
GFCF | link | link | link |
In addition to the graph-based models from above, we train and test four classic (and strong) CF baselines. We also provide pointers to their configuration files with the exploration of hyper-parameters, which can be used to reproduce Table 4. We recall that EASER configuration file is not provided at submission time for Amazon Book due to heavy computational costs.
Gowalla | Yelp 2018 | Amazon Book | |
---|---|---|---|
MostPop | link | link | link |
Random | link | link | link |
UserkNN | link | link | link |
ItemkNN | link | link | link |
RP3Beta | link | link | link |
EASER | link | link | --- |
The best hyper-parameters for each classic CF model (as found in our experiments) are reported in the following:
-
Gowalla
- UserkNN:
neighbors: 146.0, similarity: 'cosine'
- ItemkNN:
neighbors: 508.0, similarity: 'dot'
- Rp3Beta:
neighborhood: 777.0, alpha: 0.5663562161452378, beta: 0.001085447926739258, normalize_similarity: True
- EASER:
l2_norm: 15.930101258108873
- UserkNN:
-
Yelp 2018
- UserkNN:
neighbors: 146.0, similarity: 'cosine'
- ItemkNN:
neighbors: 144.0, similarity: 'cosine'
- Rp3Beta:
neighborhood: 342.0, alpha: 0.7681732734954694, beta: 0.4181395996963926, normalize_similarity: True
- EASER:
l2_norm: 212.98774633994572
- UserkNN:
-
Amazon Book
- UserkNN:
neighbors: 146.0, similarity: 'cosine'
- ItemkNN:
neighbors: 125.0, similarity: 'cosine'
- Rp3Beta:
neighborhood: 496.0, alpha: 0.44477903655656115, beta: 0.5968193614337285, normalize_similarity: True
- EASER: N.A.
- UserkNN:
We report the configuration files (with hyper-parameter search spaces) for each model/dataset pair for classic + graph CF baselines and Allrecipes and BookCrossing.
Allrecipes | BookCrossing | |
---|---|---|
MostPop | link | link |
Random | link | link |
UserkNN | link | link |
ItemkNN | link | link |
RP3Beta | link | link |
EASER | link | link |
NGCF | link | link |
DGCF | link | link |
LightGCN | link | link |
SGL | link | link |
UltraGCN | link | link |
GFCF | link | link |
The best hyper-parameters for each classic + graph CF model (as found in our experiments) are reported in the following:
-
Allrecipes
- UserkNN:
neighbors: 863.0, similarity: 'cosine'
- ItemkNN:
neighbors: 508.0, similarity: 'dot'
- RP3Beta:
neighborhood: 777.0, alpha: 0.5663562161452378, beta: 0.001085447926739258, normalize_similarity: True
- EASER:
l2_norm: 555344.9240485814
- NGCF:
lr: 0.0010492631473907471, epochs: 100, factors: 64, batch_size: 128, l_w: 0.08623551848300251, n_layers: 1, weight_size: 64, node_dropout: 0.5704755544541924, message_dropout: 0.37665593943318876, normalize: True
- DGCF:
lr: 0.000313132757493385, epochs: 10, factors: 64, batch_size: 256, l_w_bpr: 3.3519512293075625e-05, l_w_ind: 0.00021537560246909769, n_layers: 2, routing_iterations: 2, intents: 4
- LightGCN:
lr: 0.001, epochs: 10, factors: 64, batch_size: 256, l_w: 0.001288395174690605, n_layers: 4, normalize: True
- SGL:
lr: 0.001, epochs: 10, batch_size: 128, factors: 64, l_w: 1e-4, n_layers: 3, ssl_temp: 0.6492261261178492, ssl_reg: 0.012429441724966553, ssl_ratio: 0.2618285305261178492, sampling: nd
- UltraGCN:
lr: 1e-4, epochs: 240, factors: 64, batch_size: 128, g: 1e-4, l: 0.6421380210212072, w1: 0.026431283275666788, w2: 0.0006086626045670742, w3: 2.3712235041563928e-07, w4: 0.03156224646525972, ii_n_n: 10, n_n: 300, n_w: 300, s_s_p: False, i_w: 1e-4
- GFCF:
svd_factors: 256, alpha: 0.5477395514607551
- UserkNN:
-
BookCrossing
- UserkNN:
neighbors: 360.0, similarity: 'cosine'
- ItemkNN:
neighbors: 125.0, similarity: 'cosine'
- RP3Beta:
neighborhood: 777.0, alpha: 0.5663562161452378, beta: 0.001085447926739258, normalize_similarity: True
- EASER:
l2_norm: 97.97026620421359
- NGCF:
lr: 0.001313040990458504, epochs: 150, factors: 64, batch_size: 128, l_w: 0.007471352712353916, n_layers: 1, weight_size: 64, node_dropout: 0.6222126221705062, message_dropout: 0.2768938386628866, normalize: True
- DGCF:
lr: 0.00033659666428326467, epochs: 112, factors: 64, batch_size: 1024, l_w_bpr: 0.0005015002430942853, l_w_ind: 1.0625908485203885e-05, n_layers: 1, routing_iterations: 2, intents: 4
- LightGCN:
lr: 0.001, epochs: 160, factors: 64, batch_size: 256, l_w: 0.00128839517469060, n_layers: 4, normalize: True
- SGL:
lr: 0.001, epochs: 10, batch_size: 128, factors: 64, l_w: 1e-4, n_layers: 4, ssl_temp: 0.3831504020789032, ssl_reg: 0.14847461762325737, ssl_ratio: 0.18119634034037221, sampling: rw
- UltraGCN:
lr: 1e-4, epochs: 205, factors: 64, batch_size: 128, g: 1e-4, l: 2.1590977284940767, w1: 0.4071845141372458, w2: 2.674735729193082e-06, w3: 0.11655266791027195, w4: 0.05001575677944944, ii_n_n: 10, n_n: 300, n_w: 300, s_s_p: False, i_w: 1e-4
- GFCF:
svd_factors: 64, alpha: 0.4240013631942601
- UserkNN:
For RQ4, you need to generate the tsv files where each user from the training set is assigned one of the four quartiles. To do so, run the script ./quartiles_characteristics.py
, by changing the name of the dataset inside the script accordingly. This will create (for each dataset) 3 tsv files, one for each hop (i.e., 1-hop, 2-hop, 3-hop). In case, we directly provide such files for your convenience in the same folders of Allrecipes and BookCrossing (see above).
Once all models have been trained, and tsv files for the user groups have been downloaded and correctly placed, you may want to generate the recommendation lists ONLY for the best hyper-parameter configuration for each model/dataset pair. This is done by setting the parameter save_recs: True
in each configuration file.
Now, we are all set to calculate the nDCG on each user group. To do so, run the following script:
$ python3.8 -u start_experiments_user_groups.py \
$ --dataset {dataset_name} \
$ --hop {hop_number}