This is the repository of our RecSys 2019 article "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches". The full text of the article is available here; the source code of our experiments and the full results are available here.
We are still actively pursuing this research direction in evaluation and reproducibility, and we are open to collaboration with other researchers. Follow our project on ResearchGate!
Please cite our article if you use this repository or our implementations of the baseline algorithms. Remember to also cite the original authors if you use our port of the DL algorithms.
@Article{Ferraridacrema2019,
author={Ferrari Dacrema, Maurizio
and Cremonesi, Paolo
and Jannach, Dietmar},
title={Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches},
journal={Proceedings of the 13th ACM Conference on Recommender Systems (RecSys 2019)},
year={2019},
doi={10.1145/3298689.3347058},
Eprint={arXiv:1907.06902},
note={Source: \url{https://github.com/MaurizioFD/RecSys2019_DeepLearning_Evaluation}},
}
The full results and corresponding hyperparameters for all DL algorithms are accessible HERE. For information on the requirements and how to install this repository, see the following Installation section.
This repository is organized in several subfolders.
The Deep Learning algorithms are all contained in the Conferences folder, which is further divided by the conference they were published in. For each DL algorithm the repository contains two subfolders:
- A folder named "_github" which contains the full original repository, with the minor fixes needed for the code to run.
- A folder named "_our_interface" which contains the Python wrappers needed to test the algorithm in our framework. The main class for that algorithm has the "Wrapper" suffix in its name. This folder also contains the functions needed to read and split the data in the appropriate way.
Note that in some cases the original repository also contained the data split used by the original authors; that split is included as well.
Other folders like KNN and GraphBased contain all the baseline algorithms we have used in our experiments.
The folder Base.Evaluation contains the two evaluator objects (EvaluatorHoldout, EvaluatorNegativeSample) which compute all the metrics we report.
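As a rough illustration of how the two evaluation protocols can differ, the sketch below assumes that a holdout evaluation ranks the test item against all items, while a negative-sample evaluation ranks it only against a small set of sampled negatives. The function names, scores and item ids are hypothetical; this is not the API of EvaluatorHoldout or EvaluatorNegativeSample.

```python
def top_k(scores, candidates, k):
    """Return the k candidates with the highest score (ties broken by item id)."""
    return sorted(candidates, key=lambda item: (-scores[item], item))[:k]


def hit_at_k(scores, test_item, candidates, k):
    """Return True if the held-out test item is ranked within the top k candidates."""
    return test_item in top_k(scores, candidates, k)


# Hypothetical predicted scores of one user for five items.
scores = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.6, 4: 0.1}
test_item = 3

# Holdout-style ranking: the test item competes with every item.
holdout_hit = hit_at_k(scores, test_item, set(scores), k=2)

# Negative-sample-style ranking: the test item competes only with sampled negatives.
sampled_negatives = {2, 4}
sampled_hit = hit_at_k(scores, test_item, sampled_negatives | {test_item}, k=2)
```

With these toy scores the same model misses the test item under the full ranking but hits it under the sampled ranking, which is why results from the two protocols are not directly comparable.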
The data used for each experiment is gathered by specific DataReader objects within each DL algorithm's folder. These will load the original data split, if available. If not, they will automatically download the dataset and perform the split with the appropriate methodology. If the dataset cannot be downloaded automatically, a console message will display the link from which the dataset can be manually downloaded and instructions on where the user should save the compressed file.
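The fallback order described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the actual DataReader code: the function load_or_build_split, its parameters and the use of OSError to signal a failed download are all hypothetical.

```python
import os


def load_or_build_split(split_path, download_fn, split_fn, manual_url):
    """Return (source, data), preferring an existing split over a new one."""
    if os.path.exists(split_path):
        # 1. The original split is available: use it as it is.
        return "original", split_path
    try:
        # 2. Otherwise try to download the raw dataset automatically.
        raw_data = download_fn()
    except OSError:
        # 3. Download not possible: tell the user where to get the file.
        print("Unable to download the dataset automatically. "
              "Please download it manually from: {}".format(manual_url))
        raise
    # Split the downloaded data with the appropriate methodology.
    return "generated", split_fn(raw_data)
```

A caller would pass the dataset-specific download and split functions, so the same fallback order can be reused for every dataset.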
The folder Data_manager contains a number of DataReader objects, each associated with a specific dataset, which are used to read datasets for which we did not have the original split.
Whenever a new dataset is parsed, the preprocessed data is saved in a new folder called Data_manager_split_datasets, which contains a subfolder for each dataset and then a subfolder for each conference.
The ParameterTuning folder contains all the code required to tune the hyperparameters of the baselines. The script run_parameter_search contains the fixed hyperparameter search space used in all our experiments. The object SearchBayesianSkopt performs the hyperparameter optimization for a given recommender instance and hyperparameter space, saving each explored configuration and the corresponding recommendation quality.
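To make the idea of recording every explored configuration concrete, here is a minimal sketch of a search loop. Note that it uses plain random search as a stand-in, not the Bayesian optimization performed by SearchBayesianSkopt, and that the function name, its parameters and the example search space are hypothetical.

```python
import random


def search_hyperparameters(evaluate_fn, search_space, n_cases, seed=0):
    """Random search: sample configurations, record each with its quality, return the best."""
    rng = random.Random(seed)
    explored = []
    for _ in range(n_cases):
        # Sample one value per hyperparameter from the fixed search space.
        config = {name: rng.choice(values) for name, values in search_space.items()}
        # Higher quality is assumed to be better (e.g. a ranking accuracy metric).
        quality = evaluate_fn(config)
        explored.append((config, quality))
    best_config, best_quality = max(explored, key=lambda pair: pair[1])
    return best_config, best_quality, explored


# Hypothetical search space and a toy quality function for illustration only.
space = {"topK": [5, 10, 50, 100], "shrink": [0, 10, 100]}
best_config, best_quality, explored = search_hyperparameters(
    lambda c: -abs(c["topK"] - 50) - c["shrink"], space, n_cases=20
)
```

A Bayesian optimizer would replace the random sampling step with a model-guided proposal, but the bookkeeping of explored configurations and their quality stays the same.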
See the following Installation section for information on how to install this repository. After the installation is complete you can run the experiments.
All experiments related to a DL algorithm reported in our paper can be executed by running the corresponding script, whose name starts with run_ followed by the conference name and the year of publication. For example, to run the experiments for SpectralCF, run this command:
python run_RecSys_18_SpectralCF.py
The script will:
- Load and split the data.
- Run the Bayesian hyperparameter optimization on all baselines, saving the best values found.
- Run the fit and test of the DL algorithm.
- Create the LaTeX code of the result tables, as well as plot the data splits, when required.
The results can be accessed in the result_experiments folder.
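The script naming convention can be sketched as a small helper. The pattern with a two-digit year and a trailing algorithm name is an assumption inferred from the single example run_RecSys_18_SpectralCF.py, and the function experiment_script_name is hypothetical, not part of the repository.

```python
import sys


def experiment_script_name(conference, year, algorithm):
    """Build a script name like run_RecSys_18_SpectralCF.py (assumed pattern)."""
    return "run_{}_{:02d}_{}.py".format(conference, year % 100, algorithm)


script = experiment_script_name("RecSys", 2018, "SpectralCF")

# Command that would launch the experiment with the active interpreter,
# for example via subprocess.run(command, check=True).
command = [sys.executable, script]
```

Check the actual file names in the repository root before relying on this pattern for other algorithms.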
Note that this repository requires Python 3.6.
We suggest you first create an environment for this project using virtualenv (or another tool like conda).
Check out this repository, then enter the repository folder and run these commands to create and activate a new environment:
If you are using virtualenv:
virtualenv -p python3 DLevaluation
source DLevaluation/bin/activate
If you are using conda:
conda create -n DLevaluation python=3.6 anaconda
source activate DLevaluation
Then install all the requirements and dependencies:
pip install -r requirements.txt
To compile the Cython code you must have gcc and python3-dev installed, which can be done with the following commands:
sudo apt install gcc
sudo apt-get install python3-dev
At this point you can compile all Cython algorithms by running the following command. The script will compile within the current active environment. The code has been developed for Linux and Windows platforms. During the compilation you may see some warnings.
python run_compile_all_cython.py
In addition to the repository dependencies, KDD CollaborativeDL also requires the Matlab engine, because the algorithm is developed in Matlab. To install the engine you can use a script provided directly with your Matlab distribution, as described in the Matlab Documentation. The algorithm also requires a GSL distribution, whose installation folder can be provided as a parameter in the fit function of our Python wrapper. Please refer to the original CollaborativeDL README for all installation details.