Augment to Interpret
This repository contains the official implementation to reproduce the results of the paper Augment to Interpret: Unsupervised and Inherently Interpretable Graph Embeddings, by G. Scafarto, M. Ciortan, S. Tihon and Q. Ferré.
How to install
This repository uses Docker to run. Install Docker if necessary.
If your computer has enough RAM to handle the Mutag dataset, change the LOW_RESOURCES constant in tests/utils/constants.py from True to False.
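The constant can also be changed from the command line instead of a text editor. The helper below is only a sketch: the function name disable_low_resources is hypothetical, and it assumes the constant is written on its own line as LOW_RESOURCES = True, which should be checked against the actual file first.

```shell
# Hypothetical helper (not part of the repository): set LOW_RESOURCES to False
# in the constants file given as first argument.
# Assumption: the file contains a line starting with "LOW_RESOURCES = True".
disable_low_resources() {
  file="$1"
  tmp="$(mktemp)"
  # Write the edited copy to a temporary file, then copy it back so that the
  # original file keeps its permissions. Lines that do not match are untouched.
  sed 's/^LOW_RESOURCES *= *True/LOW_RESOURCES = False/' "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

From the root of the repository, it would be called as disable_low_resources tests/utils/constants.py. If the line is written differently in the file, nothing is changed and the constant has to be edited by hand.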
Then, in a terminal located at the root of the repository, enter the following commands:
make docker_build # create the docker image
make docker_run # for gpu, run the docker image
# make docker_run_cpu # for cpu, run the docker image
make download_data # download Mutag and MNIST datasets
pytest tests # download and/or create all the datasets with the same seed as the authors
# It also tests the installation.
How to run
To reproduce all our results at once, you can enter the command make run_all in the running Docker container. This may take weeks.
To run an individual experiment, you can use the following command:
python scripts/main.py --help # display some help about arguments to the function
python scripts/main.py \
--dataset ba_2motifs \
--loss simclr_double_aug_info_negative \
--seed 0 \
--model_name GIN \
--epochs 150 \
# etc. # train and save one model
make run_post_hoc # run the analysis of the results
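To repeat the individual experiment above over several seeds, a small loop can help. The following is a sketch under assumptions: the function name train_seeds and its echo/env prefix convention are our own illustration and not part of the repository, and only the flags shown above are passed (the "etc." still stands for any further arguments you may need).

```shell
# Hypothetical helper (not part of the repository): run the training command
# shown above once per seed.
# First argument: a command prefix. "echo" prints the commands (dry run),
# "env" actually executes them. Remaining arguments: the seeds to use.
train_seeds() {
  prefix="$1"
  shift
  for seed in "$@"; do
    "$prefix" python scripts/main.py \
      --dataset ba_2motifs \
      --loss simclr_double_aug_info_negative \
      --seed "$seed" \
      --model_name GIN \
      --epochs 150 || return 1   # stop at the first failing run
  done
}

# Dry run: print the three commands without training anything.
train_seeds echo 0 1 2
```

Once the printed commands look right, train_seeds env 0 1 2 would run them inside the Docker container, one after the other.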
Structure of the repository
The repository is structured as follows:
augment_to_interpret contains most of the code, coded as a library. It can be installed using pip thanks to the file setup.py. This will however not install the dependencies. We recommend using Docker to run our code, as described earlier.
scripts contains all the entry-points to the code.
tests contains Pytest tests (and corresponding files) to ensure the validity of the code. The tests do not cover all functions of the code, but should be enough to detect most installation problems.
external_src contains external repositories that have been integrated in our pipeline. These repositories have been slightly modified to ease integration.
files contains most non-code files, such as data files, config files, result files, and so on.