This repository contains the source code for the models of the Shortest Path Message Passing Neural Network (SP-MPNN) framework, presented in the LoG 2022 paper "Shortest Path Networks for Graph Property Prediction". The repository includes the h-Prox datasets, all evaluation datasets, and the code for training and testing the different SP-MPNN variants across all experiments in the paper.
The requirements for the Python environment can be found in requirements.txt. The main packages used are PyTorch, PyTorch Geometric, OGB (Open Graph Benchmark), Neptune (neptune.ai) and NumPy.
Running on a GPU is supported via CUDA.
Available in PyTorch Geometric.
The synthetic Prox datasets (k = {1, 3, 5, 8, 10}) are available here.
To use these datasets, extract the contents of the zip file and move them into the data directory.
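The extraction step can be sketched in Python as follows. This is a minimal sketch, assuming the archive was downloaded as `prox.zip` (a placeholder name, not the actual file name) and that the `data` directory is relative to the current working directory; the helper `extract_into_data` is hypothetical and not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_into_data(archive_path, data_dir="data"):
    """Extract every member of a zip archive into data_dir and return their names."""
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)  # create data/ if it is missing
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # "prox.zip" is a placeholder; use the name of the file you downloaded.
    if Path("prox.zip").exists():
        for name in extract_into_data("prox.zip"):
            print(name)
```

If the archive wraps its contents in a top-level folder, the files may need to be moved one level up afterwards so that they sit directly inside `data`.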
We use the OGB implementation of the MoleculeNet datasets. More information can be found here.
The QM9 dataset is provided in data/QM9.
The script we use to run the experiments is src/main.py. Note that the script should be run from inside the src directory, or with src marked as the Source Root.
The main parameters of the script are:
--dataset: the dataset to use.
--model: the model to use. The main convention is SP-{INNER}-{OUTER}, where INNER corresponds to the approach we use for aggregating the embeddings on each hop level, while OUTER is the approach we use for aggregating the different hop levels. The main models we use are SP-SUM-WEIGHT, SP-EDGESUM-WEIGHT and SP-RSUM-WEIGHT, where the common WEIGHT outer aggregation is the normalised sum used by the simple model in the paper (SPN).
--mode: the current task type. We use gc for Graph Classification, and gr for Graph Regression.
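To make the SP-{INNER}-{OUTER} naming convention concrete, here is a small illustrative helper that splits a model name into its two parts. It is a sketch only: `parse_model_name` is a hypothetical function, not the way main.py handles the `--model` argument, and it assumes neither part contains a hyphen.

```python
def parse_model_name(name):
    """Split a model name of the form SP-{INNER}-{OUTER} into (inner, outer)."""
    parts = name.split("-")
    if len(parts) != 3 or parts[0] != "SP":
        raise ValueError(f"expected SP-{{INNER}}-{{OUTER}}, got {name!r}")
    _, inner, outer = parts
    return inner, outer


# The three main models named in this README share the WEIGHT outer aggregation.
for model in ("SP-SUM-WEIGHT", "SP-EDGESUM-WEIGHT", "SP-RSUM-WEIGHT"):
    inner, outer = parse_model_name(model)
    print(f"{model}: inner={inner}, outer={outer}")
```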
Additionally, some of the more useful configurable parameters are:
--emb_dim: the embedding dimensionality.
--batch_size: the batch size we use during training.
--lr: the learning rate.
--dropout: the dropout probability.
--epochs: the number of epochs.
A detailed list of all additional arguments can be seen using the following command:
python main.py -h
You can run the end-to-end experiments for the datasets from each class using the commands below, where the placeholders stand for:
{dataset}: the dataset to run the experiment on.
{k}: the maximum distance for an SP layer.
{L}: the number of layers.
{QM9_TASK}: an integer id for the QM9 objective / task to predict.
python main.py -d {dataset} -m SP-SUM-WEIGHT --max_distance {k} --num_layers {L} --mode gc
python main.py -d {dataset} -m SP-EDGESUM-WEIGHT --max_distance {k} --num_layers {L} --mode gc
python main.py -d QM9 -m SP-RSUM-WEIGHT --max_distance {k} --num_layers {L} --specific_task {QM9_TASK} --mode gr
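When running several configurations, the commands above can be generated programmatically. The following is a minimal sketch that only uses the flags shown in this README; the helper `build_command` is hypothetical, and the grid values for `k`, the number of layers and the QM9 task id are placeholders, not the settings used in the paper.

```python
import itertools
import shlex


def build_command(dataset, model, k, num_layers, mode, qm9_task=None):
    """Return the argument list for one run of main.py."""
    cmd = [
        "python", "main.py",
        "-d", dataset,
        "-m", model,
        "--max_distance", str(k),
        "--num_layers", str(num_layers),
    ]
    if qm9_task is not None:
        # Only the QM9 command takes a task id.
        cmd += ["--specific_task", str(qm9_task)]
    cmd += ["--mode", mode]
    return cmd


# Placeholder grid: two values of k and two layer counts, QM9 task id 0.
for k, num_layers in itertools.product([1, 5], [2, 4]):
    cmd = build_command("QM9", "SP-RSUM-WEIGHT", k, num_layers, "gr", qm9_task=0)
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, cwd="src", check=True)` so that the script runs from inside the src directory.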
You can use neptune.ai to track the progress by specifying your project and token in src/config.ini. Leave the fields as ... if you want to just run locally.
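The "leave the fields as ..." convention can be checked with Python's standard `configparser`. This is a sketch under stated assumptions: the section name `neptune` and the keys `project` and `token` are hypothetical, since the actual layout of src/config.ini is not described here, and `neptune_enabled` is not a function from the repository.

```python
import configparser

PLACEHOLDER = "..."  # value that means "run locally", per this README


def neptune_enabled(config_text, section="neptune", keys=("project", "token")):
    """Return True only if every key in the section holds a non-placeholder value."""
    # interpolation=None keeps tokens containing '%' from being misread.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section(section):
        return False
    values = [parser.get(section, key, fallback=PLACEHOLDER).strip() for key in keys]
    return all(value and value != PLACEHOLDER for value in values)


# Hypothetical file contents with both fields left as "...".
example = "[neptune]\nproject = ...\ntoken = ...\n"
print(neptune_enabled(example))
```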
If you make use of this code, or its accompanying paper, please cite this work as follows:
@inproceedings{ADC-LoG2022,
title={Shortest Path Networks for Graph Property Prediction},
author = {Ralph Abboud and Radoslav Dimitrov and
{\.I}smail {\.I}lkan Ceylan},
booktitle={Proceedings of the First Learning on Graphs Conference ({LoG})},
year={2022},
note={Oral presentation}
}