This is the source code of the WSDM'23 paper "GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection".
This code requires the following:
- Python==3.9
- PyTorch==1.11.0
- PyTorch Geometric==2.0.4
- Numpy==1.21.2
- Scikit-learn==1.0.2
- OGB==1.3.3
- NetworkX==2.7.1
- FAISS-GPU==1.7.2
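Before running anything, it can help to confirm that the installed packages match the versions pinned above. The following is a minimal sketch, not part of the repository: the helper name `check_versions` is hypothetical, and the PyPI distribution names used as keys (for example `torch` for PyTorch and `torch-geometric` for PyTorch Geometric) are assumptions about how each requirement is packaged.

```python
from importlib import metadata

# Versions copied from the requirements list above. The keys are the PyPI
# distribution names we ASSUME correspond to each entry.
PINNED = {
    "torch": "1.11.0",
    "torch-geometric": "2.0.4",
    "numpy": "1.21.2",
    "scikit-learn": "1.0.2",
    "ogb": "1.3.3",
    "networkx": "2.7.1",
    "faiss-gpu": "1.7.2",
}


def check_versions(pinned=PINNED):
    """Return {name: (expected, found)} for every package that does not match.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build tags such as "+cu113" when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_versions().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package was found at the expected version. The Python version itself (3.9) is not checked here.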
To reproduce an experiment, run the script for the corresponding task and dataset. For instance:
- Run out-of-distribution detection on BZR (ID) and COX2 (OOD) datasets:
bash script/oodd_BZR+COX2.sh
- Run anomaly detection on the PROTEINS_full dataset:
bash script/ad_PROTEINS_full.sh
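To launch several experiments in a row, the script paths can be built programmatically. The sketch below assumes the naming pattern suggested by the two examples above (`script/oodd_<ID>+<OOD>.sh` and `script/ad_<dataset>.sh`). That pattern is inferred, not documented, and the helper names `oodd_script`, `ad_script`, and `run` are hypothetical.

```python
import subprocess
from pathlib import Path

# Directory used by the two example commands above.
SCRIPT_DIR = Path("script")


def oodd_script(id_dataset, ood_dataset):
    """Path of the OOD detection script for an (ID, OOD) pair (assumed pattern)."""
    return SCRIPT_DIR / f"oodd_{id_dataset}+{ood_dataset}.sh"


def ad_script(dataset):
    """Path of the anomaly detection script for one dataset (assumed pattern)."""
    return SCRIPT_DIR / f"ad_{dataset}.sh"


def run(script):
    """Run one script with bash, failing early if the file does not exist."""
    if not script.is_file():
        raise FileNotFoundError(script)
    subprocess.run(["bash", str(script)], check=True)
```

For example, `run(oodd_script("BZR", "COX2"))` would be equivalent to the first command above when called from the repository root. Check the `script` directory for the exact file names before relying on the pattern.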
The statistics of each dataset pair in our benchmark are provided as follows.
No. | ID dataset | ID # Graph (Train/Test) | ID # Node (avg.) | ID # Edge (avg.) | OOD dataset | OOD # Graph (Test) | OOD # Node (avg.) | OOD # Edge (avg.) |
1 | BZR | 364/41 | 35.8 | 38.4 | COX2 | 41 | 41.2 | 43.5 |
2 | PTC-MR | 309/35 | 14.3 | 14.7 | MUTAG | 35 | 17.9 | 19.8 |
3 | AIDS | 1,800/200 | 15.7 | 16.2 | DHFR | 200 | 42.4 | 44.5 |
4 | ENZYMES | 540/60 | 32.6 | 62.1 | PROTEIN | 60 | 39.1 | 72.8 |
5 | IMDB-B | 1,350/150 | 19.8 | 96.5 | IMDB-M | 150 | 13.0 | 65.9 |
6 | Tox21 | 7,047/784 | 18.6 | 19.3 | SIDER | 784 | 33.6 | 35.4 |
7 | FreeSolv | 577/65 | 8.7 | 8.4 | ToxCast | 65 | 18.8 | 19.3 |
8 | BBBP | 1,835/204 | 24.1 | 26.0 | BACE | 204 | 34.1 | 36.9 |
9 | ClinTox | 1,329/148 | 26.2 | 27.9 | LIPO | 148 | 27.0 | 29.5 |
10 | Esol | 1,015/113 | 13.3 | 13.7 | MUV | 113 | 24.2 | 26.3 |
The statistics of each dataset in the anomaly detection experiments are provided as follows.
Dataset | # Graph (Train/Test) | # Node (avg.) | # Edge (avg.) |
PROTEINS-full | 360/223 | 39.1 | 72.8 |
ENZYMES | 400/120 | 32.6 | 62.1 |
AIDS | 1280/400 | 15.7 | 16.2 |
DHFR | 368/152 | 42.4 | 44.5 |
BZR | 69/81 | 35.8 | 38.4 |
COX2 | 81/94 | 41.2 | 43.5 |
DD | 390/236 | 284.3 | 715.7 |
NCI1 | 1646/822 | 29.8 | 32.3 |
IMDB-B | 400/200 | 19.8 | 96.5 |
REDDIT-B | 800/400 | 429.6 | 497.8 |
COLLAB | 1920/1000 | 74.5 | 2457.8 |
HSE | 423/267 | 16.9 | 17.2 |
MMP | 6170/238 | 17.6 | 18.0 |
p53 | 8088/269 | 17.9 | 18.3 |
PPAR-gamma | 219/267 | 17.4 | 17.7 |
For the sake of efficiency, we set the structural encoding dimensions
We conduct the experiments on a Linux server with an Intel Xeon Gold 6226R CPU and two Tesla V100S GPUs. We implement our method with PyTorch 1.11.0 and PyTorch Geometric 2.0.4.
If you compare with, build on, or use aspects of this work, please cite the following:
@inproceedings{liu2023goodd,
title={GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection},
author={Liu, Yixin and Ding, Kaize and Liu, Huan and Pan, Shirui},
booktitle={Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
year={2023}
}