This DGL example implements the paper: GraphSAINT: Graph Sampling Based Inductive Learning Method.
Paper link: https://arxiv.org/abs/1907.04931
Author's code: https://github.com/GraphSAINT/GraphSAINT
Contributor: Liu Tang (@lt610)
- Python 3.7.0
- PyTorch 1.6.0
- NumPy 1.19.2
- Scikit-learn 0.23.2
- DGL 0.5.3
All datasets used are provided by Author's code. They are available in Google Drive (alternatively, Baidu Wangpan (code: f1ao)). Once you download the datasets, you need to rename graphsaintdata to data. Dataset summary("m" stands for multi-label classification, and "s" for single-label.):
Dataset |
Nodes |
Edges |
Degree |
Feature |
Classes |
Train/Val/Test |
PPI |
14,755 |
225,270 |
15 |
50 |
121(m) |
0.66/0.12/0.22 |
Flickr |
89,250 |
899,756 |
10 |
500 |
7(s) |
0.50/0.25/0.25 |
Reddit |
232,965 |
11,606,919 |
50 |
602 |
41(s) |
0.66/0.10/0.24 |
Yelp |
716,847 |
6,877,410 |
10 |
300 |
100(m) |
0.75/0.10/0.15 |
Amazon |
1,598,960 |
132,169,734 |
83 |
200 |
107(m) |
0.85/0.05/0.10 |
Run with following:
python train_sampling.py --gpu 0 --dataset ppi --sampler node --node-budget 6000 --num-repeat 50 --n-epochs 1000 --n-hidden 512 --arch 1-0-1-0
python train_sampling.py --gpu 0 --dataset ppi --sampler edge --edge-budget 4000 --num-repeat 50 --n-epochs 1000 --n-hidden 512 --arch 1-0-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset ppi --sampler rw --num-roots 3000 --length 2 --num-repeat 50 --n-epochs 1000 --n-hidden 512 --arch 1-0-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset flickr --sampler node --node-budget 8000 --num-repeat 25 --n-epochs 30 --n-hidden 256 --arch 1-1-0 --dropout 0.2
python train_sampling.py --gpu 0 --dataset flickr --sampler edge --edge-budget 6000 --num-repeat 25 --n-epochs 15 --n-hidden 256 --arch 1-1-0 --dropout 0.2
python train_sampling.py --gpu 0 --dataset flickr --sampler rw --num-roots 6000 --length 2 --num-repeat 25 --n-epochs 15 --n-hidden 256 --arch 1-1-0 --dropout 0.2
python train_sampling.py --gpu 0 --dataset reddit --sampler node --node-budget 8000 --num-repeat 50 --n-epochs 40 --n-hidden 128 --arch 1-0-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset reddit --sampler edge --edge-budget 6000 --num-repeat 50 --n-epochs 40 --n-hidden 128 --arch 1-0-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset reddit --sampler rw --num-roots 2000 --length 4 --num-repeat 50 --n-epochs 30 --n-hidden 128 --arch 1-0-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset yelp --sampler node --node-budget 5000 --num-repeat 50 --n-epochs 50 --n-hidden 512 --arch 1-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset yelp --sampler edge --edge-budget 2500 --num-repeat 50 --n-epochs 100 --n-hidden 512 --arch 1-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset yelp --sampler rw --num-roots 1250 --length 2 --num-repeat 50 --n-epochs 75 --n-hidden 512 --arch 1-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset amazon --sampler node --node-budget 4500 --num-repeat 50 --n-epochs 30 --n-hidden 512 --arch 1-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset amazon --sampler edge --edge-budget 2000 --num-repeat 50 --n-epochs 30 --n-hidden 512 --arch 1-1-0 --dropout 0.1
python train_sampling.py --gpu 0 --dataset amazon --sampler rw --num-roots 1500 --length 2 --num-repeat 50 --n-epochs 30 --n-hidden 512 --arch 1-1-0 --dropout 0.1
- Paper: results from the paper
- Running: results from experiments with the authors' code
- DGL: results from experiments with the DGL example
Method |
PPI |
Flickr |
Reddit |
Yelp |
Amazon |
Paper |
0.960±0.001 |
0.507±0.001 |
0.962±0.001 |
0.641±0.000 |
0.782±0.004 |
Running |
0.9628 |
0.5077 |
0.9622 |
0.6393 |
0.7695 |
DGL |
0.9618 |
0.4828 |
0.9621 |
0.6360 |
0.7748 |
Method |
PPI |
Flickr |
Reddit |
Yelp |
Amazon |
Paper |
0.981±0.007 |
0.510±0.002 |
0.966±0.001 |
0.653±0.003 |
0.807±0.001 |
Running |
0.9810 |
0.5066 |
0.9656 |
0.6531 |
0.8071 |
DGL |
0.9818 |
0.5054 |
0.9653 |
0.6517 |
exceed |
Method |
PPI |
Flickr |
Reddit |
Yelp |
Amazon |
Paper |
0.981±0.004 |
0.511±0.001 |
0.966±0.001 |
0.653±0.003 |
0.815±0.001 |
Running |
0.9812 |
0.5104 |
0.9648 |
0.6527 |
0.8131 |
DGL |
0.9818 |
0.5018 |
0.9649 |
0.6516 |
0.8150 |
Method |
PPI |
Flickr |
Reddit |
Yelp |
Amazon |
Sampling(Running) |
0.77 |
0.65 |
7.46 |
26.29 |
571.42 |
Sampling(DGL) |
0.24 |
0.57 |
5.06 |
30.04 |
163.75 |
Normalization(Running) |
0.69 |
2.84 |
11.54 |
32.72 |
407.20 |
Normalization(DGL) |
1.04 |
0.41 |
21.05 |
68.63 |
2006.94 |
Method |
PPI |
Flickr |
Reddit |
Yelp |
Amazon |
Sampling(Running) |
0.72 |
0.56 |
4.46 |
12.38 |
101.76 |
Sampling(DGL) |
0.50 |
0.72 |
53.88 |
254.63 |
exceed |
Normalization(Running) |
0.68 |
2.62 |
9.42 |
26.64 |
62.59 |
Normalization(DGL) |
0.61 |
0.38 |
14.69 |
23.63 |
exceed |
Method |
PPI |
Flickr |
Reddit |
Yelp |
Amazon |
Sampling(Running) |
0.83 |
1.22 |
6.69 |
18.84 |
209.83 |
Sampling(DGL) |
0.28 |
0.63 |
4.02 |
22.01 |
55.09 |
Normalization(Running) |
0.87 |
2.60 |
10.28 |
24.41 |
145.85 |
Normalization(DGL) |
0.70 |
0.42 |
18.34 |
32.16 |
683.96 |