The code is written in Python 3. We use PyTorch 1.5.1 with CUDA 10.1 to train our models.
The dependencies of our code are:
- torch (https://pytorch.org/get-started/locally/)
- torch_geometric (https://github.com/rusty1s/pytorch_geometric)
- dgl (https://github.com/dmlc/dgl)
- ogb (https://github.com/snap-stanford/ogb)
- wandb (https://github.com/wandb/client)
- localgraphclustering (https://github.com/kfoynt/LocalGraphClustering)
Please follow the instructions to install these dependencies.
Use the following command to run local clustering for all nodes and save the data:
python preprocess.py --task {task} --dataset {name} --ego_size {ego_size}
The preprocessed data will be saved into ./data/{name}-lc-ego-graphs-{ego_size}.npy
and ./data/{name}-lc-conds-{ego_size}.npy
where task is chosen from [saint
, ogbn
], name is chosen from [flickr
, reddit
, yelp
, amazon
, products
, papers100M
], ego_size denotes the maximum cluster size.
Our code supports four inductive datasets including flickr, reddit, yelp, amazon and two OGB datasets including products, papers100M.
To run LCGNN-Transformer, use:
python train.py --dataset flickr|reddit|yelp|amazon|products|papers100M
To run LCGNN-GCN/GAT/SAGE, use:
python train_gnn.py --dataset flickr|reddit|yelp|amazon|products --model gcn/gat/sage
You can find the best configuration in the best_configs.py
.