This repository provides the implementation of the MI-GNN model, the source code for the SIGIR 2021 paper "Meta-Inductive Node Classification across Graphs": https://dl.acm.org/doi/pdf/10.1145/3404835.3462915
The repository is organised as follows:
- dataset/: contains 5 benchmark datasets: Flickr, Yelp, Cuneiform, COX2 and DHFR. Note: all five datasets are already processed and can be used directly once the repository has been downloaded.
- sub_data.py: Creates the social sub-graph data "Sub_Flickr" or "Sub_Yelp" (800 small ego-networks extracted from an online image-sharing social network and an online user-review network). If you want to use "Sub_Flickr" or "Sub_Yelp" as the dataset and do not have the 'data' directory (the processed data), run this file before running main.py. Type the string 'Flickr' as the input to build "Sub_Flickr", or the string 'Yelp' to build "Sub_Yelp".
- tudata.py, mydataset.py: Data preprocessing for Cuneiform, COX2 and DHFR.
- main.py: The main entry point of the model. You can change the dataset name, the L2 coefficient (l2_coef = 1 for Sub_Flickr and l2_coef = 0.001 for the other datasets) and task_lr (0.5 for Cuneiform, Sub_Flickr and Sub_Yelp; 0.005 for the others) in lines 250--260.
- models/: contains our model.
- learner_1.py: The two GNN layers of SGC with neighborhood aggregation, used when the aggregation has not been performed beforehand.
- learner_2.py: The two GNN layers of SGC without neighborhood aggregation, used when the aggregation has already been performed beforehand.
- chemical.py, scaling_sgc.py, translation_sgc.py: The scaling and shifting transformations.
- earlystopping.py: The early-stopping function.
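As a quick reference, the per-dataset settings listed for main.py can be written as a small lookup. This is only a sketch: the helper name `suggested_hyperparameters` and the `DATASETS` tuple are hypothetical and not part of the repository, and the values still have to be set by hand in main.py.

```python
# Hypothetical helper, not part of the repository. It only restates the
# per-dataset values that this README lists for main.py.
DATASETS = ("Sub_Flickr", "Sub_Yelp", "Cuneiform", "COX2", "DHFR")


def suggested_hyperparameters(dataset: str) -> dict:
    """Return the l2_coef and task_lr values listed for a dataset."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    # l2_coef is 1 for Sub_Flickr and 0.001 for every other dataset.
    l2_coef = 1.0 if dataset == "Sub_Flickr" else 0.001
    # task_lr is 0.5 for Cuneiform, Sub_Flickr and Sub_Yelp, 0.005 otherwise.
    high_lr = ("Cuneiform", "Sub_Flickr", "Sub_Yelp")
    task_lr = 0.5 if dataset in high_lr else 0.005
    return {"l2_coef": l2_coef, "task_lr": task_lr}
```

For example, `suggested_hyperparameters("COX2")` gives an `l2_coef` of 0.001 and a `task_lr` of 0.005.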
To install requirements:
pip install -r requirements.txt
We use 5 benchmark datasets: Flickr, Yelp, Cuneiform, COX2 and DHFR. You can also prepare your own datasets. The data format should be as follows:
- (1) Each small graph should be a torch_geometric.data.Data object. For details, see: https://pytorch-geometric.readthedocs.io/en/latest/modules/data.html#
- (2) For the whole dataset of graphs, it can be the form of python list, i.e., "[ ]".
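A minimal sketch of this format is shown below, assuming only the standard `x`, `edge_index` and `y` fields of torch_geometric.data.Data. The function name `build_toy_dataset` and the toy values are hypothetical, and any extra attributes that main.py may expect on each graph are not covered here.

```python
def build_toy_dataset():
    """Return a Python list of small graphs, each a torch_geometric.data.Data."""
    # Imported inside the function so that merely defining it does not
    # require PyTorch or PyTorch Geometric to be installed.
    import torch
    from torch_geometric.data import Data

    # Graph 1: 3 nodes on a path 0-1-2, 2 features and 1 label per node.
    # Each undirected edge is stored as two directed edges.
    g1 = Data(
        x=torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
        edge_index=torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]], dtype=torch.long),
        y=torch.tensor([0, 1, 0]),
    )
    # Graph 2: 2 nodes joined by a single undirected edge.
    g2 = Data(
        x=torch.tensor([[0.5, 0.5], [0.2, 0.8]]),
        edge_index=torch.tensor([[0, 1], [1, 0]], dtype=torch.long),
        y=torch.tensor([1, 1]),
    )
    # The whole dataset is a plain Python list of Data objects.
    return [g1, g2]
```

Calling `build_toy_dataset()` returns a two-element list whose items can be inspected with attributes such as `num_nodes` and `num_edges`.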
To train and test the model in the paper, note that:
(1) the following single file handles the split of all the data into training and testing sets;
(2) running this single file is all that is needed, covering data preprocessing, training and testing:
python main.py
title = {Meta-Inductive Node Classification across Graphs},
author = {Wen, Zhihao and Fang, Yuan and Liu, Zemin},
booktitle = {Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year = {2021}