Primary language: Python. License: Apache License 2.0 (Apache-2.0).

OpenHGNN

OpenI Community (Chinese version) | OpenHGNN [CIKM2022] | Space4HGNN [SIGIR2022] | Benchmark&Leaderboard | Slack Channel

OpenHGNN is an open-source toolkit for heterogeneous graph neural networks, built on DGL [Deep Graph Library] and PyTorch. It integrates state-of-the-art (SOTA) models for heterogeneous graphs.

News

2024-07-23 release v0.7

We release the latest version, v0.7.0.

  • New models and datasets
  • Graph Prompt pipeline
  • Data processing framework: dgl.graphbolt
  • New GNN aggregator: dgl.sparse
  • Distributed training

2023-07-17 release v0.5

We release the latest version, v0.5.0.

  • New models and datasets
  • Four new tasks: pretrain, recommendation, graph attacks and defenses, abnorm_event detection
  • TensorBoard visualization
  • Maintenance and test module

2023-02-24 OpenI Excellent Incubation Award

OpenHGNN won the Excellent Incubation Program Award of the OpenI Community! For more details, see https://mp.weixin.qq.com/s/PpbwEdP0-8wG9dsvRvRDaA

2023-02-21 First Prize of CIE

The algorithm library supports the project "Intelligent Analysis Technology and Scale Application of Large Scale Complex Heterogeneous Graph Data", led by BUPT with participation from Ant Group, China Mobile, Haizhi Technology, and others. This project won the first prize of the 2022 Chinese Institute of Electronics "Science and Technology Progress Award".

2023-01-13 release v0.4

We release the latest version, v0.4.

  • New models
  • Pipelines for applications
  • More models supporting mini-batch training
  • Benchmark for million-scale graphs

2022-08-02 paper accepted

Our paper [OpenHGNN: An Open Source Toolkit for Heterogeneous Graph Neural Network](https://dl.acm.org/doi/abs/10.1145/3511808.3557664) was accepted at the CIKM 2022 short paper track.

2022-06-27 release v0.3

We release the latest version, v0.3.

  • New models
  • API usage
  • Simple customization of user-defined datasets and models
  • Visualization tools for heterogeneous graphs

2022-02-28 release v0.2

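We release the latest version, v0.2.

2022-01-07 joined the OpenI Community

OpenI Community users have access to the following:
  • Brand-new Chinese documentation
  • Free computing resources: Cloud Brain usage tutorial
  • The latest OpenHGNN features
    • New models: [KDD 2017] Metapath2vec, [TKDE 2018] HERec, [KDD 2021] HeCo, [KDD 2021] SimpleHGN, [TKDE 2021] HPN, [ICDM 2021] HDE, fastGTN
    • New logging feature
    • New Meituan food-delivery (Meituan Waimai) dataset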

Key Features

  • Easy-to-Use: OpenHGNN provides easy-to-use interfaces for running experiments with the given models and datasets. It also integrates Optuna for hyperparameter optimization.
  • Extensibility: Users can define customized tasks, models, and datasets to apply new models to new scenarios.
  • Efficiency: The DGL backend provides efficient APIs.

Get Started

Requirements and Installation

  • Python >= 3.6

  • PyTorch >= 2.3.0

  • DGL >= 2.2.1

  • CPU or NVIDIA GPU, Linux, Python3

1. Python environment (optional): We recommend using the Conda package manager.

conda create -n openhgnn python=3.6
source activate openhgnn

2. Install PyTorch: Follow the official PyTorch tutorial and run the command that matches your OS and CUDA version. For example:

pip install torch torchvision torchaudio

3. Install DGL: Follow the official DGL tutorial and run the command that matches your OS and CUDA version. For example:

pip install dgl -f https://data.dgl.ai/wheels/repo.html

4. Install openhgnn:

  • Install from PyPI
pip install openhgnn
  • Install from source
git clone https://github.com/BUPT-GAMMA/OpenHGNN
# If you encounter a network error, try cloning from OpenI instead, as follows.
# git clone https://git.openi.org.cn/GAMMALab/OpenHGNN.git
cd OpenHGNN
pip install .

5. Install gdbi (optional):

  • Install gdbi from Git
pip install git+https://github.com/xy-Ji/gdbi.git
  • Install the graph database clients from PyPI
pip install neo4j==5.16.0
pip install nebula3-python==3.4.0
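
Once the steps above are done, the minimum versions listed under Requirements can be checked from Python. The following is a minimal sketch, not part of OpenHGNN: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, the version parsing is deliberately simple, and it assumes Python 3.8 or newer because it relies on `importlib.metadata`.

```python
import re
import sys
from importlib import metadata  # standard library since Python 3.8

# Minimum versions copied from the Requirements list above.
MINIMUMS = {"torch": "2.3.0", "dgl": "2.2.1"}


def parse_version(text):
    """Turn '2.3.0+cu121' into (2, 3, 0); local build suffixes are ignored."""
    numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])[:3]]
    while len(numbers) < 3:  # pad '2.4' to (2, 4, 0)
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment():
    """Report, per package, whether the installed version is new enough."""
    report = {}
    for package, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        status = "ok" if meets_minimum(installed, minimum) else "too old"
        report[package] = f"{installed} ({status}, need >= {minimum})"
    return report


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for package, line in check_environment().items():
        print(package, line)
```

Pre-release and development version strings are compared only on their first three numeric components, which is enough for a quick sanity check but not a full version comparison.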

Running an existing baseline model on an existing benchmark dataset

python main.py -m model_name -d dataset_name -t task_name -g 0 --use_best_config --load_from_pretrained

usage: main.py [-h] [--model MODEL] [--task TASK] [--dataset DATASET] [--gpu GPU] [--use_best_config] [--use_database]

optional arguments:

-h, --help show this help message and exit

--model, -m name of the model

--task, -t name of the task

--dataset, -d name of the dataset

--gpu, -g index of the GPU to use. If you do not have a GPU, set -g -1.

--use_best_config use the best configuration for this model on this dataset. To set different hyperparameters, edit openhgnn.config.ini manually. The best config overrides the parameters in config.ini.

--load_from_pretrained load the model from a default checkpoint

--use_database get the dataset from a database

--mini_batch_flag train the model with mini-batches

--graphbolt use dgl.graphbolt for mini-batch training

--use_distributed train the model in a distributed way

e.g.:

python main.py -m GTN -d imdb4GTN -t node_classification -g 0 --use_best_config

python main.py -m RGCN -d imdb4GTN -t node_classification -g 0 --mini_batch_flag --graphbolt
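
When many model/dataset combinations need to be run, the commands above can be assembled programmatically. The following is a minimal sketch, assuming only the flags documented in this section; `build_command` is a hypothetical helper, not part of OpenHGNN, and it only builds the argument list without running anything.

```python
import shlex


def build_command(model, dataset, task, gpu=-1, use_best_config=False,
                  load_from_pretrained=False, use_database=False,
                  mini_batch_flag=False, graphbolt=False,
                  use_distributed=False):
    """Return the argument list for one main.py run (hypothetical helper)."""
    command = ["python", "main.py", "-m", model, "-d", dataset,
               "-t", task, "-g", str(gpu)]
    # Boolean switches are appended only when enabled, in a fixed order.
    switches = {
        "--use_best_config": use_best_config,
        "--load_from_pretrained": load_from_pretrained,
        "--use_database": use_database,
        "--mini_batch_flag": mini_batch_flag,
        "--graphbolt": graphbolt,
        "--use_distributed": use_distributed,
    }
    for flag, enabled in switches.items():
        if enabled:
            command.append(flag)
    return command


if __name__ == "__main__":
    gtn = build_command("GTN", "imdb4GTN", "node_classification",
                        gpu=0, use_best_config=True)
    print(shlex.join(gtn))
    # python main.py -m GTN -d imdb4GTN -t node_classification -g 0 --use_best_config
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root to launch the run.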

Note: If you are interested in a particular model, refer to the model list below.

Refer to the docs for more basic and in-depth usage.
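
The precedence described for --use_best_config (the best config overrides values from config.ini) can be illustrated with a small sketch. This is not OpenHGNN's actual loading code: the section name, parameter names, values, and the `BEST_CONFIG` table below are all made up for illustration.

```python
import configparser

# Hypothetical config.ini content; real section and key names may differ.
CONFIG_INI = """
[GTN]
learning_rate = 0.005
hidden_dim = 64
"""

# Hypothetical best-config table keyed by model, then dataset.
BEST_CONFIG = {"GTN": {"imdb4GTN": {"learning_rate": "0.01"}}}


def resolve_config(ini_text, model, dataset, use_best_config):
    """Read the model's section, then let best-config values override it."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    params = dict(parser[model]) if parser.has_section(model) else {}
    if use_best_config:
        params.update(BEST_CONFIG.get(model, {}).get(dataset, {}))
    return params


if __name__ == "__main__":
    print(resolve_config(CONFIG_INI, "GTN", "imdb4GTN", use_best_config=False))
    print(resolve_config(CONFIG_INI, "GTN", "imdb4GTN", use_best_config=True))
```

Parameters that the best config does not mention (here `hidden_dim`) keep their value from the ini file.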

Use TensorBoard to visualize your training results

tensorboard --logdir=./openhgnn/output/{model_name}/

e.g.:

tensorboard --logdir=./openhgnn/output/RGCN/

Note: To visualize results, you need to train the model first.
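
Because the log directory only exists after training, a launcher script can check for it before calling TensorBoard. The following is a minimal sketch; `tensorboard_command` is a hypothetical helper, and the default root simply mirrors the `./openhgnn/output/{model_name}/` layout shown above.

```python
from pathlib import Path


def tensorboard_command(model_name, root="./openhgnn/output"):
    """Build the tensorboard command, failing early if nothing was trained."""
    logdir = Path(root) / model_name
    if not logdir.is_dir():
        raise FileNotFoundError(
            f"{logdir} not found; train {model_name} before visualizing")
    return ["tensorboard", f"--logdir={logdir}"]


if __name__ == "__main__":
    try:
        print(" ".join(tensorboard_command("RGCN")))
    except FileNotFoundError as error:
        print(error)
```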

Use gdbi to get a graph dataset

Take Neo4j and the IMDB dataset as an example:

  • Construct CSV files for the dataset (node-level: A.csv, edge-level: A_P.csv).
  • Import the CSV files into the database:
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS row
CREATE (:graphname_labelname {ID: row.ID, ... });
  • Add the user information needed to access the database to the config.py file:
self.graph_address = [graph_address]
self.user_name = [user_name]
self.password = [password]
  • Then run, for example:
python main.py -m MAGNN -d imdb4MAGNN -t node_classification -g 0 --use_best_config --use_database
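
The first step above (building the node-level and edge-level CSV files) can be sketched with the standard `csv` module. This is an illustrative sketch only: the `ID` column follows the `row.ID` reference in the Cypher snippet, while the `START_ID` and `END_ID` edge columns and the `export_graph` helper are assumptions, so check the gdbi documentation for the layout it actually expects.

```python
import csv
from pathlib import Path


def write_csv(path, header, rows):
    """Write a header row followed by data rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def export_graph(out_dir, node_ids, edges):
    """Write A.csv (one node per row) and A_P.csv (one edge per row)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    node_file = out_dir / "A.csv"
    edge_file = out_dir / "A_P.csv"
    write_csv(node_file, ["ID"], [[node_id] for node_id in node_ids])
    # Edge column names are hypothetical placeholders.
    write_csv(edge_file, ["START_ID", "END_ID"], edges)
    return node_file, edge_file


if __name__ == "__main__":
    nodes, edge_rows = [0, 1, 2], [(0, 10), (1, 10), (2, 11)]
    print(export_graph("csv_out", nodes, edge_rows))
```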

Supported models by task

Each model's link gives some basic usage.

Model Node classification Link prediction Recommendation
TransE[NIPS 2013] ✔️
TransH[AAAI 2014] ✔️
TransR[AAAI 2015] ✔️
TransD[ACL 2015] ✔️
Metapath2vec[KDD 2017] ✔️
RGCN[ESWC 2018] ✔️ ✔️
HERec[TKDE 2018] ✔️
HAN[WWW 2019] ✔️ ✔️
KGCN[WWW 2019] ✔️
HetGNN[KDD 2019] ✔️ ✔️
HeGAN[KDD 2019] ✔️
HGAT[EMNLP 2019]
GTN[NeurIPS 2019] & fastGTN ✔️
RSHN[ICDM 2019] ✔️ ✔️
GATNE-T[KDD 2019] ✔️
DMGI[AAAI 2020] ✔️
MAGNN[WWW 2020] ✔️
HGT[WWW 2020] ✔️
CompGCN[ICLR 2020] ✔️ ✔️
NSHE[IJCAI 2020] ✔️
NARS[arxiv] ✔️
MHNF[arxiv] ✔️
HGSL[AAAI 2021] ✔️
HGNN-AC[WWW 2021] ✔️
HeCo[KDD 2021] ✔️
SimpleHGN[KDD 2021] ✔️
HPN[TKDE 2021] ✔️ ✔️
RHGNN[arxiv] ✔️
HDE[ICDM 2021] ✔️
HetSANN[AAAI 2020] ✔️
ieHGCN[TKDE 2021] ✔️
KTN[NIPS 2022] ✔️

Candidate models

Contributors

OpenHGNN Team[GAMMA LAB], DGL Team and Peng Cheng Laboratory.

See more in CONTRIBUTING.

Cite OpenHGNN

If you use OpenHGNN in a scientific publication, we would appreciate citations to the following paper:

@inproceedings{han2022openhgnn,
  title={OpenHGNN: An Open Source Toolkit for Heterogeneous Graph Neural Network},
  author={Hui Han and Tianyu Zhao and Cheng Yang and Hongyi Zhang and Yaoqi Liu and Xiao Wang and Chuan Shi},
  booktitle={CIKM},
  year={2022}
}