NeuralKG-ind

Primary language: Python. License: Apache-2.0.


A Python Library for Inductive Knowledge Graph Representation Learning


NeuralKG-ind is a Python-based library for inductive knowledge graph representation learning. It includes standardized processes, a rich set of existing methods, decoupled modules, and comprehensive evaluation metrics. We provide comprehensive documentation for beginners.




😃What's New

Feb, 2023

  • We have released the paper "NeuralKG-ind: A Python Library for Inductive Knowledge Graph Representation Learning".

Overview

NeuralKG-ind is built on PyTorch Lightning and based on NeuralKG. It provides a general workflow for developing models handling inductive tasks on KGs. It has the following features:

  • Standardized process. According to existing methods, we standardized the overall process of constructing an inductive knowledge graph representation learning model, including data processing, sampler and trainer construction, and evaluation of link prediction and triple classification tasks. We also provide auxiliary functions, including log management and hyper-parameter tuning, for model training and analysis.

  • Rich existing methods. We re-implemented 5 inductive knowledge graph representation learning methods proposed in the last 3 years (GraIL, CoMPILE, SNRI, RMPI, and MorsE), so users can apply these models off the shelf.

  • Decoupled modules. We provide many decoupled modules, such as the subgraph extraction function, the node labeling function, neighbor aggregation functions, compound graph neural network layers, and KGE score functions, so users can construct a new inductive knowledge graph representation learning model faster.

  • Long-term support. We provide long-term support for NeuralKG-ind, including maintaining detailed documentation, providing a straightforward quick-start guide, adding new models, resolving issues, and handling pull requests.
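To make the "subgraph extraction" and "node labeling" modules mentioned above more concrete, here is a minimal plain-Python sketch of GraIL-style enclosing subgraph extraction with double-radius node labels. It is an illustration under simplifying assumptions, not NeuralKG-ind's API: the function names (`build_adjacency`, `bfs_distances`, `enclosing_subgraph`) are hypothetical, the graph is treated as undirected, and the target triple itself is not removed from the subgraph.

```python
from collections import defaultdict, deque


def build_adjacency(triples):
    """Build an undirected adjacency map from (head, relation, tail) triples."""
    adj = defaultdict(set)
    for h, _, t in triples:
        adj[h].add(t)
        adj[t].add(h)
    return adj


def bfs_distances(adj, source, max_hops, blocked=None):
    """Hop counts from `source` up to `max_hops`, never visiting `blocked`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nb in adj[node]:
            if nb != blocked and nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def enclosing_subgraph(triples, head, tail, max_hops=2):
    """Return (edges, labels) for the subgraph enclosing the pair (head, tail).

    Each node is labeled with (distance to head, distance to tail);
    the two target nodes get the fixed labels (0, 1) and (1, 0).
    """
    adj = build_adjacency(triples)
    d_head = bfs_distances(adj, head, max_hops, blocked=tail)
    d_tail = bfs_distances(adj, tail, max_hops, blocked=head)
    # Keep only nodes reachable from both targets within max_hops.
    nodes = (set(d_head) & set(d_tail)) | {head, tail}
    labels = {n: (d_head[n], d_tail[n]) for n in nodes if n not in (head, tail)}
    labels[head] = (0, 1)
    labels[tail] = (1, 0)
    edges = [(h, r, t) for h, r, t in triples if h in nodes and t in nodes]
    return edges, labels


# Toy graph: "e" hangs off "c" only, so it falls outside the enclosing subgraph.
triples = [
    ("a", "r1", "b"), ("b", "r2", "c"),
    ("a", "r3", "d"), ("d", "r1", "c"),
    ("c", "r2", "e"),
]
edges, labels = enclosing_subgraph(triples, "a", "c", max_hops=2)
print(sorted(labels.items()))
print(edges)
```

The labels depend only on graph structure, not on entity identity, which is what allows a model trained on one set of entities to score triples over unseen entities.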


Demo

There is a demonstration of NeuralKG-ind.


Implemented Methods

| Components | Models |
| --- | --- |
| KGEModel | TransE, TransH, TransR, ComplEx, DistMult, RotatE, ConvE, BoxE, CrossE, SimplE, HAKE, PairRE, DualE |
| GNNModel | RGCN, KBAT, CompGCN, XTransE, GraIL, CoMPILE, SNRI, RMPI, MorsE |
| RuleModel | ComplEx-NNE+AER, RUGE, IterE |

Quick Start

Installation

Step 1: Create a virtual environment using Anaconda and activate it

conda create -n neuralkg-ind python=3.8
conda activate neuralkg-ind

Step 2: Install the PyTorch and DGL builds that match your CUDA version

Here is a sample installation for CUDA 11.1:

  • Install PyTorch
pip install torch==1.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
  • Install DGL
pip install dgl-cu111 dglgo -f https://data.dgl.ai/wheels/repo.html

Step 3: Install the package

git clone https://github.com/zjukg/NeuralKG-ind.git
cd NeuralKG-ind
python setup.py install
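
After installation, it can help to confirm that the PyTorch and DGL builds were picked up and that PyTorch sees the GPU. The following is a small sketch for that check; `report_versions` is a hypothetical helper written for this example, while `torch.version.cuda` and `torch.cuda.is_available()` are standard PyTorch attributes.

```python
import importlib


def report_versions(packages=("torch", "dgl")):
    """Return {package: version string, or None if it cannot be imported}."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except (ImportError, OSError):
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
    try:
        import torch
        # CUDA version the PyTorch wheel was built against (None for CPU builds).
        print("CUDA build:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
    except ImportError:
        pass
```

If the reported CUDA build does not match the DGL wheel you installed (for example `dgl-cu111` for CUDA 11.1), reinstalling the matching pair is usually the first thing to try.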

Training

# Use bash script
sh ./scripts/your-sh

# Use config
python main.py --load_config --config_path <your-config>

Evaluation

# Testing AUC and AUC-PR 
python main.py --test_only --checkpoint_dir <your-model-path> --eval_task triple_classification 

# Testing MRR and Hits@1, 5, 10
python main.py --test_only --checkpoint_dir <your-model-path> --eval_task link_prediction --test_db_path <your-db-path> 
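
For readers unfamiliar with the metrics named above, the sketch below shows how MRR, Hits@k, and AUC are commonly defined. It is a plain-Python illustration with made-up inputs, not the evaluation code used by NeuralKG-ind; the function names and the example ranks and scores are assumptions for this example. AUC-PR is omitted to keep the sketch short.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose true entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def auc(pos_scores, neg_scores):
    """Probability that a positive triple outscores a negative one.

    Ties count as half a win. This pairwise form is quadratic in the
    number of triples, which is fine for a small illustration.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Link prediction: rank of the true entity for each of five test triples.
ranks = [1, 3, 2, 10, 50]
print(mean_reciprocal_rank(ranks))
print([hits_at_k(ranks, k) for k in (1, 5, 10)])  # [0.2, 0.6, 0.8]

# Triple classification: model scores for positive and negative triples.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```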

Hyperparameter Tuning

NeuralKG-ind uses Weights & Biases for hyperparameter optimization, which supports grid search, random search, and Bayesian optimization. The search type and search space are specified in a YAML configuration file (*.yaml).

The following config file sets up hyperparameter optimization of GraIL on the FB15K-237 dataset using Bayesian search:

command:
  - ${env}
  - ${interpreter}
  - ${program}
  - ${args}
program: main.py
method: bayes
metric:
  goal: maximize
  name: Eval|auc
parameters:
  dataset_name:
    value: FB15K237
  model_name:
    value: Grail
  loss_name:
    values: [Margin_Loss]
  train_sampler_class:
    values: [SubSampler]
  emb_dim:
    values: [32, 64]
  lr:
    values: [1e-2, 5e-3, 1e-3]
  train_bs:
    values: [64, 128]
  num_neg:
    values: [16, 32]
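
To get a feel for the size of the search space defined above, the following sketch enumerates every combination the `parameters` block allows. It does not call Weights & Biases; the dictionary simply mirrors the YAML, and `candidate_values` and `enumerate_configs` are hypothetical helpers for this example. In a sweep config, `value` fixes a parameter and `values` lists discrete choices.

```python
from itertools import product

# Mirror of the `parameters` section of the YAML config above.
parameters = {
    "dataset_name": {"value": "FB15K237"},
    "model_name": {"value": "Grail"},
    "loss_name": {"values": ["Margin_Loss"]},
    "train_sampler_class": {"values": ["SubSampler"]},
    "emb_dim": {"values": [32, 64]},
    "lr": {"values": [1e-2, 5e-3, 1e-3]},
    "train_bs": {"values": [64, 128]},
    "num_neg": {"values": [16, 32]},
}


def candidate_values(spec):
    """A `value` entry is a constant; a `values` entry is a list of choices."""
    if "values" in spec:
        return list(spec["values"])
    return [spec["value"]]


def enumerate_configs(params):
    """Return every concrete configuration in the discrete search space."""
    names = list(params)
    choices = [candidate_values(params[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*choices)]


configs = enumerate_configs(parameters)
print(len(configs))  # 2 * 3 * 2 * 2 = 24 combinations
print(configs[0])
```

With only 24 combinations, a grid search would cover the whole space; Bayesian search instead chooses which combinations to try based on the metric observed in earlier runs.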

Reproduced Results

Below are reproduced model results on the FB15K-237 dataset and partial results on NELL-995, obtained using NeuralKG. See more results here.

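FB15K-237_v1 and FB15K-237_v2:

| Method | v1 AUC | v1 AUC-PR | v1 MRR | v1 Hits@1 | v1 Hits@5 | v1 Hits@10 | v2 AUC | v2 AUC-PR | v2 MRR | v2 Hits@1 | v2 Hits@5 | v2 Hits@10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GraIL | 0.802 | 0.821 | 0.452 | 0.359 | 0.561 | 0.624 | 0.873 | 0.900 | 0.642 | 0.539 | 0.767 | 0.831 |
| CoMPILE | 0.800 | 0.835 | 0.516 | 0.437 | 0.600 | 0.668 | 0.876 | 0.905 | 0.617 | 0.509 | 0.741 | 0.813 |
| SNRI | 0.792 | 0.883 | 0.495 | 0.390 | 0.600 | 0.720 | 0.884 | 0.906 | 0.646 | 0.536 | 0.781 | 0.857 |
| RMPI | 0.803 | 0.823 | 0.532 | 0.451 | 0.620 | 0.689 | 0.851 | 0.882 | 0.632 | 0.523 | 0.763 | 0.830 |
| MorsE | 0.844 | 0.847 | 0.591 | 0.470 | 0.723 | 0.833 | 0.963 | 0.960 | 0.754 | 0.643 | 0.897 | 0.950 |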

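FB15K-237_v3 and FB15K-237_v4:

| Method | v3 AUC | v3 AUC-PR | v3 MRR | v3 Hits@1 | v3 Hits@5 | v3 Hits@10 | v4 AUC | v4 AUC-PR | v4 MRR | v4 Hits@1 | v4 Hits@5 | v4 Hits@10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GraIL | 0.871 | 0.899 | 0.637 | 0.530 | 0.765 | 0.828 | 0.911 | 0.921 | 0.639 | 0.521 | 0.797 | 0.880 |
| CoMPILE | 0.906 | 0.925 | 0.670 | 0.568 | 0.796 | 0.859 | 0.927 | 0.932 | 0.704 | 0.604 | 0.831 | 0.894 |
| SNRI | 0.870 | 0.884 | 0.642 | 0.525 | 0.775 | 0.871 | 0.899 | 0.916 | 0.681 | 0.573 | 0.821 | 0.894 |
| RMPI | 0.876 | 0.866 | 0.662 | 0.569 | 0.767 | 0.827 | 0.905 | 0.916 | 0.647 | 0.535 | 0.787 | 0.866 |
| MorsE | 0.959 | 0.952 | 0.745 | 0.637 | 0.878 | 0.954 | 0.963 | 0.952 | 0.742 | 0.662 | 0.888 | 0.958 |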

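NELL-995_v1 and NELL-995_v2:

| Method | v1 AUC | v1 AUC-PR | v1 MRR | v1 Hits@1 | v1 Hits@5 | v1 Hits@10 | v2 AUC | v2 AUC-PR | v2 MRR | v2 Hits@1 | v2 Hits@5 | v2 Hits@10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GraIL | 0.814 | 0.750 | 0.467 | 0.395 | 0.515 | 0.575 | 0.929 | 0.947 | 0.735 | 0.624 | 0.884 | 0.933 |
| SNRI | 0.737 | 0.720 | 0.523 | 0.475 | 0.545 | 0.595 | 0.864 | 0.884 | 0.630 | 0.507 | 0.774 | 0.863 |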

Notebook Guide

😃We provide Colab notebooks to help users get started with the library.

Colab Notebook


Detailed Documentation

https://zjukg.github.io/NeuralKG/neuralkg.html


NeuralKG-ind Core Team

Zhejiang University: Wen Zhang, Zhen Yao, Mingyang Chen, Zhiwei Huang, Huajun Chen