Active Ensemble Learning for Knowledge Graph Error Detection

KAEL_WSDM23

Active Ensemble Learning for Knowledge Graph Error Detection, WSDM2023, Singapore

Framework

We propose a new framework that effectively combines a set of off-the-shelf KG error detection algorithms with minimal human annotation.
It adaptively updates the ensemble learning policy in each iteration based on active queries, as illustrated in the figures:
[Figure: KAEL_running]
[Figure: KAEL]

Model

  • A three-stage scheme based on a tailored multi-armed bandit (MAB)
  1. initialization: Initializes the parameters by prioritizing the triples on which all base detectors overlap.
  2. train: Trains the tailored MAB with the remaining query opportunities.
  3. application: Applies the trained model and parameters to detect errors in the remaining iterations.
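
The three stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in this repository: it swaps in a plain UCB1 bandit for the tailored MAB, treats each base detector as one arm, and combines detectors in the application stage with a reciprocal-rank weighted vote. The names `three_stage_sketch`, `apply_weights`, `oracle`, `budget`, and `top_k` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]


def three_stage_sketch(
    rankings: Dict[str, List[Triple]],  # detector name -> triples, most suspicious first
    oracle: Callable[[Triple], int],    # human annotator: 1 = error, 0 = correct
    budget: int,                        # total number of queries allowed (assumed meaning of "opportunities")
    top_k: int = 3,                     # size of the overlap window (assumption)
) -> Tuple[Dict[str, float], Dict[Triple, int]]:
    labels: Dict[Triple, int] = {}
    pulls = {d: 0 for d in rankings}
    reward = {d: 0.0 for d in rankings}

    def query(t: Triple) -> int:
        # Each triple is annotated at most once.
        if t not in labels:
            labels[t] = oracle(t)
        return labels[t]

    # Stage 1 (initialization): query triples that every detector puts in its top-k.
    overlap = set.intersection(*(set(r[:top_k]) for r in rankings.values()))
    for t in sorted(overlap):
        if len(labels) >= budget:
            break
        y = query(t)
        for d in rankings:  # every detector proposed this triple
            pulls[d] += 1
            reward[d] += y

    # Stage 2 (train): spend the remaining budget with UCB1 (stand-in for the tailored MAB).
    while len(labels) < budget:
        candidates = {}
        for d, r in rankings.items():
            nxt = next((t for t in r if t not in labels), None)
            if nxt is not None:
                candidates[d] = nxt
        if not candidates:
            break
        total = sum(pulls.values()) + 1

        def ucb(d: str) -> float:
            if pulls[d] == 0:
                return float("inf")
            return reward[d] / pulls[d] + math.sqrt(2 * math.log(total) / pulls[d])

        chosen = max(candidates, key=ucb)
        y = query(candidates[chosen])
        pulls[chosen] += 1
        reward[chosen] += y

    weights = {d: (reward[d] / pulls[d] if pulls[d] else 0.0) for d in rankings}
    return weights, labels


def apply_weights(
    rankings: Dict[str, List[Triple]],
    weights: Dict[str, float],
    labels: Dict[Triple, int],
) -> List[Triple]:
    # Stage 3 (application): rank unlabeled triples by a weighted reciprocal-rank vote.
    score: Dict[Triple, float] = {}
    for d, r in rankings.items():
        for pos, t in enumerate(r):
            if t not in labels:
                score[t] = score.get(t, 0.0) + weights[d] / (pos + 1)
    return sorted(score, key=lambda t: (-score[t], t))
```

In this sketch a detector whose top-ranked triples are confirmed as errors by the annotator ends up with a larger weight, so its ranking dominates the final ordering of the triples that were never queried.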

Usage

  • Ranking Files
    Run your chosen base error detectors on raw datasets that contain errors, i.e., triples labeled as 1.
    Generate ranking files according to the scoring function of each base detector.
  • Apply TransE to the raw dataset and maintain a unified embedding table;
  • Prepare an embedding file for each ranking file, making sure the same entity/triple shares the same embedding vector across files;
  • Check directories;
  • Instantiate the model and define parameters, e.g., the limited opportunities in param.py;
  • Run main.py.
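
The idea behind the unified embedding table in the steps above can be illustrated as follows. This is a hedged sketch only: the file formats expected by main.py are not described here, so everything stays in memory. Random vectors stand in for trained TransE embeddings, the TransE L1 distance is used as one example of a base scoring function, and a triple vector is assumed to be the concatenation of its head, relation, and tail vectors. All function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Vector = List[float]


def build_embedding_table(names: List[str], dim: int, seed: int = 0) -> Dict[str, Vector]:
    """Placeholder for trained TransE embeddings: one fixed vector per name."""
    rng = random.Random(seed)
    return {n: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for n in sorted(set(names))}


def transe_distance(h: Vector, r: Vector, t: Vector) -> float:
    """TransE L1 distance ||h + r - t||; a larger value means a more suspicious triple."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def rank_by_transe(triples: List[Triple], ent: Dict[str, Vector], rel: Dict[str, Vector]) -> List[Triple]:
    """Example ranking: most suspicious triple first."""
    return sorted(triples, key=lambda x: -transe_distance(ent[x[0]], rel[x[1]], ent[x[2]]))


def triple_embedding(triple: Triple, ent: Dict[str, Vector], rel: Dict[str, Vector]) -> Vector:
    """Concatenate head, relation, and tail vectors (assumed layout)."""
    h, r, t = triple
    return ent[h] + rel[r] + ent[t]


def embeddings_for_ranking(ranking: List[Triple], ent: Dict[str, Vector], rel: Dict[str, Vector]) -> List[Vector]:
    """One embedding row per ranked triple, always read from the shared tables."""
    return [triple_embedding(t, ent, rel) for t in ranking]
```

Because every ranking looks up the same `ent` and `rel` tables, a triple receives an identical vector no matter which detector's ranking it appears in or at which position.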

Reference in BibTeX:

@inproceedings{dong2023active,
Title={Active ensemble learning for knowledge graph error detection},
Author={Dong, Junnan and Zhang, Qinggang and Huang, Xiao and Tan, Qiaoyu and Zha, Daochen and Zihao, Zhao},
Booktitle={Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
Year={2023}}