Active Ensemble Learning for Knowledge Graph Error Detection, WSDM2023, Singapore
We propose a new framework to effectively combine a set of off-the-shelf KG error detection algorithms with minimum human annotations.
It adaptively updates the ensemble learning policy in each iteration based on active queries, as follows:
- A Three-Stage Scheme based on the tailored MAB
  - initialization: Initializes the parameters by prioritizing the overlaps from all base detectors.
  - train: Trains the tailored MAB within the remaining opportunities.
  - application: Applies the trained model and parameters to detect errors within the remaining iterations.
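The three stages can be illustrated with a minimal sketch. It is not the tailored MAB from the paper: plain UCB1 is swapped in, each base detector is treated as one arm, and the reward is assumed to be 1 when a queried triple turns out to be an error. The function name `three_stage_ensemble`, its arguments, and the weighted reciprocal-rank combination in the last stage are all hypothetical choices made for illustration.

```python
import math


def three_stage_ensemble(rankings, oracle, budget, top_k=3):
    """rankings: {detector: [triple, ...]} ordered most suspicious first.
    oracle: callable(triple) -> 1 if the triple is an error, else 0.
    budget: number of human queries ("opportunities") available.
    All names here are illustrative assumptions, not the repository's API."""
    names = list(rankings)
    labels = {}                    # triple -> annotation collected so far
    pulls = {n: 0 for n in names}  # queries credited to each detector
    hits = {n: 0 for n in names}   # how many of those were real errors

    # Stage 1 (initialization): query triples that every detector puts in its top-k.
    overlap = set.intersection(*(set(r[:top_k]) for r in rankings.values()))
    for triple in sorted(overlap, key=rankings[names[0]].index):
        if len(labels) >= budget:
            break
        labels[triple] = oracle(triple)
        for n in names:            # every detector flagged it, so credit all of them
            pulls[n] += 1
            hits[n] += labels[triple]

    def ucb(n, total):
        if pulls[n] == 0:
            return float("inf")    # force at least one pull per arm
        return hits[n] / pulls[n] + math.sqrt(2 * math.log(total) / pulls[n])

    # Stage 2 (train): spend the remaining budget, choosing a detector by UCB1.
    while len(labels) < budget:
        candidates = {}
        for n in names:
            nxt = next((t for t in rankings[n] if t not in labels), None)
            if nxt is not None:
                candidates[n] = nxt
        if not candidates:
            break                  # everything has already been annotated
        total = sum(pulls.values()) + 1
        arm = max(candidates, key=lambda n: ucb(n, total))
        triple = candidates[arm]
        labels[triple] = oracle(triple)
        pulls[arm] += 1
        hits[arm] += labels[triple]

    # Stage 3 (application): weight each detector by its observed precision
    # and merge the rankings with a weighted reciprocal-rank score.
    weights = {n: (hits[n] / pulls[n] if pulls[n] else 0.0) for n in names}
    scores = {}
    for n in names:
        for rank, triple in enumerate(rankings[n]):
            scores[triple] = scores.get(triple, 0.0) + weights[n] / (rank + 1)
    final = sorted(scores, key=lambda t: -scores[t])
    return weights, labels, final
```

In this sketch the number of oracle calls never exceeds `budget`, and a detector whose top-ranked triples are repeatedly confirmed as errors ends up with a larger weight in the merged ranking.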
- Ranking Files
  Run your chosen base error detectors on raw datasets that contain errors, i.e., triples labeled as 1.
  Generate ranking files according to the different scoring functions of the base detectors.
- Apply TransE to the raw dataset and maintain a unified embedding table;
- Prepare an embedding file for each ranking file, making sure the same entity/triple shares the same embedding vector;
- Check directories;
- Instantiate the model and define parameters, e.g., limited opportunities in `param.py`;
- Run `main.py`.
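As a rough illustration of the ranking and embedding steps above, the sketch below ranks triples with the standard TransE distance and builds the per-ranking embeddings from one shared table. The toy vectors stand in for trained TransE embeddings, and the helper names, the second "detector", and the choice to represent a triple by concatenating its head, relation, and tail vectors are assumptions; the actual file formats expected by `main.py` are not shown here.

```python
import math


def transe_score(triple, ent_emb, rel_emb):
    """TransE distance ||h + r - t||_2; a larger value means a less plausible triple."""
    h, r, t = triple
    return math.sqrt(sum((a + b - c) ** 2
                         for a, b, c in zip(ent_emb[h], rel_emb[r], ent_emb[t])))


def rank_triples(triples, score_fn):
    """Order triples from most to least suspicious under one detector's score."""
    return sorted(triples, key=score_fn, reverse=True)


def embeddings_for(ranking, ent_emb, rel_emb):
    """One vector per ranked triple, always read from the same shared table.
    Concatenating head, relation, and tail is an assumption for illustration."""
    return [ent_emb[h] + rel_emb[r] + ent_emb[t] for h, r, t in ranking]


# Toy unified embedding table (a stand-in for trained TransE vectors).
ent_emb = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 3.0]}
rel_emb = {"r": [1.0, 0.0]}
triples = [("a", "r", "b"), ("a", "r", "c"), ("b", "r", "c")]

# Two rankings from two scoring functions; the second is a hypothetical detector.
ranking_1 = rank_triples(triples, lambda t: transe_score(t, ent_emb, rel_emb))
ranking_2 = rank_triples(triples, lambda t: -transe_score(t, ent_emb, rel_emb))

emb_1 = dict(zip(ranking_1, embeddings_for(ranking_1, ent_emb, rel_emb)))
emb_2 = dict(zip(ranking_2, embeddings_for(ranking_2, ent_emb, rel_emb)))
```

Because both embedding lists are looked up from the same `ent_emb` and `rel_emb` tables, a triple that appears in several ranking files carries an identical vector in each of them.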
@inproceedings{dong2023active,
Title={Active ensemble learning for knowledge graph error detection},
Author={Dong, Junnan and Zhang, Qinggang and Huang, Xiao and Tan, Qiaoyu and Zha, Daochen and Zihao, Zhao},
Booktitle={Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
Year={2023}}