Source code for NAACL 2019 paper "Fact Discovery from Knowledge Base via Facet Decomposition".

Fact Discovery from Knowledge Base via Facet Decomposition

This repo contains the source code and dataset for the following paper: Fact Discovery from Knowledge Base via Facet Decomposition. Zihao Fu, Yankai Lin, Zhiyuan Liu and Wai Lam. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019).

How to use our code for FFD


  • g++ 7.3.0
  • Python 2.7.16

All code was tested under Ubuntu 18.04.1 LTS.


We built a new dataset based on FB15k; it is already included in ANALOGY/FB15k. The dataset structure is as follows:

├── p0.5-entities.txt
├── p0.5-relations.txt 
├── p0.5-test.txt
├── p0.5-train.txt
├── p0.5-valid.txt
├── entity2id.txt
└── relation2id.txt

Here, p0.5-train.txt, p0.5-test.txt and p0.5-valid.txt are the train, test and validation sets respectively. Each line contains the head, relation and tail of a fact. p0.5-entities.txt and p0.5-relations.txt contain the names of all entities and relations. entity2id.txt and relation2id.txt contain the ids for entities and relations.
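As a quick way to inspect the split files, the following is a minimal sketch of a loader for the one-fact-per-line format described above. It assumes the three fields on each line are tab-separated, which this README does not state, so adjust the delimiter if your files differ. The function name read_triples is hypothetical and is not part of the repo.

```python
from __future__ import print_function

import io


def read_triples(path):
    """Read one (head, relation, tail) fact per line from a split file."""
    triples = []
    with io.open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, 1):
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            # ASSUMPTION: columns are tab-separated; the README does not say.
            fields = line.split("\t")
            if len(fields) != 3:
                raise ValueError(
                    "%s:%d: expected 3 fields, got %d"
                    % (path, line_no, len(fields))
                )
            triples.append(tuple(fields))
    return triples
```

For example, read_triples("ANALOGY/FB15k/p0.5-train.txt") would return the training facts as a list of 3-tuples, provided the tab-separated assumption holds.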


  1. Clone
git clone https://github.com/fuzihaofzh/FFD.git
  2. Compile & Install
cd ..
pip install -r requirements.txt
  3. Train Entity-relation Facet Component & Tail Inference Facet Component
python facts_discovery.py run --inputTag p0.5 --cudaId 0 --step trainCorNet
ANALOGY/main -algorithm Analogy -model_path output/Analogy_FB15k_p0.5.model -dataset ANALOGY/FB15k/p0.5 -num_thread 8

In facts_discovery.py, inputTag is the dataset prefix, cudaId is the GPU device id, and step is the running mode. In ANALOGY/main, model_path is where the model is saved, dataset is the dataset path, and num_thread is the number of threads.

  4. Train FFD and predict
python facts_discovery.py run --inputTag p0.5 --cudaId 0 --step feedback
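
The training and prediction commands above can also be chained from one small driver script. The following is only a sketch: build_commands and run_pipeline are hypothetical helpers that are not part of the repo. It assumes the script is started from the repository root (the commands use relative paths) and that the ANALOGY model file name follows the input tag, which matches the p0.5 example but is otherwise a guess.

```python
from __future__ import print_function

import subprocess


def build_commands(input_tag="p0.5", cuda_id=0, num_thread=8):
    """Return the three commands from the steps above as argument lists."""
    ffd = ["python", "facts_discovery.py", "run",
           "--inputTag", input_tag, "--cudaId", str(cuda_id)]
    analogy = [
        "ANALOGY/main", "-algorithm", "Analogy",
        # ASSUMPTION: the model file name is derived from the input tag.
        "-model_path", "output/Analogy_FB15k_%s.model" % input_tag,
        "-dataset", "ANALOGY/FB15k/%s" % input_tag,
        "-num_thread", str(num_thread),
    ]
    return [
        ffd + ["--step", "trainCorNet"],  # entity-relation facet component
        analogy,                          # tail inference facet component
        ffd + ["--step", "feedback"],     # train FFD and predict
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Raises CalledProcessError if a step exits with a non-zero status.
            subprocess.check_call(cmd)
```

Calling run_pipeline(build_commands()) only prints the commands; pass dry_run=False to execute them in order and stop at the first failing step.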


  title={Fact Discovery from Knowledge Base via Facet Decomposition},
  author={Fu, Zihao and Lin, Yankai and Liu, Zhiyuan and Lam, Wai},
  booktitle={Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},