Data and code for the IJCAI 2020 paper "Formal Query Building with Query Structure Prediction for Complex Question Answering over Knowledge Base" are available for research purposes.
This project includes only the processing of the LC-QuAD (Trivedi et al., 2017) dataset. We are sorry that the source code for the other two datasets, WebQuestions (Berant et al., 2013) and ComplexQuestions (Bao et al., 2016), cannot be released yet because it has not been organized. We will release it in a unified way in future work. Because the link to the preprocessed data provided by Luo et al. (2018) for WebQ and CompQ is no longer valid, we provide a new link here for subsequent researchers.
- Python 3.6
- PyTorch 1.2.0
- DBpedia version 2016-04 (Note the version: with the latest DBpedia release, the answers to some questions cannot be retrieved. We also preprocessed this dump and retained only the English part related to the LC-QuAD dataset.)
- SPARQL service (constructed by Virtuoso or Apache Jena Fuseki)
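Before running the later steps, it can help to confirm that the SPARQL service is reachable. The following is a minimal sketch, not part of this repository: it sends a trivial ASK query over HTTP GET as described by the SPARQL 1.1 Protocol. The function names and the `localhost` address in the usage comment are placeholders chosen for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_sparql_url(endpoint, query):
    """Encode a SPARQL query as a GET request URL (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def endpoint_is_alive(endpoint, timeout=10):
    """Return True if the endpoint answers a trivial ASK query with true."""
    url = build_sparql_url(endpoint, "ASK { ?s ?p ?o }")
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # ASK results in the SPARQL JSON format carry a "boolean" field.
            return bool(json.load(response).get("boolean", False))
    except (OSError, ValueError):
        # Connection problems, HTTP errors, or a reply that is not JSON.
        return False


# Usage (placeholder address; substitute your own Virtuoso or Fuseki URL):
# endpoint_is_alive("http://localhost:3030/dbpedia/sparql")
```

A result of `False` only says that this particular check failed; it does not distinguish an empty store from an unreachable server.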
We recently updated AQGNet: the AQG is now a directed graph instead of an undirected one, and beam search has been added to structure prediction. With this update, the performance of our approach on LC-QuAD has improved further.
Dataset | AQG prediction | Precision | Recall | F1-score |
---|---|---|---|---|
LC-QuAD | 72.8 | 77.38 | 76.73 | 76.59 |
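For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute precision, recall, and F1 over answer sets and average them across questions. It is an assumption about the metric definitions, not the repository's evaluation code, which may handle edge cases such as empty answer sets differently. The reported F1 (76.59) is slightly below the harmonic mean of the reported precision and recall (about 77.05), which is consistent with averaging per-question F1.

```python
def precision_recall_f1(predicted, gold):
    """Score one question by comparing predicted and gold answer sets."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    if correct == 0:
        # Covers empty predictions, empty gold sets, and no overlap.
        return 0.0, 0.0, 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_average(pairs):
    """Average per-question precision, recall, and F1 over all questions."""
    scores = [precision_recall_f1(p, g) for p, g in pairs]
    if not scores:
        return 0.0, 0.0, 0.0
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


# Two toy questions: one answered exactly, one answered incorrectly.
print(macro_average([({"a"}, {"a"}), ({"b"}, {"c"})]))
```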
We provide the trained AQGNet model that produced the results above. Please download and unzip it, then move it to the ./runs directory.
Here we provide the candidate queries of the training set, the validation set, and the test set, where the candidate queries of the test set are obtained from the prediction results of the above model. Please download and unzip them, then move them to the ./data directory.
We also provide the trained query ranking model. Please download and unzip it, then move it to the ./query_ranking/runs directory.
Download the GloVe embeddings and put glove.42B.300d.txt under the ./data/ directory.
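As a quick sanity check of the downloaded file, the sketch below parses the GloVe text format, in which each line holds a word followed by its vector components separated by spaces. It is an illustration only and not how the repository's preprocessing loads the embeddings; `load_glove` and its `vocab` parameter are hypothetical names.

```python
import io


def load_glove(lines, vocab=None):
    """Parse GloVe text lines into a dict mapping word -> list of floats.

    If vocab is given, only words in it are kept, which saves memory
    on a file as large as glove.42B.300d.txt.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        if vocab is not None and word not in vocab:
            continue
        embeddings[word] = [float(x) for x in parts[1:]]
    return embeddings


# Tiny in-memory sample with 3-dimensional vectors instead of 300.
sample = io.StringIO("the 0.1 0.2 0.3\nquery 0.4 0.5 0.6\n")
vectors = load_glove(sample, vocab={"query"})
print(vectors)

# With the real file (every vector should then have 300 components):
# with open("./data/glove.42B.300d.txt", encoding="utf-8") as f:
#     vectors = load_glove(f, vocab={"query", "question"})
```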
cd ./preprocess
sh run_me.sh
Modify the following content in ./train.sh.
devices=$1
- Replace $1 with the id of the GPU to be used, such as 0.
Then, execute the following command for training.
sh train.sh
The trained model file is saved under the ./runs directory.
The path format of the trained model is ./runs/RUN_ID/checkpoints/best_snapshot_epoch_xx_best_val_acc_xx_model.pt
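Because the file name contains the epoch and validation accuracy, the exact path is only known after training. A minimal sketch for locating it, assuming the path format stated above holds; the function names are hypothetical and `RUN_ID` remains a placeholder for your own run directory.

```python
import glob
import os


def find_best_checkpoints(run_dir):
    """List checkpoint files in run_dir that match the documented name format."""
    pattern = os.path.join(
        run_dir, "checkpoints", "best_snapshot_epoch_*_best_val_acc_*_model.pt"
    )
    # Oldest first, so the most recently written file comes last.
    return sorted(glob.glob(pattern), key=os.path.getmtime)


def latest_best_checkpoint(run_dir):
    """Return the most recently modified matching checkpoint, or None."""
    matches = find_best_checkpoints(run_dir)
    return matches[-1] if matches else None


# Usage (RUN_ID is a placeholder):
# print(latest_best_checkpoint("./runs/RUN_ID"))
```

The returned path is what the evaluation step expects as the path of the trained model.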
Modify the following content in ./eval.sh.
devices=$1
save_name=$2
dbpedia_endpoint=$3
- Replace $1 with the id of the GPU to be used.
- Replace $2 with the path of the trained model.
- Replace $3 with the address of the established DBpedia SPARQL service, such as http://10.201.158.104:3030/dbpedia/sparql

The result of AQGNet structure prediction is saved under the directory of the model used. The path format of the result is ./runs/RUN_ID/results.pkl.
Then, execute the following command for structure prediction.
sh eval.sh
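Once structure prediction has finished, the results file can be inspected before it is passed to query generation. The sketch below only loads the pickle and reports its top-level type and size; it makes no assumption about the internal layout, which is not documented here. If the file stores objects of classes defined in this repository, loading it will likely need to run from the project root so that those classes can be imported. Only unpickle files you trust.

```python
import pickle


def load_results(path):
    """Load a pickled results file such as ./runs/RUN_ID/results.pkl."""
    with open(path, "rb") as f:
        return pickle.load(f)


def describe(obj):
    """Return the top-level type name and length (None if it has no length)."""
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


# Usage (RUN_ID is a placeholder):
# results = load_results("./runs/RUN_ID/results.pkl")
# print(describe(results))
```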
Modify the following content in ./generate_queries.sh.
test_data=$1 # structure prediction results path
dbpedia_endpoint=$2 # http://10.201.158.104:3030/dbpedia/sparql
The candidate queries for the training set, validation set, and test set are saved under the ./data directory.
cd ./query_ranking
sh run_me.sh
Modify the following content in ./query_ranking/train.sh.
devices=$1
- Replace $1 with the id of the GPU to be used.

Then, execute the following command to train the query ranking model.
cd ./query_ranking
sh train.sh
The trained query ranking model file is saved under the ./query_ranking/runs directory.
Modify the following content in ./query_ranking/eval.sh.
devices=$1
save_name=$2
dbpedia_endpoint=$3
- Replace $1 with the id of the GPU to be used.
- Replace $2 with the path of the trained model.
- Replace $3 with the address of the established DBpedia SPARQL service, such as http://10.201.158.104:3030/dbpedia/sparql
Then, execute the following command to obtain the final question answering results.
cd ./query_ranking
sh eval.sh
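In a shell script, `$1`, `$2`, and `$3` are positional parameters, so an alternative to editing the script is to leave those lines unchanged and pass the values on the command line. The sketch below builds such an invocation from Python. It assumes the script reads only the three parameters listed above, which has not been verified against the script itself; the function names and the argument values in the usage comment are placeholders.

```python
import subprocess


def build_eval_command(devices, save_name, dbpedia_endpoint, script="eval.sh"):
    """Build the argument list: sh eval.sh <devices> <save_name> <endpoint>."""
    return ["sh", script, str(devices), save_name, dbpedia_endpoint]


def run_eval(devices, save_name, dbpedia_endpoint, cwd="./query_ranking"):
    """Run the evaluation script from the given working directory."""
    command = build_eval_command(devices, save_name, dbpedia_endpoint)
    # check=True raises CalledProcessError if the script exits non-zero.
    return subprocess.run(command, cwd=cwd, check=True)


# Usage (all three values are placeholders):
# run_eval(0, "PATH_TO_MODEL", "http://localhost:3030/dbpedia/sparql")
```

Passing an argument list rather than a single command string avoids shell quoting problems when the model path contains spaces.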
If you use AQGNet, please cite the following work.
@inproceedings{DBLP:conf/ijcai/ChenLHQ20,
author = {Yongrui Chen and
Huiying Li and
Yuncheng Hua and
Guilin Qi},
editor = {Christian Bessiere},
title = {Formal Query Building with Query Structure Prediction for Complex
Question Answering over Knowledge Base},
booktitle = {Proceedings of the Twenty-Ninth International Joint Conference on
Artificial Intelligence, {IJCAI} 2020 [scheduled for July 2020, Yokohama,
Japan, postponed due to the Corona pandemic]},
pages = {3751--3758},
publisher = {ijcai.org},
year = {2020},
url = {https://doi.org/10.24963/ijcai.2020/519},
doi = {10.24963/ijcai.2020/519},
timestamp = {Mon, 13 Jul 2020 18:09:15 +0200},
biburl = {https://dblp.org/rec/conf/ijcai/ChenLHQ20.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}