Resources for our AAAI 2022 paper: "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification".
Check out our demo system! Note that the results will be slightly different from the paper, since we use an up-to-date Wikipedia as the evidence source whereas FEVER uses Wikipedia dated 2017.
- CUDA > 11
- Prepare requirements: `pip3 install -r requirements.txt`.
  - Also works for `allennlp==2.3.0`, `transformers==4.5.1`, `torch==1.8.1`.
- Set environment variable `$PJ_HOME`: `export PJ_HOME=/YOUR_PATH/LOREN/`.
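The scripts below are assumed to resolve data and model paths through `$PJ_HOME`; a minimal sanity check from Python:

```python
import os
from pathlib import Path

# Assumption: LOREN's scripts read $PJ_HOME from the environment to locate data/ and models/.
# This only verifies the variable is set and points at an existing directory.
pj_home = Path(os.environ["PJ_HOME"])  # KeyError here means the export above was skipped
print("PJ_HOME =", pj_home, "| exists:", pj_home.is_dir())
```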
- Pre-processed data at Google Drive. Unzip it and put them under `LOREN/data/`.
  - Data for training a Seq2Seq MRC is at `data/mrc_seq2seq_v5/`.
  - Data for training veracity prediction is at `data/fact_checking/v5/*.json`.
    - Note: `dev.json` uses ground truth evidence for validation, whereas `eval.json` uses predicted evidence for validation. This is consistent with the settings in KGAT.
  - Evidence retrieval models are not required for training LOREN, since we directly adopt the retrieved evidence from KGAT, which is at `data/fever/baked_data/` (used only during pre-processing).
  - Original data is at `data/fever/` (used only during pre-processing).
- Pre-trained checkpoints at HuggingFace Models. Unzip it and put them under `LOREN/models/`.
  - Checkpoints for veracity prediction are at `models/fact_checking/`.
  - Checkpoint for generative MRC is at `models/mrc_seq2seq/`.
  - Checkpoints for KGAT evidence retrieval models are at `models/evidence_retrieval/` (not used in training, displayed only for the sake of completeness).
For quick training and inference with pre-processed data & pre-trained models, please go to Veracity Prediction.
First, go to `LOREN/src/`.

You'll need to download three external models in this step, i.e., two models from AllenNLP in `parsing_client/sentence_parser.py` and a T5-based question generation model in `qg_client/question_generator.py`. Don't worry, they'll be automatically downloaded.
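As an illustration of the auto-download behavior on the HuggingFace side, weights are fetched and cached on first use (the checkpoint name below is an arbitrary public T5 question-generation model, not necessarily the one `question_generator.py` is configured with):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# First call downloads and caches the weights (by default under ~/.cache/huggingface).
# Hypothetical checkpoint; the real one is configured in qg_client/question_generator.py.
qg_name = "valhalla/t5-base-qg-hl"
qg_tokenizer = AutoTokenizer.from_pretrained(qg_name)
qg_model = AutoModelForSeq2SeqLM.from_pretrained(qg_name)
print(qg_model.config.model_type)  # "t5"
```

The AllenNLP parsers in `parsing_client/sentence_parser.py` are cached in the same on-first-use fashion.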
- Run `python3 pproc_client/pproc_questions.py --roles eval train val test`
- This generates cached json files:
  - `AG_PREFIX/answer.{role}.cache`: extracted phrases are stored in the field `answers`.
  - `QG_PREFIX/question.{role}.cache`: generated questions are stored in the fields `cloze_qs`, `generate_qs` and `questions` (two types of questions concatenated).
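To peek at what was produced, here is a small loader sketch; it assumes the caches are plain JSON or JSON Lines and relies on the field names listed above, so adjust it to the actual on-disk format:

```python
import json
from pathlib import Path

def load_cache(path):
    """Load a cache that is either a single JSON document or JSON Lines (an assumption)."""
    text = Path(path).read_text(encoding="utf-8").strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical paths: substitute your actual AG_PREFIX / QG_PREFIX and role.
answers = load_cache("answer.val.cache")
questions = load_cache("question.val.cache")

first_a = answers[0] if isinstance(answers, list) else next(iter(answers.values()))
first_q = questions[0] if isinstance(questions, list) else next(iter(questions.values()))
print(first_a.get("answers"))      # extracted claim phrases
print(first_q.get("cloze_qs"))     # cloze-style questions
print(first_q.get("generate_qs"))  # generated questions
print(first_q.get("questions"))    # the two types concatenated
```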
- Run `python3 pproc_client/pproc_mrc.py -o LOREN/data/mrc_seq2seq_v5`.
- This generates files for Seq2Seq training in a HuggingFace style:
  - `data/mrc_seq2seq_v5/{role}.source`: concatenated question and evidence text.
  - `data/mrc_seq2seq_v5/{role}.target`: answer (claim phrase).
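In this layout, line *i* of `{role}.source` is the model input paired with line *i* of `{role}.target` as the expected output; a quick way to eyeball a few pairs (run from inside `data/mrc_seq2seq_v5/`):

```python
from itertools import islice

# One training example per line: line i of *.source pairs with line i of *.target.
with open("train.source", encoding="utf-8") as src, open("train.target", encoding="utf-8") as tgt:
    for question_and_evidence, answer in islice(zip(src, tgt), 3):
        print("INPUT :", question_and_evidence.strip())
        print("OUTPUT:", answer.strip())
        print("-" * 40)
```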
- Go to `mrc_client/seq2seq/`, which is adapted from HuggingFace's examples.
- Follow `script/train.sh`.
- The best checkpoint will be saved in `$output_dir` (e.g., `models/mrc_seq2seq/`).
  - Best checkpoints are selected by ROUGE score on the dev set.
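Once trained (or using the downloaded checkpoint under `models/mrc_seq2seq/`), the model loads with plain `transformers`; a sketch of generating an answer, where the question/evidence concatenation below is an assumed format and should be made to mirror the `{role}.source` files above:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

ckpt = "models/mrc_seq2seq/"  # finetuned MRC checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSeq2SeqLM.from_pretrained(ckpt)

# Assumed input format: question concatenated with evidence, as in {role}.source.
source = "Where was Taylor Swift born? </s> Taylor Swift was born in West Reading, Pennsylvania."
inputs = tokenizer(source, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, num_beams=4, max_length=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))  # expected: a claim phrase
```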
- Run `python3 pproc_client/pproc_evidential.py --roles val train eval test -m PATH_TO_MRC_MODEL/`.
- This generates files:
  - `{role}.json`: files for veracity prediction. Assembled local premises are stored in the field `evidential_assembled`.
Before training veracity prediction, we'll need an NLI prior from pre-trained NLI models, such as DeBERTa.
- Run `python3 pproc_client/pproc_nli_labels.py -i PATH_TO/{role}.json -m microsoft/deberta-large-mnli`.
- Mind the order! The predicted classes [Contradict, Neutral, Entailment] correspond to [REF, NEI, SUP], respectively.
- This adds a new field `nli_labels` to `{role}.json`.
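Concretely, `microsoft/deberta-large-mnli` emits logits in that contradiction/neutral/entailment order, which is then read as REF/NEI/SUP. A minimal sketch of obtaining such a prior for one (evidence, claim) pair; the example strings are illustrative, and `pproc_nli_labels.py` is what actually writes the `nli_labels` field:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "microsoft/deberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "Taylor Swift was born in West Reading, Pennsylvania."  # evidence
hypothesis = "Taylor Swift was born in New York."                 # claim (phrase-level in LOREN)

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1).squeeze(0)

# Label order per the note above: [Contradict, Neutral, Entailment] -> [REF, NEI, SUP].
for fever_label, p in zip(["REF", "NEI", "SUP"], probs.tolist()):
    print(f"{fever_label}: {p:.3f}")
```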
This part is rather easy (less pipelined :P). A good place to start if you want to skip the above pre-processing.
- Go to folder `check_client/`.
- See what `scripts/train_*.sh` does.
- Stay in folder `check_client/`.
- Run `python3 fact_checker.py --params PARAMS_IN_THE_CODE`.
- This generates `results/*.predictions.jsonl`.
- Go to folder `eval_client/`.
- For Label Accuracy and FEVER score: `fever_scorer.py`.
- For CulpA (turn on `--verbose` in testing): `culpa.py`.
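For reference, the FEVER score credits a claim only when the predicted label is correct and, for SUPPORTS/REFUTES claims, at least one complete gold evidence set has been retrieved (the official scorer additionally caps the number of predicted evidence sentences). `fever_scorer.py` is the implementation to use; the sketch below only illustrates the metric, with hypothetical field names:

```python
def simple_fever_score(predictions):
    """Simplified sketch of Label Accuracy and FEVER score (field names are hypothetical)."""
    label_hits, strict_hits = 0, 0
    for pred in predictions:
        label_correct = pred["predicted_label"] == pred["label"]
        label_hits += label_correct

        if pred["label"] == "NOT ENOUGH INFO":
            evidence_ok = True  # NEI claims need no evidence
        else:
            retrieved = set(map(tuple, pred["predicted_evidence"]))
            # At least one complete gold evidence set must be fully retrieved.
            evidence_ok = any(set(map(tuple, gold)) <= retrieved for gold in pred["evidence_sets"])
        strict_hits += label_correct and evidence_ok

    n = len(predictions)
    return label_hits / n, strict_hits / n  # (Label Accuracy, FEVER score)

# Hypothetical usage on the predictions produced above:
# import json
# predictions = [json.loads(line) for line in open("results/val.predictions.jsonl")]
# print(simple_fever_score(predictions))
```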
If you find our paper or resources useful to your research, please kindly cite our paper.
@article{Chen_Bao_Sun_Zhang_Chen_Zhou_Xiao_Li_2022,
  title={LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification},
  volume={36},
  url={https://ojs.aaai.org/index.php/AAAI/article/view/21291},
  DOI={10.1609/aaai.v36i10.21291},
  abstractNote={Given a natural language statement, how to verify its veracity against a large-scale textual knowledge source like Wikipedia? Most existing neural models make predictions without giving clues about which part of a false claim goes wrong. In this paper, we propose LOREN, an approach for interpretable fact verification. We decompose the verification of the whole claim at phrase-level, where the veracity of the phrases serves as explanations and can be aggregated into the final verdict according to logical rules. The key insight of LOREN is to represent claim phrase veracity as three-valued latent variables, which are regularized by aggregation logical rules. The final claim verification is based on all latent variables. Thus, LOREN enjoys the additional benefit of interpretability --- it is easy to explain how it reaches certain results with claim phrase veracity. Experiments on a public fact verification benchmark show that LOREN is competitive against previous approaches while enjoying the merit of faithful and accurate interpretability. The resources of LOREN are available at: https://github.com/jiangjiechen/LOREN.},
  number={10},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  author={Chen, Jiangjie and Bao, Qiaoben and Sun, Changzhi and Zhang, Xinbo and Chen, Jiaze and Zhou, Hao and Xiao, Yanghua and Li, Lei},
  year={2022},
  month={Jun.},
  pages={10482-10491}
}