# Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering

Dizhan Xue, Shengsheng Qian, and Changsheng Xu

MAIS, Institute of Automation, Chinese Academy of Sciences
- Download the GQA Dataset.
- Download the GQA-OOD Dataset.
- Download the bottom-up features and unzip them.
- Extract features from the raw TSV files (important: you need to run this code on Linux):

```
python ./preprocessing/extract_tsv.py --input $TSV_FILE --output $FEATURE_DIR
```
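For reference, bottom-up-feature TSV rows typically store the per-region arrays base64-encoded. The sketch below decodes a synthetic row; the field layout is an assumption based on the common bottom-up-attention format, so check `extract_tsv.py` for the exact fields used here.

```python
# Hedged sketch: decode one (synthetic) row in a bottom-up-attention-style
# TSV, where array fields are base64-encoded float32 buffers. The field
# names below are assumptions, not taken from this repo.
import base64, csv, io, struct

FIELDNAMES = ["img_id", "img_h", "img_w", "num_boxes", "boxes", "features"]

def decode_floats(b64, count):
    """Decode a base64 string into a list of `count` little-endian float32 values."""
    raw = base64.b64decode(b64)
    return list(struct.unpack("<%df" % count, raw[: 4 * count]))

# Build a synthetic one-row TSV, mimicking how the real files encode arrays.
feats = [0.5] * 8  # pretend: 2 boxes x 4-dim features
row_feats = base64.b64encode(struct.pack("<8f", *feats)).decode()
tsv = io.StringIO("img1\t480\t640\t2\t...\t%s\n" % row_feats)

reader = csv.DictReader(tsv, delimiter="\t", fieldnames=FIELDNAMES)
row = next(reader)
decoded = decode_floats(row["features"], 8)
```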
- We provide the annotations of the GQA-REX Dataset in `model/processed_data/converted_explanation_train_balanced.json` and `model/processed_data/converted_explanation_val_balanced.json`.
- (Optional) You can construct the GQA-REX Dataset by yourself following the instructions of its authors.
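To sanity-check the downloaded annotation files, a small helper like the one below can load a JSON file and return a sample of its entries. This is only an inspection sketch; the entry schema of the GQA-REX files is not documented here, so no field names are assumed.

```python
# Minimal helper to peek at an annotation JSON's structure. It returns a
# small sample for manual inspection rather than assuming any schema.
import json

def peek(path, n=2):
    """Load a JSON annotation file and return its first n entries."""
    with open(path) as f:
        data = json.load(f)
    if isinstance(data, dict):
        return dict(list(data.items())[:n])
    return data[:n]

# Usage with the files shipped in this repo, e.g.:
# peek("model/processed_data/converted_explanation_val_balanced.json")
```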
- Download our generated programs of the GQA dataset from Google Drive.
- (Optional) You can generate the programs by yourself following this project.
We provide four models in `model/model/model.py`:
- REX-VisualBert is from this project.
- REX-LXMERT replaces the VisualBert backbone of REX-VisualBert with LXMERT.
- VCIN is proposed in our ICCV 2023 paper "Variational Causal Inference Network for Explanatory Visual Question Answering".
- Pro-VCIN is proposed in TPAMI 2024 paper "Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering".
Before training, you first need to generate the dictionary for questions, answers, explanations, and program modules:

```
cd ./model
python generate_dictionary.py --question $GQA_ROOT/question --exp $EXP_DIR --pro $PRO_DIR --save ./processed_data
```
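As a rough illustration of what a dictionary-generation step produces (a sketch under assumptions, not this repo's actual implementation), each token seen in the data is assigned a unique integer id, with special tokens reserved first:

```python
# Hedged sketch of dictionary generation: map every token in the input
# texts to an integer id, reserving special tokens at the start.
# The real generate_dictionary script in ./model may differ in details.
def build_dictionary(texts, specials=("<pad>", "<unk>")):
    word2idx = {tok: i for i, tok in enumerate(specials)}
    for text in texts:
        for tok in text.lower().split():
            if tok not in word2idx:
                word2idx[tok] = len(word2idx)
    return word2idx

vocab = build_dictionary(["Is the cat black?", "What color is the cat?"])
```

At inference time, unseen tokens would fall back to the `<unk>` id, which is why it is reserved up front.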
The training process can be started as:

```
python main.py --mode train --anno_dir $GQA_ROOT/question --ood_dir $OOD_ROOT/data --sg_dir $GQA_ROOT/scene_graph --lang_dir ./processed_data --img_dir $FEATURE_DIR/features --bbox_dir $FEATURE_DIR/box --checkpoint_dir $CHECKPOINT --explainable True
```
To evaluate on the GQA test-dev set, or to generate a submission file for online evaluation on the test-standard set, call:

```
python main.py --mode $MODE --anno_dir $GQA_ROOT/question --ood_dir $OOD_ROOT/data --lang_dir ./processed_data --img_dir $FEATURE_DIR/features --weights $CHECKPOINT/model_best.pth --explainable True
```

and set `$MODE` to `eval` or `submission` accordingly.
If you find our papers or code helpful, please cite them as below. Thanks!
```
@inproceedings{xue2023variational,
  title={Variational Causal Inference Network for Explanatory Visual Question Answering},
  author={Xue, Dizhan and Qian, Shengsheng and Xu, Changsheng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={2515--2525},
  year={2023}
}

@article{xue2024integrating,
  title={Integrating Neural-Symbolic Reasoning With Variational Causal Inference Network for Explanatory Visual Question Answering},
  author={Xue, Dizhan and Qian, Shengsheng and Xu, Changsheng},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2024},
  publisher={IEEE}
}
```