This is the PyTorch implementation of the ICCV 2021 paper: Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
The codebase is tested with Python 3.8 and PyTorch 1.7.
Part of this codebase is built on NSCL. Many thanks to the authors! Please refer to the prerequisites of the NSCL codebase. In particular, install Jacinle:
git clone https://github.com/vacancy/Jacinle --recursive
export PATH=<path_to_jacinle>/bin:$PATH
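After the install step above, a quick sanity check of the environment can save a failed run later. The following is a minimal sketch, not part of this repository: the function name `check_environment` is hypothetical, and it assumes that Jacinle's `bin` directory provides a launcher called `jac-run` (pass a different name if your Jacinle checkout differs).

```python
import importlib.util
import shutil
import sys


def check_environment(launcher="jac-run"):
    """Return a list of human-readable warnings about the local setup.

    `launcher` is an assumption: the name of an executable expected to be
    found on PATH once <path_to_jacinle>/bin has been exported.
    """
    warnings = []

    # The README states the code was tested with Python 3.8.
    major, minor = sys.version_info[:2]
    if (major, minor) != (3, 8):
        warnings.append(
            "Python %d.%d found; the codebase was tested with 3.8" % (major, minor)
        )

    # Only checks that the package can be located; it does not import torch
    # or verify the installed version.
    if importlib.util.find_spec("torch") is None:
        warnings.append("PyTorch (package 'torch') is not installed")

    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(launcher) is None:
        warnings.append(
            "'%s' not found on PATH; was <path_to_jacinle>/bin exported?" % launcher
        )

    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("warning: " + message)
```

An empty list means none of the three checks raised a concern; it does not guarantee that the remaining NSCL prerequisites are satisfied.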
Download the GQA dataset (ver1.2) into data/orig_data.
Download the extracted image features from this link into data/features.
The data directory should look like:
data/orig_data/questions1.2
- train_balanced_questions.json
- val_balanced_questions.json
- testdev_balanced_questions.json
data/orig_data/sceneGraphs
- train_sceneGraphs.json
- val_sceneGraphs.json
data/features
- sgg_features.h5
- sgg_info.json
...
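The layout above can be verified before training or testing. The sketch below is a hypothetical helper, not a script shipped with this codebase: it checks only the files named explicitly in the listing and ignores whatever the trailing "..." stands for. The names `EXPECTED_FILES` and `find_missing_files` are introduced here for illustration.

```python
from pathlib import Path

# Paths copied from the directory listing above, relative to the data/ root.
# Files covered by the trailing "..." in that listing are not checked.
EXPECTED_FILES = [
    "orig_data/questions1.2/train_balanced_questions.json",
    "orig_data/questions1.2/val_balanced_questions.json",
    "orig_data/questions1.2/testdev_balanced_questions.json",
    "orig_data/sceneGraphs/train_sceneGraphs.json",
    "orig_data/sceneGraphs/val_sceneGraphs.json",
    "features/sgg_features.h5",
    "features/sgg_info.json",
]


def find_missing_files(data_root="data"):
    """Return the expected relative paths that are not regular files."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_files("data")
    if missing:
        print("Missing files under data/:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected data files are present.")
```

Run it from the repository root so that the relative path `data` resolves correctly, or pass an absolute path to `find_missing_files`.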
Download the trained model from this link and put it under ckpt/cco_trained. Then run the following command, which should reach an accuracy of 55.81 on the testdev split.
sh scripts/cco_test.sh
Run the following command to train:
sh scripts/cco_train.sh
The dataset splits (filtered by operation weights, easy/hard) described in Table 4 and Fig. 5 of the paper can be downloaded here.
@InProceedings{li2021calibrating,
author = {Li, Zhuowan and Stengel-Eskin, Elias and Zhang, Yixiao and Xie, Cihang and Tran, Quan and Van Durme, Benjamin and Yuille, Alan},
title = {Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
year = {2021}
}