## Introduction
This repository contains the source code and additional visualization examples for "Radial Graph Convolutional Network for Visual Question Generation", implemented by Tan Wang.
> Xing Xu, Tan Wang, Yang Yang, Alan Hanjalic and Heng Tao Shen. "Radial Graph Convolutional Network for Visual Question Generation". IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2020.
The major contributions of this work are:
- Different from the existing approaches that typically treat the VQG task as a reversed VQA task, we propose a novel answer-centric approach for the VQG task, which effectively models the associations between the answer and its relevant image regions.
- To the best of our knowledge, we are the first to apply a GCN model to the VQG task, and we devise a new radial graph structure with graph attention for superior question generation performance and interpretable model behavior.
- We conduct comprehensive experiments on three benchmark datasets to verify the advantages of our proposed method, both in generating meaningful questions for the VQG task and in boosting existing VQA methods on the challenging zero-shot VQA task.
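The radial graph in the second bullet can be pictured as a star: the answer anchors the hub node and its relevant image regions form the spoke nodes, with attention deciding how strongly each region's message contributes to the hub update. The snippet below is a minimal NumPy sketch of one such attention-weighted aggregation step; the function name, projection matrices, and dot-product scoring rule are illustrative assumptions, not the paper's actual implementation (see `layer_vqg.py` for that).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def radial_gcn_step(answer_feat, region_feats, W_hub, W_spoke, W_att):
    """One message-passing step on a radial (star) graph.

    answer_feat:  (d,)   hub node (the answer representation)
    region_feats: (k, d) spoke nodes (k image region features)
    W_hub, W_spoke, W_att: (d, d) illustrative projection matrices
    """
    # Attention logit per region: similarity between the projected
    # answer node and each projected region node.
    scores = (region_feats @ W_att) @ (answer_feat @ W_att)  # (k,)
    alpha = softmax(scores)                                  # attention weights

    # Aggregate spoke messages into the hub, weighted by attention,
    # then combine with the hub's own transformed state (ReLU).
    msg = alpha @ (region_feats @ W_spoke)                   # (d,)
    return np.maximum(answer_feat @ W_hub + msg, 0.0)
```

With, say, 36 bottom-up region features of dimension `d`, the step returns an updated `d`-dimensional answer-centric representation that a question decoder could then condition on.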
## Code Structure
```
├── Radial-GCN/
│   ├── run_vqg.py            /* Main run file
│   ├── layer_vqg.py          /* Model layers and structure (GCN, VQG)
│   ├── dataset_vqg.py        /* Constructs the VQG dataset
│   ├── utils.py              /* Utility tools
│   ├── main.py               /* Caption evaluation
│   ├── supp_questions        /* Generates questions for the supplementary dataset for zero-shot VQA
│   ├── draw_*.py             /* Drawing and visualization
│   ├── readme.md
│   ├── ZS_VQA/
│   │   ├── data/             /* Data files for ZS-VQA
│   ├── data/                 /* Data files for training VQG
│   ├── tools/                /* Modified files from bottom-up attention
│   ├── process_image_vqg.py  /* Image preprocessing
│   ├── preprocess_text.py    /* Text preprocessing
```
## Results
**VQA2**

Method | BLEU-1 | BLEU-4 | METEOR | CIDEr | ROUGE-L
---|---|---|---|---|---
LSTM (Baseline) | 0.381 | 0.152 | 0.198 | 1.32 | 0.471
LSTM-AN (Baseline) | 0.492 | 0.228 | 0.243 | 1.62 | 0.526
SAT (ICML'15) | 0.494 | 0.231 | 0.244 | 1.65 | 0.534
IVQA (CVPR'18) | 0.502 | 0.239 | 0.257 | 1.84 | 0.553
iQAN (CVPR'18) | 0.526 | 0.271 | 0.268 | 2.09 | 0.568
Ours (w/o attention) | 0.529 | 0.273 | 0.269 | 2.09 | 0.570
Ours | 0.534 | 0.279 | 0.271 | 2.10 | 0.572

**Visual7W**

Method | BLEU-1 | BLEU-4 | METEOR | CIDEr | ROUGE-L
---|---|---|---|---|---
LSTM (Baseline) | 0.447 | 0.202 | 0.192 | 1.13 | 0.468
LSTM-AN (Baseline) | 0.463 | 0.219 | 0.229 | 1.34 | 0.501
SAT (ICML'15) | 0.467 | 0.223 | 0.234 | 1.34 | 0.503
IVQA (CVPR'18) | 0.472 | 0.227 | 0.237 | 1.36 | 0.508
iQAN (CVPR'18) | 0.488 | 0.231 | 0.251 | 1.44 | 0.520
Ours (w/o attention) | 0.494 | 0.233 | 0.257 | 1.47 | 0.524
Ours | 0.501 | 0.236 | 0.259 | 1.52 | 0.527
Model | Bottom-up (VQA) | BAN (VQA) | IVQA (VQG) | Ours (VQG) | VQA val Acc@1 | VQA val Acc@Hum | Norm test Acc@1 | Norm test Acc@Hum | ZS-VQA test Acc@1 | ZS-VQA test Acc@Hum
---|---|---|---|---|---|---|---|---|---|---
1 | √ | | | | 59.6 | 66.6 | 48.8 | 56.9 | 0 | 0
2 | √ | | √ | | 59.0 | 66.1 | 48.3 | 56.0 | 29.2 | 39.4
3 | √ | | | √ | 59.1 | 66.3 | 48.3 | 56.2 | 30.1 | 40.4
4 | | √ | | | 60.6 | 67.8 | 49.8 | 58.9 | 0 | 0
5 | | √ | | √ | 60.1 | 67.5 | 49.2 | 58.7 | 30.7 | 41.3
## Visual Examples
More details can be found in our main text and supplementary material.
- View VQG Process
- View Question Distribution
- View Supp. for ZS-VQA
- View More Examples
"Q", "A" and "Q*" denote the ground-truth question, the given answer, and the generated question, respectively.