Awesome Visual Question Answering:

A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

Contributing

Please feel free to send me pull requests or email (leungjokie@gmail.com) to add links. Markdown format:

- [Paper Name](link) - Author 1 et al, **Conference Year**. [[code]](link)
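For contributors who script their submissions, the entry format above can be generated with a small helper. This is only a sketch: the function name `format_entry`, its parameters, and the example URLs are hypothetical and not part of this repository.

```python
def format_entry(title, paper_url, first_author, venue, year, code_url=None):
    """Build one list entry in the contribution format shown above.

    All names here are illustrative assumptions, not project tooling.
    """
    # Paper link, first author, and bolded "Conference Year".
    entry = f"- [{title}]({paper_url}) - {first_author} et al, **{venue} {year}**."
    # The code link is optional; append it only when a URL is given.
    if code_url:
        entry += f" [[code]]({code_url})"
    return entry


if __name__ == "__main__":
    # Placeholder URLs; replace them with the real paper and code links.
    print(format_entry(
        "Paper Name", "https://example.com/paper", "Author 1", "CVPR", 2019,
        code_url="https://example.com/code",
    ))
```

Leaving `code_url` unset produces an entry without the trailing `[[code]]` link, which suits papers that have no public implementation.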

Change Log

Mar. 3rd, 2019: First version released.

Contributing
Change Log
Table of Contents
Papers
- Survey
- 2019
- 2018
  - NIPS 2018
  - AAAI 2018
  - IJCAI 2018
  - CVPR 2018
  - ACM MM 2018
  - ECCV 2018
  - OTHER
- 2017-2015
  - OTHER
  - ICCV 2017
VQA Challenge Leaderboard
- test-std 2018
- test-std 2017
Licenses
Reference and Acknowledgement

Papers

Survey

Visual question answering: Datasets, algorithms, and future challenges - Kushal Kafle et al, CVIU 2017.
Visual question answering: A survey of methods and datasets - Qi Wu et al, CVIU 2017.

2019

CVPR 2019

Information Maximizing Visual Question Generation - Ranjay Krishna et al, CVPR 2019. [code]
Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence - Amir Zadeh et al, CVPR 2019. [code]
Learning to Compose Dynamic Tree Structures for Visual Contexts - Kaihua Tang et al, CVPR 2019. [code]
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering - Hyeonwoo Noh et al, CVPR 2019. [code]
Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph - Yao-Hung Hubert Tsai et al, CVPR 2019. [code]
Explainable and Explicit Visual Reasoning over Scene Graphs - Jiaxin Shi et al, CVPR 2019. [code]
MUREL: Multimodal Relational Reasoning for Visual Question Answering - Remi Cadene et al, CVPR 2019. [code]
Image-Question-Answer Synergistic Network for Visual Dialog - Dalu Guo et al, CVPR 2019. [code]
RAVEN: A Dataset for Relational and Analogical Visual rEasoNing - Chi Zhang et al, CVPR 2019. [project page]

AAAI 2019

Differential Networks for Visual Question Answering - Chenfei Wu et al, AAAI 2019. [code]
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection - Hedi Ben-younes et al, AAAI 2019. [code]
Dynamic Capsule Attention for Visual Question Answering - Yiyi Zhou et al, AAAI 2019. [code]
Structured Two-stream Attention Network for Video Question Answering - Lianli Gao et al, AAAI 2019. [code]
Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering - Xiangpeng Li et al, AAAI 2019. [code]
WK-VQA: World Knowledge-enabled Visual Question Answering - Sanket Shah et al, AAAI 2019. [code]
Free VQA Models from Knowledge Inertia by Pairwise Inconformity Learning - Yiyi Zhou et al, AAAI 2019. [code]

OTHER

Focal Visual-Text Attention for Memex Question Answering - Junwei Liang et al, TPAMI 2019. [code]
Combining Multiple Cues for Visual Madlibs Question Answering - Tatiana Tommasi et al, IJCV 2019. [code]
Large-Scale Answerer in Questioner's Mind for Visual Dialog Question Generation - Sang-Woo Lee et al, ICLR 2019. [code]

2018

2017-2015

OTHER

Please check the list of other VQA papers from 2015-2017 in awesome-vqa from JamesChuanggg. It seems that he hasn't maintained that project for a long time. I really appreciate his work and will merge it into this list in the future. Stay tuned...

ICCV 2017

Learning to Reason: End-to-End Module Networks for Visual Question Answering - Ronghang Hu et al, ICCV 2017. [code]
Structured Attentions for Visual Question Answering - Chen Zhu et al, ICCV 2017. [code]
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation - Chuang Gan et al, ICCV 2017. [code]
Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering - Zhou Yu et al, ICCV 2017. [code]
An Analysis of Visual Question Answering Algorithms - Kushal Kafle et al, ICCV 2017. [code]
MUTAN: Multimodal Tucker Fusion for Visual Question Answering - Hedi Ben-younes et al, ICCV 2017. [code]
MarioQA: Answering Questions by Watching Gameplay Videos - Jonghwan Mun et al, ICCV 2017. [code]
Learning to Disambiguate by Asking Discriminative Questions - Yining Li et al, ICCV 2017. [code]

VQA Challenge Leaderboard

I will collect implementations of the leaderboard entries in the future. Stay tuned...

test-std 2018

VQA Challenge 2018 Leaderboard on EvalAI

test-std 2017

VQA Challenge 2017 (Open-Ended) Leaderboard on EvalAI

Licenses

To the extent possible under law, Jokie Leung has waived all copyright and related or neighboring rights to this work.

Reference and Acknowledgement

awesome-image-captioning from Zhihong Chen
awesome-vqa from JamesChuanggg

I really appreciate their contributions in this area.
