vqav2
There are 8 repositories under the vqav2 topic.
rentainhe/TRAR-VQA
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
phiyodr/vqaloader
PyTorch DataLoader for many VQA datasets
vtu81/NaiveVQA
A Visual Question Answering model implemented in MindSpore and PyTorch. The model is a reimplementation of the paper *Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering*. It is our final project for the DL4NLP course at ZJU.
williamcfrancis/Visual-Question-Answering-using-Stacked-Attention-Networks
PyTorch implementation of VQA using Stacked Attention Networks: a multimodal architecture that encodes the image with a CNN and the question with an LSTM, then applies stacked attention layers to improve accuracy (54.82%). Includes visualization of the attention layers. Uses the VQA v2.0 dataset. Contributions welcome.
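The stacked attention idea behind this repo (from Yang et al.'s Stacked Attention Networks) can be sketched as below. This is an illustrative PyTorch sketch, not the repository's actual code; the dimensions, layer names, and two-hop setup are assumptions for demonstration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StackedAttention(nn.Module):
    """One attention hop over image regions, conditioned on the question.

    Hypothetical shapes: image features v are (batch, regions, d),
    the question vector u is (batch, d).
    """
    def __init__(self, d: int, k: int):
        super().__init__()
        self.w_v = nn.Linear(d, k, bias=False)  # project image regions
        self.w_u = nn.Linear(d, k)              # project question vector
        self.w_p = nn.Linear(k, 1)              # score each region

    def forward(self, v: torch.Tensor, u: torch.Tensor) -> torch.Tensor:
        # Combine region and question signals: (batch, regions, k)
        h = torch.tanh(self.w_v(v) + self.w_u(u).unsqueeze(1))
        # Attention distribution over regions: (batch, regions)
        p = F.softmax(self.w_p(h).squeeze(-1), dim=1)
        # Attention-weighted image summary, added to the query
        v_tilde = (p.unsqueeze(-1) * v).sum(dim=1)
        return u + v_tilde  # refined query for the next hop

# Two hops refine the question query twice, as in SAN.
v = torch.randn(2, 49, 512)   # e.g. 7x7 grid of CNN features
u = torch.randn(2, 512)       # LSTM question encoding
att1 = StackedAttention(512, 256)
att2 = StackedAttention(512, 256)
u1 = att1(v, u)
u2 = att2(v, u1)              # (2, 512), fed to the answer classifier
```

Each hop sharpens the attention map: the first hop attends with the raw question encoding, the second with a query already enriched by image evidence.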
itsShnik/adaptively-finetuning-transformers
Adaptively fine-tuning transformer-based models for multiple domains and multiple tasks
BrightQin/RWSAN
Official implementation of "Deep Residual Weight-Sharing Attention Network with Low-Rank Attention for Visual Question Answering" (RWSAN) published in the IEEE Transactions on Multimedia (TMM), 2022.
rentainhe/TRAR-Feature-Extraction
Grid features extraction for ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
shreyas21563/VQA-using-BLIP
Leveraging the BLIP Model for Visual Question Answering: A Comparative Analysis on VQA and DAQUAR Datasets