vqav2
There are 8 repositories under the vqav2 topic.
rentainhe/TRAR-VQA
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
phiyodr/vqaloader
PyTorch DataLoader for many VQA datasets
vtu81/NaiveVQA
A Visual Question Answering model implemented in MindSpore and PyTorch. The model is a reimplementation of the paper *Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering*. It is our final project for the DL4NLP course at ZJU.
williamcfrancis/Visual-Question-Answering-using-Stacked-Attention-Networks
PyTorch implementation of VQA using Stacked Attention Networks: a multimodal architecture that encodes the image with a CNN and the question with an LSTM, then applies stacked attention layers to improve accuracy (54.82%). Includes visualization of the attention layers. Uses the VQA v2.0 dataset. Contributions welcome.
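The stacked attention idea behind this repo (from Yang et al.'s Stacked Attention Networks) can be sketched as below. This is an illustrative PyTorch sketch, not the repository's actual code; the dimensions, layer names, and two-hop setup are assumptions for demonstration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StackedAttention(nn.Module):
    """One attention hop over image regions, conditioned on the question.

    Hypothetical shapes: image features v are (batch, regions, d),
    the question vector u is (batch, d).
    """
    def __init__(self, d: int, k: int):
        super().__init__()
        self.w_v = nn.Linear(d, k, bias=False)  # project image regions
        self.w_u = nn.Linear(d, k)              # project question vector
        self.w_p = nn.Linear(k, 1)              # score each region

    def forward(self, v: torch.Tensor, u: torch.Tensor) -> torch.Tensor:
        # Combine region and question signals: (batch, regions, k)
        h = torch.tanh(self.w_v(v) + self.w_u(u).unsqueeze(1))
        # Attention distribution over regions: (batch, regions)
        p = F.softmax(self.w_p(h).squeeze(-1), dim=1)
        # Attention-weighted image summary, added to the query
        v_tilde = (p.unsqueeze(-1) * v).sum(dim=1)
        return u + v_tilde  # refined query for the next hop

# Two hops refine the question query twice, as in SAN.
v = torch.randn(2, 49, 512)   # e.g. 7x7 grid of CNN features
u = torch.randn(2, 512)       # LSTM question encoding
att1 = StackedAttention(512, 256)
att2 = StackedAttention(512, 256)
u1 = att1(v, u)
u2 = att2(v, u1)              # (2, 512), fed to the answer classifier
```

Each hop sharpens the attention map: the first hop attends with the raw question encoding, the second with a query already enriched by image evidence.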
itsShnik/adaptively-finetuning-transformers
Adaptively fine-tuning transformer-based models for multiple domains and multiple tasks
BrightQin/RWSAN
Official implementation of "Deep Residual Weight-Sharing Attention Network with Low-Rank Attention for Visual Question Answering" (RWSAN) published in the IEEE Transactions on Multimedia (TMM), 2022.
rentainhe/TRAR-Feature-Extraction
Grid features extraction for ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
shreyas21563/VQA-using-BLIP
Leveraging the BLIP Model for Visual Question Answering: A Comparative Analysis on VQA and DAQUAR Datasets