WahajAB/Real-World-VQA

Visual Question Answering (VQA) on real-world images, combining convolutional neural networks (CNNs) with transformer-based language models. Our architecture integrates ResNet50 for image feature extraction and BERT for question encoding. We evaluate the model on the processed DAQUAR dataset.
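
The description does not say how the image and question features are combined, so the following is only a minimal sketch of one common option: concatenate the two vectors and apply a linear classifier over a fixed answer vocabulary. It assumes the 2048-dimensional pooled output of ResNet50 and the 768-dimensional `[CLS]` embedding of a BERT-base model. The names `fuse_and_classify`, `answers`, and the random features and weights are hypothetical stand-ins, not code from this repository.

```python
import math
import random

IMAGE_DIM = 2048  # size of ResNet50's pooled feature vector
TEXT_DIM = 768    # hidden size of BERT-base (assumed model size)


def softmax(logits):
    """Turn raw scores into probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(image_feat, question_feat, weights, bias):
    """Concatenate both feature vectors and score every candidate answer.

    Concatenation followed by one linear layer is an assumed fusion
    strategy, chosen here only for illustration.
    """
    if len(image_feat) != IMAGE_DIM or len(question_feat) != TEXT_DIM:
        raise ValueError("unexpected feature size")
    fused = list(image_feat) + list(question_feat)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Hypothetical answer vocabulary; random values stand in for real
# ResNet50/BERT features and for trained classifier weights.
rng = random.Random(0)
answers = ["chair", "table", "lamp"]
image_feat = [rng.gauss(0, 1) for _ in range(IMAGE_DIM)]
question_feat = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
weights = [
    [rng.gauss(0, 0.01) for _ in range(IMAGE_DIM + TEXT_DIM)]
    for _ in answers
]
bias = [0.0] * len(answers)

probs = fuse_and_classify(image_feat, question_feat, weights, bias)
predicted = answers[probs.index(max(probs))]
```

In a real pipeline the two feature vectors would come from the pretrained encoders and the classifier weights would be learned during training; treating VQA as classification over a closed answer set is itself an assumption here.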

Jupyter Notebook