VLSP2022-VQA-OhYeah

Top-2 solution for the VLSP 2022 Multilingual Visual Question Answering task.

ViT & mT5 for EVJQVA

This project contains the source code to train a Transformer model that pairs a Vision Transformer (ViT) image encoder with mT5 for the EVJQVA task.

Requirements

Python 3.7 or newer is required. To install the necessary libraries, run the following command:

pip install -r requirements.txt

Directory Structure

  • src/dataset/: Contains code to create and split the dataset.
  • src/models/: Contains code to initialize the model (see the sketch after this list).
  • src/train.py: Contains code to train the model.
  • tests/: Contains code to test the model.
  • notebooks/: Contains a notebook to train the model.
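
The model code itself is not shown in this README. As an illustration only, the sketch below shows one way a ViT encoder could be wired to mT5 for answer generation: image patch embeddings are projected to the mT5 hidden size, concatenated with the question embeddings, and fed to the mT5 encoder-decoder. The checkpoint names, the projection layer, and this wiring are assumptions, not the actual implementation in src/models/.

```python
import torch
import torch.nn as nn
from transformers import ViTModel, MT5ForConditionalGeneration

class ViTMT5VQA(nn.Module):
    """Hypothetical sketch: a ViT image encoder feeding an mT5 seq2seq model."""

    def __init__(self,
                 vit_name="google/vit-base-patch16-224-in21k",  # assumed checkpoint
                 mt5_name="google/mt5-base"):                   # assumed checkpoint
        super().__init__()
        self.vit = ViTModel.from_pretrained(vit_name)
        self.mt5 = MT5ForConditionalGeneration.from_pretrained(mt5_name)
        # Project ViT patch embeddings to the mT5 hidden size.
        self.proj = nn.Linear(self.vit.config.hidden_size, self.mt5.config.d_model)

    def forward(self, pixel_values, input_ids, attention_mask, labels=None):
        # Encode the image into a sequence of patch embeddings.
        image_feats = self.proj(self.vit(pixel_values=pixel_values).last_hidden_state)
        # Embed the question tokens with mT5's own input embeddings.
        text_embeds = self.mt5.get_input_embeddings()(input_ids)
        # Concatenate image and question representations along the sequence axis.
        inputs_embeds = torch.cat([image_feats, text_embeds], dim=1)
        image_mask = torch.ones(image_feats.shape[:2],
                                dtype=attention_mask.dtype,
                                device=attention_mask.device)
        full_mask = torch.cat([image_mask, attention_mask], dim=1)
        # mT5 decodes the answer; passing labels makes it return the LM loss.
        return self.mt5(inputs_embeds=inputs_embeds,
                        attention_mask=full_mask,
                        labels=labels)
```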

Training the Model

To train the model, run the following command:

python src/train.py
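
The actual training loop, optimizer, and hyperparameters live in src/train.py and are not documented in this README. As a hedged illustration only, a single training step for the model sketched above might look like the following; the image processor, tokenizer, and learning rate are assumptions.

```python
import torch
from transformers import ViTImageProcessor, MT5Tokenizer

# Assumed preprocessing components; the real project may use different ones.
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-base")

model = ViTMT5VQA()  # hypothetical sketch class from the Directory Structure section
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)  # assumed learning rate

def training_step(image, question, answer):
    # Turn the raw image, question, and answer into tensors the sketched model expects.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    enc = tokenizer(question, return_tensors="pt")
    labels = tokenizer(answer, return_tensors="pt").input_ids
    # Forward pass returns the seq2seq loss because labels are provided.
    out = model(pixel_values, enc.input_ids, enc.attention_mask, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()
```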