This repo contains the work done for the ImageCLEF 2019 VQA-Med Q&A challenge.
As specified on the ImageCLEF site, each training sample consists of an image, a natural-language question about that image, and an answer to the question.
The task is to predict the answer for similar data from which the answer was omitted.
Outline of the work done:
- [Bringing data to expected format](https://github.com/turner11/VQA-MED/blob/master/VQA-MED/VQA.Python/0_bringing_data_to_expected_format.ipynb)
- Pre process data (Clean + Enrich data)
- Data augmentation
- Create meta data
- Create the model
- Train the model
- Predict
- Create a submission in the expected format
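The data-oriented steps above (pre-processing, metadata creation, and submission formatting) can be sketched roughly as follows. This is a minimal illustration, not the repo's actual API: the function names, the vocabulary scheme, and the `<image_id>|<answer>` submission layout are assumptions for the sake of the example.

```python
# Hypothetical sketch of the pipeline steps; names and formats are
# illustrative assumptions, not the repo's actual implementation.
import re


def clean_question(text):
    # Pre-process: lowercase, strip punctuation, collapse whitespace
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def build_vocabulary(questions):
    # Create metadata: map each distinct word to an integer index
    words = sorted({w for q in questions for w in q.split()})
    return {w: i for i, w in enumerate(words)}


def format_submission(rows):
    # Assumed submission layout: one "<image_id>|<answer>" pair per line
    return "\n".join(f"{image_id}|{answer}" for image_id, answer in rows)


questions = [clean_question(q) for q in
             ["What imaging modality is used?", "Is this an MRI?"]]
vocab = build_vocabulary(questions)
print(questions[1])                                  # "is this an mri"
print(format_submission([("synpic100", "mri")]))     # "synpic100|mri"
```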
For more information, please read the following paper:
Avi Turner, Assaf Spanier. "LSTM in VQA-Med, is It Really Needed? JCE Study on the ImageCLEF 2019 Dataset." CLEF (Working Notes) 2019
Please also cite this paper if you use this work in your research!