This repository holds the code for my master's report, completed in May 2019 for the MSIS degree with a specialization in Machine Learning at UT-Austin. The dataset used in this project will be published with the paper.
This research contributes to the understanding of the unique information needs and challenges faced by blind users, with the goal of improving the status quo of visual assistive technologies.
Image-question pairs from the VizWiz datasets pose significant challenges to existing machine learning algorithms because of inconsistent image quality and colloquial language. Visual question answering (VQA) tasks that involve crowdsourcing and community answering can also be better divided and assigned to appropriate crowd workers based on their experience, preferences, and skills. An algorithm that identifies the skills a question involves could potentially be transferred to other tasks that combine machine and human computation to better assist visually impaired users.
The original VQA evaluation repo is available at https://github.com/GT-Vision-Lab/VQA
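For orientation, that evaluation scores a predicted answer against the set of crowd answers collected for each question. Below is a minimal sketch of the commonly quoted form of the VQA accuracy metric; the function name is mine, and the official script additionally normalizes answers and averages over annotator subsets:

```python
def vqa_accuracy(predicted_answer, human_answers):
    """Commonly quoted VQA accuracy: full credit if at least 3 of the
    (typically 10) human annotators gave the predicted answer, and
    proportional partial credit below that threshold."""
    matches = sum(1 for ans in human_answers if ans == predicted_answer)
    return min(matches / 3.0, 1.0)
```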
- python 3.7.2
- scikit-image 0.14.0
- tensorflow 1.13
- keras 2.2.4
- scipy 1.2.0
- pandas 0.24.0
- opencv-python (imported as `cv2`)
```shell
pip3 install --upgrade pip
pip3 install -r requirements.txt
```
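The `requirements.txt` itself is not reproduced here; a version consistent with the dependency list above might look like the following. The exact patch pins, and `opencv-python` as the package providing `cv2`, are assumptions:

```
scikit-image==0.14.0
tensorflow==1.13.1
keras==2.2.4
scipy==1.2.0
pandas==0.24.0
opencv-python
```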