
Neuro-Symbolic AI for Visual Question Answering

Sort-of-CLEVR Dataset

Neuro-Symbolic AI allows us to combine Deep Learning’s superior pattern recognition abilities with the reasoning abilities of symbolic methods like program synthesis. This repository is an implementation of NSAI for Visual Question Answering on the Sort-of-CLEVR dataset using PyTorch. This implementation is inspired by the Neuro-Symbolic VQA paper by MIT-IBM Watson AI Lab.

The core idea behind using NSAI for VQA is to parse the visual scene into a symbolic representation and to parse the natural-language question into an executable program, which the program executor then runs on the scene to produce the answer. This implementation achieves over 99% accuracy on the Sort-of-CLEVR dataset.
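As a rough illustration of the pipeline described above, here is a minimal sketch (not the repository's actual API — the scene fields, operation names, and `execute` helper are hypothetical): the perception module emits a symbolic scene, the semantic parser maps a question to a small program, and the executor runs that program over the scene.

```python
# Hypothetical symbolic scene, as a perception module might output for a
# Sort-of-CLEVR image: each object has a shape, color, and position.
scene = [
    {"shape": "circle", "color": "red",   "pos": (12, 40)},
    {"shape": "square", "color": "blue",  "pos": (55, 20)},
    {"shape": "circle", "color": "green", "pos": (30, 70)},
]

# Hypothetical program: a sequence of (operation, argument) steps a
# semantic parser might produce for "What is the shape of the red object?"
program = [("filter_color", "red"), ("query_shape", None)]

def execute(program, scene):
    """Run each program step against the symbolic scene."""
    state = scene
    for op, arg in program:
        if op == "filter_color":
            state = [obj for obj in state if obj["color"] == arg]
        elif op == "query_shape":
            state = state[0]["shape"]  # assumes the filter left one object
    return state

answer = execute(program, scene)  # "circle"
```

Because reasoning happens over this explicit symbolic state rather than inside a neural network, each intermediate step of the program is inspectable.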

Requirements

  • PyTorch <= 1.7
  • torchtext <= 0.8.0
  • torchvision <= 0.8.0
  • OpenCV
  • dlib
  • scikit-learn
  • pandas
  • NumPy
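The requirements above can be installed with pip; the invocation below is a sketch matching the version pins listed (the PyPI package names for OpenCV and scikit-learn are assumed to be `opencv-python` and `scikit-learn`).

```shell
# Assumed pip install matching the pinned versions above
pip install "torch<=1.7" "torchtext<=0.8.0" "torchvision<=0.8.0" \
    opencv-python dlib scikit-learn pandas numpy
```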

Usage

  • Step-by-step usage is covered in the NSAI on Sort-of-CLEVR.ipynb notebook, from training the individual modules to plugging everything together for testing.
  • You can also run the notebook on Google Colab.
  • To learn more about the design and workflow, see NSAI Flow Diagram.pdf, which contains the workflow of every component, i.e., the Perception Module, Semantic Parser, and Program Executor.

References