Relation-Networks

Keras implementation of Relation Networks for Visual Question Answering using the CLEVR dataset.

Relation Networks

"A simple neural network module for relational reasoning" by Adam Santoro, David Raposo, David G.T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, and Timothy Lillicrap: https://arxiv.org/pdf/1706.01427.pdf

A Relation Network (RN) is a neural network module specialized for learning relations, just as convolutional kernels are specialized for processing images. RNs are useful for Visual Question Answering, where the paper above reports state-of-the-art results on CLEVR.
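To make the idea concrete, the paper defines the module as RN(O) = f(sum over i, j of g(o_i, o_j)), where g scores each pair of objects and f maps the summed relations to an output. The following is a minimal NumPy sketch of that formula, not the Keras code in this repository: the helper names (`mlp`, `relation_network`), the layer sizes, and the random weights are all illustrative assumptions.

```python
import numpy as np

def mlp(x, weights):
    """Apply dense layers with ReLU between them (no biases, for brevity)."""
    for i, w in enumerate(weights):
        x = x @ w
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def relation_network(objects, g_weights, f_weights):
    """RN(O) = f(sum over all ordered pairs (i, j) of g([o_i, o_j]))."""
    n, _ = objects.shape
    # Build every ordered pair (o_i, o_j): shape (n * n, 2 * d).
    left = np.repeat(objects, n, axis=0)   # each o_i repeated n times
    right = np.tile(objects, (n, 1))       # o_0 .. o_{n-1} cycled n times
    pairs = np.concatenate([left, right], axis=1)
    relations = mlp(pairs, g_weights)      # one relation vector per pair
    # Summing over pairs makes the result independent of object order.
    return mlp(relations.sum(axis=0), f_weights)

rng = np.random.default_rng(0)
objects = rng.normal(size=(5, 8))  # 5 hypothetical objects, 8 features each
g_weights = [rng.normal(size=(16, 32)), rng.normal(size=(32, 32))]
f_weights = [rng.normal(size=(32, 32)), rng.normal(size=(32, 10))]

output = relation_network(objects, g_weights, f_weights)
print(output.shape)  # (10,)
```

Because the pairwise outputs are summed, shuffling the rows of `objects` should leave `output` unchanged up to floating-point rounding.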

My Keras implementation uses the Functional API to define the network. It generalizes RNs with a selection kernel that picks k distinct random objects from the processed image tensor instead of using all n^2 objects. This yields far fewer relation vectors when k << n^2.
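The effect of the selection kernel on the number of relation vectors can be sketched in NumPy. This is an illustration rather than the repository's Keras layer: `select_objects` is a hypothetical helper, the sizes are made up, and it assumes the k selected objects are then paired with each other.

```python
import numpy as np

def select_objects(objects, k, rng):
    """Pick k distinct objects (rows) uniformly at random, without replacement."""
    indices = rng.choice(objects.shape[0], size=k, replace=False)
    return objects[indices]

rng = np.random.default_rng(0)
n = 8                                   # hypothetical feature-map side length
objects = rng.normal(size=(n * n, 24))  # n^2 = 64 objects, 24 features each
selected = select_objects(objects, k=16, rng=rng)

# The number of ordered pairs grows quadratically with the object count.
print(objects.shape[0] ** 2)   # 4096 pairs when all 64 objects are used
print(selected.shape[0] ** 2)  # 256 pairs when only 16 objects are kept
```

Sampling without replacement is what keeps the k objects distinct; a fresh sample could be drawn for each batch or fixed once, and which of these the repository does is not stated here.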

CLEVR.py

Visual Question Answering implemented on the CLEVR dataset (https://cs.stanford.edu/people/jcjohns/clevr/).

[Image: example CLEVR scene]

Above is an example image, which could be paired with the following question: "Are there an equal number of large things and metal spheres?"

Architecture

Images are processed by a CNN, while questions are processed by an LSTM. The resulting tensors are then decomposed into objects and fed as input to the RN module.

[Image: architecture]
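One way to picture the decomposition step is the NumPy sketch below. It follows the scheme described in the paper, where each spatial cell of the CNN feature map becomes one object tagged with its coordinates and the question embedding is appended to every object pair. The function names, shapes, and the [-1, 1] coordinate range are assumptions for illustration and may differ from what CLEVR.py does.

```python
import numpy as np

def feature_map_to_objects(feature_map):
    """Turn an (H, W, C) feature map into H * W objects of size C + 2."""
    h, w, c = feature_map.shape
    # Tag each cell with its (row, column) position, scaled to [-1, 1].
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    coords = np.stack([ys, xs], axis=-1)
    tagged = np.concatenate([feature_map, coords], axis=-1)
    return tagged.reshape(h * w, c + 2)

def build_rn_inputs(objects, question):
    """Concatenate every ordered object pair with the question embedding."""
    n, _ = objects.shape
    left = np.repeat(objects, n, axis=0)
    right = np.tile(objects, (n, 1))
    q = np.tile(question, (n * n, 1))  # same question vector for every pair
    return np.concatenate([left, right, q], axis=1)

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 8, 24))  # stand-in for the CNN output
question = rng.normal(size=(128,))         # stand-in for the final LSTM state

objects = feature_map_to_objects(feature_map)
rn_inputs = build_rn_inputs(objects, question)
print(objects.shape)    # (64, 26)
print(rn_inputs.shape)  # (4096, 180)
```

Each row of `rn_inputs` is what the pairwise network g would receive: two tagged objects followed by the question vector.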

Experiments

60000 Questions / 6000 Images

Training an RN on 10% of the training data reaches ~80% accuracy after 100 epochs, which suggests that a random entity selection kernel still gives compelling results after a short training period.

Accuracy Plot

[Image: accuracy plot]

Loss Plot

[Image: loss plot]

Misc

MNIST.py

Implementation of Relation Networks on MNIST, demonstrating a simpler RN architecture.

RN.py

First working prototype.