A TensorLayer implementation of Neural Module Networks.
This project follows "Learning to Reason: End-to-End Module Networks for Visual Question Answering". It implements the model for the SHAPES dataset with TensorLayer and trains it with ground-truth layouts (behavioral cloning from expert). It is based on the following paper:
- R. Hu, J. Andreas, M. Rohrbach, T. Darrell, K. Saenko, Learning to Reason: End-to-End Module Networks for Visual Question Answering. In ICCV, 2017.
@inproceedings{hu2017learning,
title={Learning to Reason: End-to-End Module Networks for Visual Question Answering},
author={Hu, Ronghang and Andreas, Jacob and Rohrbach, Marcus and Darrell, Trevor and Saenko, Kate},
booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
year={2017}
}
- Install Python 3 (Anaconda recommended: https://www.continuum.io/downloads).
- Install TensorFlow v1.0.0 (Note: newer or older versions of TensorFlow may fail to work due to incompatibility with TensorFlow Fold):
pip install tensorflow-gpu==1.0.0
- Install TensorFlow Fold (needed to run dynamic graphs):
pip install https://storage.googleapis.com/tensorflow_fold/tensorflow_fold-0.0.1-py3-none-linux_x86_64.whl
- Install TensorLayer 1.4.1:
pip install tensorlayer==1.4.1
- Install CUDA and cuDNN (in Anaconda):
conda install cuda
conda install cudnn==5.1
- Download this repository or clone it with Git, and then enter the root directory of the repository:
git clone https://github.com/jiaqi-xi/Neural-Module-Networks.Tensorlayer.git
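Because the steps above pin exact versions, it can help to confirm what is actually importable before training. The following is a minimal sketch and not part of this repository: the helper names `installed_versions` and `find_mismatches` are hypothetical, and it assumes each package exposes a `__version__` attribute (a package without one is reported as `None`).

```python
import importlib

# Versions pinned by the installation steps above.
REQUIRED = {"tensorflow": "1.0.0", "tensorlayer": "1.4.1"}


def installed_versions(names):
    """Return {name: version or None} by importing each package.

    None means the package is missing or exposes no __version__ attribute.
    """
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", None)
    return found


def find_mismatches(installed, required=REQUIRED):
    """Return {name: (installed, required)} for every package that differs."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in required.items()
        if installed.get(name) != wanted
    }
```

Calling `find_mismatches(installed_versions(REQUIRED))` returns an empty dict when both pinned versions match, and otherwise maps each offending package to the version found and the version expected.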
A copy of the SHAPES dataset is contained in this repository under exp_shapes/shapes_dataset. The ground-truth module layouts (expert layouts) we use in our experiments are also provided under exp_shapes/data/*_symbols.json. The script to obtain the expert layouts from the annotations is in exp_shapes/data/get_ground_truth_layout.ipynb.
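To inspect the provided expert layouts, the files matching exp_shapes/data/*_symbols.json can be loaded with the standard library. This is only a sketch: the function name `load_layout_files` is hypothetical, and nothing is assumed about the JSON structure inside each file, so the parsed content is returned unchanged.

```python
import glob
import json
import os


def load_layout_files(data_dir="exp_shapes/data"):
    """Load every *_symbols.json file in data_dir, keyed by file name.

    No assumption is made about the JSON structure; each file is returned
    exactly as json.load parses it.
    """
    layouts = {}
    pattern = os.path.join(data_dir, "*_symbols.json")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            layouts[os.path.basename(path)] = json.load(f)
    return layouts
```

Run from the repository root, `load_layout_files()` returns one entry per layout file, which makes it easy to print the file names and look at the first few parsed entries.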
- Add the root of this repository to PYTHONPATH:
export PYTHONPATH=.:$PYTHONPATH
- Train with ground-truth layout (behavioral cloning from expert):
python exp_shapes/train_shapes_gt_layout.py
Note: by default, the above script uses GPU 0. To train on a different GPU, set the --gpu_id flag. During training, the script writes TensorBoard events to exp_shapes/tb/ and saves the snapshots under exp_shapes/tfmodel/.
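The two training steps above can also be driven from Python, which is convenient when launching runs on a specific GPU. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and it assumes `--gpu_id` takes the GPU index as its value (for example `--gpu_id 1`).

```python
import os
import subprocess
import sys


def build_train_command(gpu_id=0):
    """Return the argument list for the training script on the given GPU."""
    return [
        sys.executable,
        "exp_shapes/train_shapes_gt_layout.py",
        "--gpu_id", str(gpu_id),  # assumed form: flag followed by the index
    ]


def env_with_repo_root(repo_root=".", base_env=None):
    """Return a copy of the environment with repo_root prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH")
    if existing:
        env["PYTHONPATH"] = repo_root + os.pathsep + existing
    else:
        env["PYTHONPATH"] = repo_root
    return env


def run_training(gpu_id=0, repo_root="."):
    """Run training from repo_root; raises CalledProcessError on failure."""
    subprocess.check_call(
        build_train_command(gpu_id),
        cwd=repo_root,
        env=env_with_repo_root("."),
    )
```

For example, `run_training(gpu_id=1)` executed from the repository root is intended to mirror the export and python commands above, with GPU 1 selected instead of the default.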
- Add the root of this repository to PYTHONPATH (skip this if it is already set in the current shell):
export PYTHONPATH=.:$PYTHONPATH
- Evaluate shapes_gt_layout (behavioral cloning from expert):
python exp_shapes/eval_shapes.py --exp_name shapes_gt_layout --snapshot_name 00040000 --test_split test
Note: the evaluation script prints the accuracy and also saves it under exp_shapes/results/. By default, it uses GPU 0 and evaluates on the test split of SHAPES. To evaluate on a different GPU, set the --gpu_id flag. To evaluate on the validation split, use --test_split val instead.
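When evaluating on both splits or on a different GPU, assembling the command in one place avoids typos in the flags. The sketch below is not part of this repository: `build_eval_command` is a hypothetical helper, its defaults are copied from the command above, and it assumes `--gpu_id` takes the GPU index as its value.

```python
import sys


def build_eval_command(snapshot_name="00040000", test_split="test",
                       exp_name="shapes_gt_layout", gpu_id=None):
    """Return the argument list for exp_shapes/eval_shapes.py."""
    cmd = [
        sys.executable, "exp_shapes/eval_shapes.py",
        "--exp_name", exp_name,
        "--snapshot_name", snapshot_name,
        "--test_split", test_split,
    ]
    # Only pass --gpu_id when requested, so the script's default GPU is kept.
    if gpu_id is not None:
        cmd += ["--gpu_id", str(gpu_id)]
    return cmd
```

Passing the result to `subprocess.check_call` from the repository root, once with `test_split="test"` and once with `test_split="val"`, would evaluate the same snapshot on both splits.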