affordance-net: A Jupyter Notebook repository from nqanh

AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection

By Thanh-Toan Do*, Anh Nguyen*, Ian Reid (* equal contribution)

Requirements
Installation
Demo
Training
Notes

Requirements

Caffe
- Install Caffe: Caffe installation instructions.
- Caffe must be built with support for Python layers.
Hardware
- To train a full AffordanceNet, you'll need a GPU with ~11GB of memory (e.g. Titan, K20, K40, Tesla, ...).
- To test a full AffordanceNet, you'll need a GPU with ~6GB of memory.
[Optional] For robotic demo
- ROS Indigo
- rospy
- OpenNI
- PrimeSensor

Installation

Clone the AffordanceNet repository into your $AffordanceNet_ROOT folder.
Build Caffe and pycaffe:
- cd $AffordanceNet_ROOT/caffe-affordance-net
- # Now follow the Caffe installation instructions: http://caffe.berkeleyvision.org/installation.html
- # If you're experienced with Caffe and have all of the requirements installed and your Makefile.config in place, then simply do:
- make -j8 && make pycaffe
Build the Cython modules:
- cd $AffordanceNet_ROOT/lib
- make
Download pretrained weights (Google Drive, One Drive). These weights were trained on the training set of the IIT-AFF dataset:
- Extract the downloaded file to $AffordanceNet_ROOT
- Make sure the caffemodel file is located at: $AffordanceNet_ROOT/pretrained/AffordanceNet_200K.caffemodel
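A quick path check can confirm that the weights ended up where the step above expects them. This is a minimal sketch, assuming `AffordanceNet_ROOT` is exported as an environment variable; the helper name `weights_path` is hypothetical and not part of the repository.

```python
import os


def weights_path(root):
    """Return the expected location of the pretrained caffemodel under `root`."""
    return os.path.join(root, "pretrained", "AffordanceNet_200K.caffemodel")


if __name__ == "__main__":
    # Assumption: AffordanceNet_ROOT is set in the shell; fall back to the current directory.
    root = os.environ.get("AffordanceNet_ROOT", ".")
    path = weights_path(root)
    status = "found" if os.path.isfile(path) else "MISSING"
    print("{}: {}".format(path, status))
```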

Demo

After successfully completing installation, you'll be ready to run the demo.

Export pycaffe path:
- export PYTHONPATH=$AffordanceNet_ROOT/caffe-affordance-net/python:$PYTHONPATH
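If you prefer not to modify your shell environment, the same effect can be approximated from inside a Python script by extending `sys.path` before importing caffe. The sketch below assumes `AffordanceNet_ROOT` is available as an environment variable; the helper names `pycaffe_dir` and `add_pycaffe_to_path` are hypothetical.

```python
import os
import sys


def pycaffe_dir(root):
    """Directory that the export line above adds to PYTHONPATH."""
    return os.path.join(root, "caffe-affordance-net", "python")


def add_pycaffe_to_path(root):
    """Prepend the pycaffe directory to sys.path if it is not already there."""
    path = pycaffe_dir(root)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


if __name__ == "__main__":
    # Assumption: AffordanceNet_ROOT is set in the shell; fall back to the current directory.
    root = os.environ.get("AffordanceNet_ROOT", ".")
    print(add_pycaffe_to_path(root))
    # `import caffe` should resolve after this point, provided `make pycaffe` succeeded.
```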
Demo on static images:
- cd $AffordanceNet_ROOT/tools
- python demo_img.py
- You should see the detected objects and their affordances.
(Optional) Demo on depth camera (such as Asus Xtion):
- With AffordanceNet and the depth camera, you can easily select the object of interest and its affordances for robotic applications such as grasping, pouring, etc.
- First, launch your depth camera with ROS, OpenNI, etc.
- cd $AffordanceNet_ROOT/tools
- python demo_asus.py
- You may want to change the object id and/or affordance id (lines 380 and 381 in demo_asus.py). Currently, we select the bottle and its grasp affordance.
- The 3D grasp pose can be visualized with rviz.

Training

We train AffordanceNet on the IIT-AFF dataset:
- The IIT-AFF dataset must be formatted like the Pascal-VOC dataset for training.
- For your convenience, we have already done this. Just download this file (Google Drive, One Drive) and extract it into your $AffordanceNet_ROOT folder.
- The extracted folder should contain three sub-folders: $AffordanceNet_ROOT/data/cache, $AffordanceNet_ROOT/data/imagenet_models, and $AffordanceNet_ROOT/data/VOCdevkit2012.
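Before starting a long training run, it can help to confirm that the three sub-folders listed above are in place. The following is a small sketch, not a script from the repository; `EXPECTED_DATA_DIRS` and `missing_data_dirs` are hypothetical names, and `AffordanceNet_ROOT` is assumed to be an environment variable.

```python
import os

# Sub-folders that the extracted archive should provide under $AffordanceNet_ROOT/data.
EXPECTED_DATA_DIRS = ("cache", "imagenet_models", "VOCdevkit2012")


def missing_data_dirs(root):
    """Return the expected sub-folders of `root`/data that do not exist."""
    data = os.path.join(root, "data")
    return [d for d in EXPECTED_DATA_DIRS
            if not os.path.isdir(os.path.join(data, d))]


if __name__ == "__main__":
    # Assumption: AffordanceNet_ROOT is set in the shell; fall back to the current directory.
    root = os.environ.get("AffordanceNet_ROOT", ".")
    missing = missing_data_dirs(root)
    print("all data folders present" if not missing
          else "missing: {}".format(", ".join(missing)))
```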
Train AffordanceNet:
- cd $AffordanceNet_ROOT
- ./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]
- e.g.: ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16 pascal_voc
- We use the pascal_voc alias even though we're training on the IIT-AFF dataset.
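When training is launched from another Python script (for example, to queue several runs), the invocation can be assembled as an argument list. This is only a sketch that mirrors the example command above; `training_command` and `run_training` are hypothetical helpers, and the argument order is taken from that example rather than from the script's source.

```python
import subprocess

SCRIPT = "./experiments/scripts/faster_rcnn_end2end.sh"


def training_command(gpu_id=0, net="VGG16", dataset="pascal_voc", extra=()):
    """Build the argument list matching the example invocation above."""
    return [SCRIPT, str(gpu_id), net, dataset] + list(extra)


def run_training(root, **kwargs):
    """Run the training script with `root` (the repository root) as working directory."""
    subprocess.check_call(training_command(**kwargs), cwd=root)


if __name__ == "__main__":
    # Print the command only; call run_training(...) to actually start training.
    print(" ".join(training_command()))
```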

Notes

AffordanceNet vs. Mask-RCNN: AffordanceNet can be considered a generalization of Mask-RCNN to the case where each instance contains multiple classes.
The current network architecture is slightly different from the paper, but it achieves the same accuracy.
Train AffordanceNet on your data:
- Format your images as in the Pascal-VOC dataset (see the $AffordanceNet_ROOT/data/VOCdevkit2012 folder).
- Prepare the affordance masks (see the $AffordanceNet_ROOT/data/cache folder): for each object in the image, create a mask and save it as a .sm file. See $AffordanceNet_ROOT/utils for details.
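To make the Mask-RCNN comparison in these notes concrete: a Mask-RCNN style mask marks each pixel of an instance as object or background, whereas an affordance mask assigns each pixel of the instance one of several affordance labels. The toy sketch below uses hypothetical label ids (0 for background, 1 and 2 for two affordances) and plain nested lists. It does not reproduce the repository's .sm format or the dataset's actual label ids.

```python
# Hypothetical label ids for illustration only; the real ids are defined by the dataset.
BACKGROUND, AFF_A, AFF_B = 0, 1, 2

# A 4x5 affordance mask for a single object instance.
affordance_mask = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 2, 0, 0],
]


def to_binary(mask):
    """Collapse a multi-class mask to a foreground/background mask."""
    return [[1 if v != BACKGROUND else 0 for v in row] for row in mask]


def pixels_with(mask, label):
    """Return (row, col) coordinates of every pixel carrying `label`."""
    return [(r, c)
            for r, row in enumerate(mask)
            for c, v in enumerate(row)
            if v == label]


if __name__ == "__main__":
    print(to_binary(affordance_mask))
    print(pixels_with(affordance_mask, AFF_B))
```

Collapsing the labels with `to_binary` recovers the single-class case, while `pixels_with` picks out the region of one affordance, which is the kind of selection a downstream application such as grasping would need.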

Citing AffordanceNet

If you find AffordanceNet useful in your research, please consider citing:

@inproceedings{AffordanceNet18,
  title={AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection},
  author={Do, Thanh-Toan and Nguyen, Anh and Reid, Ian},
  booktitle={International Conference on Robotics and Automation (ICRA)},
  year={2018}
}

If you use the IIT-AFF dataset, please consider citing:

@inproceedings{Nguyen17,
  title={Object-Based Affordances Detection with Convolutional Neural Networks and Dense Conditional Random Fields},
  author={Nguyen, Anh and Kanoulas, Dimitrios and Caldwell, Darwin G and Tsagarakis, Nikos G},
  booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2017}
}

License

MIT License

Acknowledgement

This repo uses a lot of source code from Faster-RCNN.
