PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
by Sanghoon Hong, Byungseok Roh, Kye-hyeon Kim, Yeongjae Cheon, Minje Park (Intel Imaging and Camera Technology)
Presented at EMDNN2016, a NIPS2016 workshop (arXiv link)
Introduction
This repository is a fork of py-faster-rcnn and demonstrates the performance of PVANet.
Refer to the py-faster-rcnn README.md and the faster-rcnn README.md for more information.
Disclaimer
Please note that this repository doesn't contain the in-house code used in the published article.
- This version of py-faster-rcnn is slower than our in-house runtime code (e.g. its image pre-processing code is written in Python).
- PVANet was trained with our in-house deep learning library, not with this implementation.
- There might be a tiny difference in VOC2012 test results, because some hidden parameters in py-faster-rcnn may be set differently from ours.
Citing PVANet
If you want to cite this work in your publication:
@article{hong2016pvanet,
title={{PVANet}: Lightweight Deep Neural Networks for Real-time Object Detection},
author={Hong, Sanghoon and Roh, Byungseok and Kim, Kye-Hyeon and Cheon, Yeongjae and Park, Minje},
journal={arXiv preprint arXiv:1611.08588},
year={2016}
}
Installation
-
Install Dependency
sudo pip install Cython
-
Clone the Faster R-CNN repository
# Make sure to clone with --recursive
git clone --recursive https://github.com/sanghoon/pva-faster-rcnn.git
-
We'll call the directory that you cloned Faster R-CNN into FRCN_ROOT. Build the Cython modules
cd $FRCN_ROOT/lib
make
-
Build Caffe and pycaffe
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
# For your Makefile.config:
# Uncomment `WITH_PYTHON_LAYER := 1`
cp Makefile.config.example Makefile.config
make -j8 && make pycaffe
-
Download PVANet detection model for VOC2007
cd $FRCN_ROOT
./models/pvanet/download_voc2007.sh
-
Download PVANet detection model for VOC2012 (published model)
cd $FRCN_ROOT
./models/pvanet/download_voc_best.sh
-
(Optional) Download all available models (including pre-trained and compressed models)
cd $FRCN_ROOT
./models/pvanet/download_all_models.sh
-
(Optional) Download ILSVRC2012 (ImageNet) classification model
cd $FRCN_ROOT
./models/pvanet/download_imagenet_model.sh
-
(Optional) If the scripts don't work, please download the models from ...
Model | Google Drive |
---|---|
PVANet for VOC2007 | link |
PVANet for VOC2012 | link |
PVANet for VOC2012 (compressed) | link |
PVANet for ILSVRC2012 (ImageNet) | link |
PVANet pre-trained | link |
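After working through the installation steps, a quick sanity check can catch the most common problems (a non-recursive clone, missing model files, an unbuilt pycaffe). The following is only a minimal sketch, not part of this repository. The helper name `check_install` is hypothetical, the model paths are the ones used by the test commands later in this README, and the location `caffe-fast-rcnn/python` for pycaffe is an assumption based on the usual Caffe layout.

```python
import os
import sys

# Model paths as they appear in the test commands of this README.
MODEL_FILES = [
    "models/pvanet/pva9.1/PVA9.1_ImgNet_COCO_VOC0712.caffemodel",
    "models/pvanet/pva9.1/PVA9.1_ImgNet_COCO_VOC0712plus_compressed.caffemodel",
]


def check_install(frcn_root):
    """Return a dict mapping each check name to True or False (hypothetical helper)."""
    results = {}

    # A submodule directory stays empty when the clone was not made with --recursive.
    caffe_dir = os.path.join(frcn_root, "caffe-fast-rcnn")
    results["caffe submodule"] = os.path.isdir(caffe_dir) and bool(os.listdir(caffe_dir))

    # Check that the downloaded model files are where the test commands expect them.
    for rel_path in MODEL_FILES:
        results[rel_path] = os.path.isfile(os.path.join(frcn_root, rel_path))

    # Assumption: `make pycaffe` places the Python module under caffe-fast-rcnn/python.
    pycaffe_dir = os.path.join(caffe_dir, "python")
    sys.path.insert(0, pycaffe_dir)
    try:
        import caffe  # noqa: F401
        results["pycaffe import"] = True
    except ImportError:
        results["pycaffe import"] = False
    finally:
        sys.path.remove(pycaffe_dir)
    return results


if __name__ == "__main__":
    # FRCN_ROOT is read from the environment here; fall back to the current directory.
    root = os.environ.get("FRCN_ROOT", ".")
    for name, ok in check_install(root).items():
        print("%-8s %s" % ("OK" if ok else "MISSING", name))
```

Any entry reported as MISSING points back to the corresponding installation step above.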
How to run the demo
-
Download PVANet detection models
-
Run the demo script
cd $FRCN_ROOT
./tools/test_pvanet.py
How to test on the VOC dataset
-
Download PASCAL VOC 2007 and 2012 -- Follow the instructions in py-faster-rcnn README.md
-
PVANet on PASCAL VOC 2007
cd $FRCN_ROOT
./tools/test_net.py --net models/pvanet/pva9.1/PVA9.1_ImgNet_COCO_VOC0712.caffemodel --def models/pvanet/pva9.1/faster_rcnn_train_test_21cls.pt --cfg models/pvanet/cfgs/submit_1019.yml --gpu 0
-
PVANet (compressed)
cd $FRCN_ROOT
./tools/test_net.py --net models/pvanet/pva9.1/PVA9.1_ImgNet_COCO_VOC0712plus_compressed.caffemodel --def models/pvanet/pva9.1/faster_rcnn_train_test_ft_rcnn_only_plus_comp.pt --cfg models/pvanet/cfgs/submit_1019.yml --gpu 0
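Both test commands share the same flags (`--net`, `--def`, `--cfg`, `--gpu`) and differ only in the model and network definition files. If you want to script several evaluations, a small wrapper along these lines may help. It is a sketch under assumptions: the function names `build_test_command` and `run_test` are hypothetical, and the wrapper itself is written for Python 3, independent of the interpreter that `tools/test_net.py` runs under.

```python
import os
import subprocess


def build_test_command(net, prototxt, cfg, gpu=0):
    """Assemble the argument list for tools/test_net.py with the flags shown above."""
    return [
        "./tools/test_net.py",
        "--net", net,
        "--def", prototxt,
        "--cfg", cfg,
        "--gpu", str(gpu),
    ]


def run_test(frcn_root, net, prototxt, cfg, gpu=0):
    """Run the evaluation from frcn_root, failing early if an input file is missing."""
    missing = [p for p in (net, prototxt, cfg)
               if not os.path.isfile(os.path.join(frcn_root, p))]
    if missing:
        raise FileNotFoundError("Missing files: " + ", ".join(missing))
    # cwd mirrors the `cd $FRCN_ROOT` step; returns the script's exit code.
    return subprocess.call(build_test_command(net, prototxt, cfg, gpu), cwd=frcn_root)


# The VOC2007 command from this section, rebuilt as an argument list.
cmd = build_test_command(
    "models/pvanet/pva9.1/PVA9.1_ImgNet_COCO_VOC0712.caffemodel",
    "models/pvanet/pva9.1/faster_rcnn_train_test_21cls.pt",
    "models/pvanet/cfgs/submit_1019.yml",
)
print(" ".join(cmd))
```

Checking for the files before launching gives a clearer error than a failure deep inside Caffe when a model has not been downloaded yet.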
Expected results
Mean Average Precision on VOC detection tasks
Model | VOC2007 mAP (%) | VOC2012 mAP (%) |
---|---|---|
PVANet+ (VOC2007) | 84.9 | N/A |
PVANet+ (VOC2012) | 89.8 | 84.2 |
PVANet+ (VOC2012 + compressed) | 87.8 | 83.7 |
- The training set for the VOC2012 model includes the VOC2007 test set. Therefore, the VOC2007 accuracies of the VOC2012 models are not meaningful; they're shown here just for reference.
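Because results from this repository may differ slightly from the published ones (see the disclaimer), it can be handy to compare a reproduced mAP against the table with a tolerance rather than expecting an exact match. The sketch below copies the expected values from the table above. The function name `matches_expected` and the default tolerance of 0.5 percentage points are assumptions chosen for illustration, not values given by the authors.

```python
# Expected mAP (%) copied from the table above; None marks the N/A entry.
EXPECTED_MAP = {
    "PVANet+ (VOC2007)": {"VOC2007": 84.9, "VOC2012": None},
    "PVANet+ (VOC2012)": {"VOC2007": 89.8, "VOC2012": 84.2},
    "PVANet+ (VOC2012 + compressed)": {"VOC2007": 87.8, "VOC2012": 83.7},
}


def matches_expected(model, dataset, measured_map, tolerance=0.5):
    """Return True if measured_map is within `tolerance` points of the table value.

    The tolerance is an illustrative assumption, not a figure from the authors.
    """
    expected = EXPECTED_MAP[model][dataset]
    if expected is None:
        raise ValueError("No expected value for %s on %s" % (model, dataset))
    return abs(measured_map - expected) <= tolerance


# Example with a hypothetical measured value of 84.7 on VOC2007.
print(matches_expected("PVANet+ (VOC2007)", "VOC2007", 84.7))
```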
Validation error on ILSVRC2012
Input size | Top-1 error (%) | Top-5 error (%) |
---|---|---|
192x192 | 30.00 | N/A |
224x224 | 27.66 | 8.84 |
- We re-trained the 224x224 model using the 192x192 model as the base model.
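For readers unfamiliar with the metric, top-k error is the percentage of images whose true label is not among the k highest-scoring classes. The following sketch shows the computation on toy data. The function name `top_k_error` is hypothetical, and the scores and labels are made up for illustration; they are not ILSVRC2012 results.

```python
def top_k_error(scores, labels, k):
    """Percentage of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return 100.0 * misses / len(labels)


# Hypothetical toy data: 4 samples, 3 classes.
scores = [
    [0.7, 0.2, 0.1],  # highest score: class 0
    [0.1, 0.3, 0.6],  # highest score: class 2
    [0.4, 0.5, 0.1],  # highest score: class 1
    [0.2, 0.3, 0.5],  # highest score: class 2
]
labels = [0, 1, 0, 2]

top1 = top_k_error(scores, labels, 1)  # samples 2 and 3 miss -> 50.0
top2 = top_k_error(scores, labels, 2)  # every label is in the top two -> 0.0
print(top1, top2)
```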