Q-engineering:
This is the SSD branch of weiliu89/caffe
Modified for OpenCV 4, cuDNN 8 and Python3:
- include/caffe/common.hpp
- src/caffe/util/im_transforms.cpp
- src/caffe/layers/cudnn_conv_layer.cpp
- Makefile
Add custom Makefile.config for Raspberry Pi 32OS - 64OS - Ubuntu 18.04 - Ubuntu 20.04 - Jetson Nano
We strongly advise you to follow this guide for the Raspberry Pi 4, Jetson Nano or PC with Ubuntu
Adapted for cuDNN version 8.0
Fixed for the obsolete cuDNN API calls:
- cudnnGetConvolutionForwardAlgorithm
- cudnnGetConvolutionBackwardFilterAlgorithm
- cudnnGetConvolutionBackwardDataAlgorithm
generating errors like CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT was not declared in this scope.
December 2020 update:
cudnn_conv_layer.cpp:
- fixed typo (
#endif } → } #endif
) - replaced 'CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1 → CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0' due to CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR issues
math_functions.cu:
- line 1: replace
<math_functions.h> → <cuda_runtime.h>
Februari 2021 update:
Makefile.config.JetsonNano adapted for Jetson TX2 (Thanks to @henudwj )
SSD: Single Shot MultiBox Detector
By Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.
Introduction
SSD is an unified framework for object detection with a single network. You can use the code to train/evaluate a network for object detection task. For more details, please refer to our arXiv paper and our slide.
System | VOC2007 test mAP | FPS (Titan X) | Number of Boxes | Input resolution |
---|---|---|---|---|
Faster R-CNN (VGG16) | 73.2 | 7 | ~6000 | ~1000 x 600 |
YOLO (customized) | 63.4 | 45 | 98 | 448 x 448 |
SSD300* (VGG16) | 77.2 | 46 | 8732 | 300 x 300 |
SSD512* (VGG16) | 79.8 | 19 | 24564 | 512 x 512 |
Note: SSD300* and SSD512* are the latest models. Current code should reproduce these results.
Citing SSD
Please cite SSD in your publications if it helps your research:
@inproceedings{liu2016ssd,
title = {{SSD}: Single Shot MultiBox Detector},
author = {Liu, Wei and Anguelov, Dragomir and Erhan, Dumitru and Szegedy, Christian and Reed, Scott and Fu, Cheng-Yang and Berg, Alexander C.},
booktitle = {ECCV},
year = {2016}
}
Contents
Installation
- Get the code. We will call the directory that you cloned Caffe into
$CAFFE_ROOT
git clone https://github.com/weiliu89/caffe.git
cd caffe
git checkout ssd
- Build the code. Please follow Caffe instruction to install all necessary packages and build it.
# Modify Makefile.config according to your Caffe installation.
cp Makefile.config.example Makefile.config
make -j8
# Make sure to include $CAFFE_ROOT/python to your PYTHONPATH.
make py
make test -j8
# (Optional)
make runtest -j8
Preparation
-
Download fully convolutional reduced (atrous) VGGNet. By default, we assume the model is stored in
$CAFFE_ROOT/models/VGGNet/
-
Download VOC2007 and VOC2012 dataset. By default, we assume the data is stored in
$HOME/data/
# Download the data.
cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
- Create the LMDB file.
cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
Train/Eval
- Train your model and evaluate the model on the fly.
# It will create model definition files and save snapshot models in:
# - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
# and job file, log file, and the python script in:
# - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
# and save temporary evaluation results in:
# - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
# It should reach 77.* mAP at 120k iterations.
python examples/ssd/ssd_pascal.py
If you don't have time to train your model, you can download a pre-trained model at here.
- Evaluate the most recent snapshot.
# If you would like to test a model you trained, you can do:
python examples/ssd/score_ssd_pascal.py
- Test your model using a webcam. Note: press esc to stop.
# If you would like to attach a webcam to a model you trained, you can do:
python examples/ssd/ssd_pascal_webcam.py
Here is a demo video of running a SSD500 model trained on MSCOCO dataset.
-
Check out
examples/ssd_detect.ipynb
orexamples/ssd/ssd_detect.cpp
on how to detect objects using a SSD model. Check outexamples/ssd/plot_detections.py
on how to plot detection results output by ssd_detect.cpp. -
To train on other dataset, please refer to data/OTHERDATASET for more details. We currently add support for COCO and ILSVRC2016. We recommend using
examples/ssd.ipynb
to check whether the new dataset is prepared correctly.
Models
We have provided the latest models that are trained from different datasets. To help reproduce the results in Table 6, most models contain a pretrained .caffemodel
file, many .prototxt
files, and python scripts.
-
PASCAL VOC models:
-
COCO models:
-
ILSVRC models:
[1]We use examples/convert_model.ipynb
to extract a VOC model from a pretrained COCO model.