SSD : A very very simple ssd implementation using only pytorch and numpy

This repo contains some simple codes for me to learn the basic of object detection 中文请点击, SSD(Single Shot MultiBox Detector) is a somewhat simple but powerful model to get started. So I try to implement it by myself, hoping I can get more insight in object dectection land. It's really amazing with deep learning and little code that machines can catch object show in the world. I try to reimplement it more readable and with clear codes. I hope this repo will help people who want to learn object detection and feel hard to get started.

Code structure

train.py
voc_dataset.py
eval.py
lib
- augmentations.py
- model.py
- ssd_loss.py
- multibox_endoder.py
- utils.py
- voc_eval.py
config.py
demo.ipynb

getting started

Install Pytorch, I recommand Anaconda as your packge manager, and you can simplely install Pytorch by

conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

for example.

download VOC2007 trainval and VOC2012 trainval, download VOC2007 testset, extract them and put them in a folder
if you are using a linux machine, simple run

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtrainval_11-May-2012.tar
tar xvf VOCtest_06-Nov-2007.tar

the structures would like

~/VOCdevkit/
    -- VOC2007
    -- VOC2012

then ~/VOCdevkit is your VOC root.

Train

for training ssd you need pretrained VGG weights as your basenet's starting point. so download this weight from https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth, then put it in weights folder.

mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth

vim config.py to change learning rate and batch size num...... and so on. It's not really neccesery, some thing you need to care about is VOC_ROOT, change it to your VOC root where you put your VOC data.
A simple command is all you need

python train.py

nohup python -u train.py &
watch -n 1 tail nohup.out


#ctrl+c to quit!

Question:
- I have a GPU device, how do I use it? The code will detect that and use cuda:0 as default otherwise it use cpu
- I get oom error. just vim config.py and reduce batch size
- I get nan loss value. your learning rate might be too large, try to set a lower learing rate

Demo

I have not tested it on VOC dataset for I just reimplemented it for learning purpose, but there still provide a jupyter notebook for you to see the result,download the pretrained weights from https://drive.google.com/drive/folders/1XN-CXifL-2xilx9y8sb3Qmog_sbzW0k-?usp=sharing or use your own weights

jupyter notebook

then go to localhost:8888 by default to see the demo.

Eval on VOC2007

Now I provide code to eval on VOC2007 testset, I use Detectron's voc_eval.py to calculate MAP, to eval your model, just run

python eval.py --model=weights/loss-1220.37.pth --save_folder=result

MAP result will show in your screen

something to notice
--model is your model checkpoint to eval, after running those script a annotations_cache folder and a result(--save_folder) folder will show in this workspace. result folder contains prediction for each class.

Results

Implementation	mAP
origin paper	0.772
this repo(eval using unofficial voc_eval code)	0.73-0.75

References

Wei Liu, et al. "SSD: Single Shot MultiBox Detector." ECCV2016.
The code were mainly inspired by Those two repo, thanks for them for shareing us so elegant work
- pytorch.ssd
- Chainer

qingfenghcy/simple-ssd-for-beginners