Repository for the Single Shot MultiBox Detector (SSD) and its variants, implemented in PyTorch with Python 3.
Currently, it contains these features:
- Multiple SSD Variants: ssd, rfb, fssd, ssd-lite, rfb-lite, fssd-lite
- Multiple Base Networks: VGG, Mobilenet V1/V2
- Free Image Size
- Visualization with tensorboard-pytorch: training loss, eval loss/mAP, example anchor boxes.
This repo builds on the work of ssd.pytorch, faster-rcnn.pytorch, RFBNet, Detectron and Tensorflow Object Detection API. Thanks for their work.
- install pytorch
- install requirements by
pip install -r ./requirements.txt
To train, test, or demo a specific model, run the corresponding script from the repository folder with that model's configuration file, for example:
python train.py --cfg=./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml
Adjust the configuration file according to the notes in config_parse.py.
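As a rough illustration of how values from an experiment file can override built-in defaults, here is a minimal sketch in plain Python. The option names (`model`, `detector`, `backbone`, `train`, ...) and the `merge_config` helper are hypothetical and are not taken from config_parse.py; check that file for the real options.

```python
import copy

# Hypothetical defaults; the real option names live in config_parse.py.
DEFAULTS = {
    "model": {"detector": "ssd", "backbone": "vgg16", "image_size": [300, 300]},
    "train": {"batch_size": 32, "max_epochs": 100},
}


def merge_config(defaults, overrides):
    """Return a copy of `defaults` with `overrides` applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            # Reject misspelled options instead of silently ignoring them.
            raise KeyError(f"unknown config key: {key}")
        if isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Values a user might set in an experiment file, already parsed into a dict.
user_cfg = {
    "model": {"detector": "rfb_lite", "backbone": "mobilenet_v2"},
    "train": {"batch_size": 16},
}
cfg = merge_config(DEFAULTS, user_cfg)
print(cfg["model"]["detector"], cfg["train"]["batch_size"], cfg["train"]["max_epochs"])
```

Only the keys present in the experiment file change; everything else keeps its default value.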
VOC2007 | YOLO_v2 | YOLO_v3 | SSD | RFB | FSSD |
---|---|---|---|---|---|
Darknet53 | 79.3% | ||||
Darknet19 | 78.4% | ||||
Resnet50 | 79.7% | 81.2% | |||
VGG16 | 76.0% | 80.5% | 77.8% | ||
MobilenetV1 | 74.7% | 78.2% | 72.7% | 73.7% | 78.4% |
MobilenetV2 | 72.0% | 75.8% | 73.2% | 73.4% | 76.7% |
COCO2017 | YOLO_v2 | YOLO_v3 | SSD | RFB | FSSD |
---|---|---|---|---|---|
Darknet53 | 27.3% | ||||
Darknet19 | 21.6% | ||||
Resnet50 | 25.1% | 26.5% | 27.2% | ||
VGG16 | 25.4% | 25.5% | 27.2% | ||
MobilenetV1 | 21.5% | 25.7% | 18.8% | 19.1% | 24.2% |
MobilenetV2 | 20.4% | 24.0% | 18.5% | 18.5% | 22.2% |
Net InferTime* (fp32) | YOLO_v2 | YOLO_v3 | SSD | RFB | FSSD |
---|---|---|---|---|---|
Darknet53 | 5.6ms | ||||
Darknet19 | 1.9ms | ||||
Resnet50 | |||||
VGG16 | 1.78ms | 4.20ms | 1.98ms | ||
MobilenetV1 | 3.8ms | 2.87ms | 3.84ms | 2.62ms | |
MobilenetV2 | 5.1ms | 4.18ms | 5.28ms | 4.02ms |
(* Only the network's inference time is measured, without pre-processing and post-processing. In fact, the speed of VGG really impresses me. This may be because MobilenetV1 and MobilenetV2 use the -lite structure, which uses separable convolutions in the base and extra layers.)
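For readers who want to take a comparable measurement, the following is a minimal sketch of timing only the forward pass, with warm-up iterations excluded. The `time_forward` helper and the `dummy_forward` workload are hypothetical stand-ins, and this is not necessarily how the numbers in the table were obtained.

```python
import time


def time_forward(forward, warmup=5, iters=20, sync=None):
    """Average wall-clock time in milliseconds of one call to `forward`.

    `sync` is an optional callable that blocks until pending device work
    has finished; it is called once before and once after the timed loop.
    """
    for _ in range(warmup):
        forward()  # warm-up calls are not timed
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        forward()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) * 1000.0 / iters


def dummy_forward():
    # Stand-in workload; replace with the network's forward call.
    return sum(i * i for i in range(10_000))


print(f"{time_forward(dummy_forward):.3f} ms per call")
```

With PyTorch on a GPU, one would typically pass `sync=torch.cuda.synchronize` and run the model under `torch.no_grad()`, since CUDA kernels are launched asynchronously and the timer would otherwise stop too early.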
- visualize the network graph (terminal); the tensorboard version has bugs.
- visualize the loss during training and the mean AP during evaluation (terminal & tensorboard)
- visualize the anchor boxes for each feature extractor (tensorboard)
- visualize the PR curve for different classes and the anchor matching strategy for different properties during evaluation (tensorboard) (*guess the dataset in the figure: COCO or VOC?)
- visualize feature maps and gradients (this does not satisfy me yet, since it does not give me any useful information. Any suggestions?)
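To make the anchor box item above more concrete, here is a minimal sketch of how SSD-style anchors can be generated for a single square feature map. The `make_anchors` function, its parameters, and the chosen scale and aspect ratios are illustrative assumptions, not the repository's actual prior box code.

```python
import math
from itertools import product


def make_anchors(feature_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (cx, cy, w, h) anchors in normalized [0, 1] image coordinates.

    One anchor per aspect ratio is centered on every cell of a
    feature_size x feature_size feature map. All anchors share the
    same area, scale ** 2.
    """
    anchors = []
    for row, col in product(range(feature_size), repeat=2):
        cx = (col + 0.5) / feature_size  # cell center, x
        cy = (row + 0.5) / feature_size  # cell center, y
        for ratio in aspect_ratios:
            w = scale * math.sqrt(ratio)
            h = scale / math.sqrt(ratio)
            anchors.append((cx, cy, w, h))
    return anchors


anchors = make_anchors(feature_size=3, scale=0.5)
print(len(anchors))  # 3 * 3 cells * 3 aspect ratios = 27
```

Drawing these boxes on top of an input image, one feature map at a time, is the kind of picture the tensorboard anchor visualization is meant to provide.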
- add DSSDs: DSSD FPN TDM
- test multi-resolution training
- add rotation to the preprocessing
- test focal loss
- add resnet, xception, inception
- figure out the problem with visualizing the graph
- speed up preprocess part (any suggestion?)
- speed up postprocess part (any suggestion?) huge bug!!!
- add network visualization based on pytorch-cnn-visualizations
- object detection
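For the focal loss item in the to-do list above, the following is a minimal sketch of the binary focal loss for a single prediction, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The `focal_loss` function name is hypothetical, the defaults gamma = 2 and alpha = 0.25 are the commonly cited values, and nothing here reflects code in this repository.

```python
import math


def focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction.

    prob   -- predicted probability of the positive class
    target -- 1 for a positive example, 0 for a negative one
    """
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


easy = focal_loss(0.95, 1)  # confident and correct
hard = focal_loss(0.10, 1)  # confident and wrong
print(easy < hard)
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which makes a convenient sanity check when testing.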