ssds.pytorch

Repository for the Single Shot MultiBox Detector (SSD) and its variants, implemented in PyTorch with Python 3.

Primary language: Python. License: MIT.

Currently, it contains these features:

  • Multiple SSD variants: ssd, rfb, fssd, ssd-lite, rfb-lite, fssd-lite
  • Multiple base networks: VGG, Mobilenet V1/V2
  • Free image size
  • Visualization with tensorboard-pytorch: training loss, eval loss/mAP, example anchor boxes.
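
To make the "example anchor boxes" item concrete, here is a minimal sketch of how SSD-style anchor (default) boxes can be laid out over one square feature map. The function name `make_anchor_boxes`, the scale, and the aspect ratios are illustrative assumptions and are not taken from this repository's code or configs.

```python
import math

def make_anchor_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return anchors as (cx, cy, w, h) tuples in normalized [0, 1] coordinates.

    fmap_size: number of cells along each side of a square feature map.
    scale: anchor size relative to the image side (hypothetical value).
    """
    anchors = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center of the current cell, normalized by the feature-map size.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ratio in aspect_ratios:
                # Each ratio changes the shape while keeping the area scale**2.
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                anchors.append((cx, cy, w, h))
    return anchors

anchors = make_anchor_boxes(fmap_size=3, scale=0.2)
print(len(anchors))  # 3 * 3 cells * 3 ratios = 27
print(anchors[0])
```

Because the coordinates are normalized, the same layout can be rescaled to any input image size by multiplying by the image width and height.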

This repo builds on the work of ssd.pytorch, faster-rcnn.pytorch, RFBNet, Detectron and Tensorflow Object Detection API. Thanks to the authors for their work.

Installation

  1. Install PyTorch.
  2. Install the requirements with pip install -r ./requirements.txt

Usage

To train, test, or run a demo of a specific model, run the corresponding script from the repository folder with the model's configuration file, for example:

python train.py --cfg=./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml

Adjust the configuration file following the notes in config_parse.py.
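
If you prefer to launch runs from Python (for example, to loop over several configuration files), the command above can be assembled as in the following sketch. Only `train.py` and the `--cfg=` flag come from this README; the helper name `build_command` is hypothetical.

```python
import shlex
import sys

def build_command(script, cfg_path):
    """Build the argument list for running one entry script with a config file."""
    return [sys.executable, script, f"--cfg={cfg_path}"]

cfg = "./experiments/cfgs/rfb_lite_mobilenetv2_train_voc.yml"
cmd = build_command("train.py", cfg)

# Show the command without the interpreter path.
print(shlex.join(cmd[1:]))

# To actually launch it from the repository root:
# import subprocess
# subprocess.run(cmd, check=True)
```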

Performance

VOC2007 YOLO_v2 YOLO_v3 SSD RFB FSSD
Darknet53 79.3%
Darknet19 78.4%
Resnet50 79.7% 81.2%
VGG16 76.0% 80.5% 77.8%
MobilenetV1 74.7% 78.2% 72.7% 73.7% 78.4%
MobilenetV2 72.0% 75.8% 73.2% 73.4% 76.7%

COCO2017 YOLO_v2 YOLO_v3 SSD RFB FSSD
Darknet53 27.3%
Darknet19 21.6%
Resnet50 25.1% 26.5% 27.2%
VGG16 25.4% 25.5% 27.2%
MobilenetV1 21.5% 25.7% 18.8% 19.1% 24.2%
MobilenetV2 20.4% 24.0% 18.5% 18.5% 22.2%

Net InferTime* (fp32) YOLO_v2 YOLO_v3 SSD RFB FSSD
Darknet53 5.6ms
Darknet19 1.9ms
Resnet50
VGG16 1.78ms 4.20ms 1.98ms
MobilenetV1 3.8ms 2.87ms 3.84ms 2.62ms
MobilenetV2 5.1ms 4.18ms 5.28ms 4.02ms

(* Only the network inference time is measured, without pre-processing and post-processing. The speed of VGG is honestly impressive. This may be because the MobilenetV1 and MobilenetV2 models use the -lite structure, which uses separable convolutions in the base and extra layers.)
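
As an illustration of timing only the forward pass, here is a minimal sketch of such a measurement. It is not this repository's benchmarking code: `time_forward` and `fake_forward` are hypothetical names, and the stand-in workload replaces a real model call so that the example runs without PyTorch.

```python
import time

def time_forward(forward, n_warmup=5, n_runs=20, sync=None):
    """Average wall-clock time of forward() in milliseconds.

    forward: zero-argument callable that runs only the network forward pass.
    sync: optional callable invoked before reading the clock. With PyTorch
          on a GPU this would typically be torch.cuda.synchronize, because
          CUDA kernels are launched asynchronously.
    """
    for _ in range(n_warmup):
        forward()  # warm-up runs are not timed
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(n_runs):
        forward()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / n_runs

# Stand-in workload. In practice this would wrap something like model(batch)
# under torch.no_grad(), with the input tensor prepared beforehand so that
# pre-processing is excluded from the measurement.
def fake_forward():
    sum(i * i for i in range(10_000))

print(f"{time_forward(fake_forward):.3f} ms per forward pass")
```

Preparing the input outside the timed loop and stopping the clock before any box decoding or NMS is what keeps pre-processing and post-processing out of the reported number.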

Visualization

  • visualize the network graph (terminal; the tensorboard graph view has bugs)

  • visualize the loss during training and the mean AP during evaluation (terminal & tensorboard)

  • visualize the anchor boxes for each feature extractor (tensorboard)

  • visualize the preprocessing steps for training (tensorboard)

  • visualize the PR curve for different classes and the anchor matching strategy for different properties during evaluation (tensorboard) (*guess the dataset in the figure: COCO or VOC?)

  • visualize feature maps and gradients (the result is not satisfying yet and gives little information; any suggestions?)
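
For readers unfamiliar with how a PR curve relates to the reported AP numbers, the following sketch computes precision/recall points and an 11-point interpolated AP (the PASCAL VOC2007 convention) for one class. Whether this repository uses the 11-point variant or the area under the curve is not stated here, so treat that choice, the function names, and the toy detections as assumptions.

```python
def precision_recall(scores, is_tp, n_gt):
    """Precision/recall points after sorting detections by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_gt)
    return precisions, recalls

def ap_11_point(precisions, recalls):
    """11-point interpolated AP: mean over t of max precision at recall >= t."""
    total = 0.0
    for k in range(11):
        t = k / 10
        candidates = [p for p, r in zip(precisions, recalls) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11

# Toy data: four detections for one class, four ground-truth boxes.
scores = [0.9, 0.8, 0.7, 0.6]
is_tp = [True, False, True, True]
prec, rec = precision_recall(scores, is_tp, n_gt=4)
print(rec)                               # [0.25, 0.25, 0.5, 0.75]
print(round(ap_11_point(prec, rec), 4))  # 0.6136
```

Mean AP is then the average of the per-class AP values.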

TODO

  • add DSSD-style variants: DSSD, FPN, TDM
  • test multi-resolution training
  • add rotation to the preprocessing
  • test focal loss
  • add resnet, xception, inception
  • figure out the problem with visualizing the graph
  • speed up the preprocessing part (any suggestions?)
  • speed up the postprocessing part (any suggestions?); huge bug!

Reference