A strong YOLOv3 implementation in PyTorch

Original YOLOv3:

Model       Data           AP    AP50  AP75  AP_S  AP_M  AP_L
YOLOv3-320  COCO test-dev  28.2  51.5  -     -     -     -
YOLOv3-416  COCO test-dev  31.0  55.3  -     -     -     -
YOLOv3-608  COCO test-dev  33.0  57.0  34.4  18.3  35.4  41.9

Our YOLOv3_PyTorch:

Model       Data           AP    AP50  AP75  AP_S  AP_M  AP_L
YOLOv3-320  COCO test-dev  33.1  54.1  34.5  12.1  34.5  49.6
YOLOv3-416  COCO test-dev  36.0  57.4  37.0  16.3  37.5  51.1
YOLOv3-608  COCO test-dev  37.6  59.4  39.9  20.4  39.9  48.2

Stronger and better, right?

Without any bells and whistles, my YOLOv3 exceeds the original YOLOv3.

I only trained once, since I have a single GPU.

Anyone can reproduce my results with my code.

So, just have fun !!

Before the start

Hi, everyone !

Before you start to try this excellent project, I must tell you something:

When you run my training code, you will find it is slow.

With one TITAN RTX GPU, I spent 2 days training my YOLOv3 on VOC and 15 days on COCO.

Oh, come on! Why is it soooo slow??

Because num_workers in my dataloader is set to 0, which means only a single process loads the input data.

If I add more workers, it reports errors about the input size when I use the multi-scale training trick.

I'm still trying to solve this problem, but I know very little about multithreading ...

So~

If you don't plan to try the multi-scale training trick, just set num_workers to 8 or more, and it will run faster.

For example:

python train_voc.py -v [select a model] -hr --cuda --num_workers 8
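
A common workaround (just a sketch, not this repo's code) is to move the random resize into a custom collate_fn: the size is drawn once per batch, so every sample coming back from any worker gets resized to the same resolution. It assumes the dataset returns (CHW image tensor, target) pairs:

import random
import torch
import torch.nn.functional as F

def multi_scale_collate(batch):
    # Pick one random size (a multiple of 32) for the whole batch, so
    # samples produced by different workers always agree on resolution.
    size = random.choice(range(320, 641, 32))
    images, targets = zip(*batch)
    images = torch.stack([
        F.interpolate(img.unsqueeze(0), size=(size, size),
                      mode='bilinear', align_corners=False).squeeze(0)
        for img in images
    ])
    # NOTE: if targets hold absolute pixel coordinates, rescale them here
    # too; normalized [0, 1] coordinates stay valid as-is.
    return images, targets

# loader = torch.utils.data.DataLoader(dataset, batch_size=16, num_workers=8,
#                                      collate_fn=multi_scale_collate)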

The whole project

In this project, you can enjoy:

  • yolo-v2
  • yolo-v3
  • tiny-yolo-v2
  • tiny-yolo-v3 (a toy model, don't care about it~)

What I have to say is that I don't try to 100% reproduce the whole official YOLO project, because that is really hard for me. I don't have much computational resource, so I can't easily train my yolov3 on COCO; it would cost more than two weeks...

Recently, I made some improvements, and my YOLO project is now very close to the official YOLO models.

I will upload the new models soon. Just hold on~

However, I have a question: is the mAP metric really good? Does it really suit object detection?

I find that a higher mAP doesn't always mean better-looking visualizations... so weird.

YOLOv2

I really enjoy YOLO. It is so amazing! So I tried to reproduce it, and I think I achieved this goal:

Data        Size  Original (darknet)  Ours (pytorch, 160 epochs)  Ours (pytorch, 250 epochs)
VOC07 test  416   76.8                76.0                        77.1
VOC07 test  544   78.6                77.0                        78.1

With 160 training epochs, my yolo-v2 only got 76.0 mAP at 416 input size and 77.0 mAP at 544. To do better, I trained for another 90 epochs. With 250 training epochs, my yolo-v2 performs very well !

During the testing stage, I set the confidence threshold to 0.001 and the NMS threshold to 0.5 to obtain the above results. To make my model faster, I raise the confidence threshold to 0.01. With this higher confidence threshold, my yolo-v2 still performs very well and gets 77.0 mAP at 416 input size and 78.0 mAP at 544.
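
In other words, the test-time pipeline is just a confidence filter followed by NMS. A minimal sketch using torchvision (the thresholds are the values quoted above; the function name and tensor layout are my assumptions):

import torch
from torchvision.ops import nms

def postprocess(boxes, scores, conf_thresh=0.001, nms_thresh=0.5):
    # boxes: (N, 4) tensor in (x1, y1, x2, y2); scores: (N,) confidences.
    keep = scores > conf_thresh            # 1) drop low-confidence boxes
    boxes, scores = boxes[keep], scores[keep]
    keep = nms(boxes, scores, nms_thresh)  # 2) suppress overlapping boxes
    return boxes[keep], scores[keep]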

I visualize some detection results whose scores are over 0.3 on the VOC 2007 test set:

[detection visualizations]

COCO:

Model               Data           AP    AP50  AP75  AP_S  AP_M  AP_L
Original (darknet)  COCO test-dev  21.6  44.0  19.2  5.0   22.4  35.5
Ours (pytorch)      COCO test-dev  26.8  46.6  26.8  5.8   27.4  45.2
Ours (pytorch)      COCO eval      26.6  46.0  26.7  5.9   27.8  47.1

I trained my YOLOv2 for 250 epochs on COCO. From the above table, my YOLOv2 is better, right?

Just enjoy it !

Tricks

Tricks in the official paper:

  • batch norm
  • hi-res classifier
  • convolutional
  • anchor boxes
  • new network
  • dimension priors
  • location prediction (see the decoding sketch after this list)
  • passthrough
  • multi-scale
  • hi-res detector
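
Among these tricks, 'location prediction' is the paper's bounded box decoding: bx = (sigmoid(tx) + cx) * stride and bw = pw * exp(tw). A minimal sketch (the tensor layout is my assumption):

import torch

def decode_boxes(raw, anchors, stride):
    # raw: (H, W, A, 4) network outputs [tx, ty, tw, th];
    # anchors: (A, 2) prior sizes [pw, ph] in pixels.
    H, W, A, _ = raw.shape
    cx = torch.arange(W).view(1, W, 1)  # per-column grid offsets
    cy = torch.arange(H).view(H, 1, 1)  # per-row grid offsets
    bx = (torch.sigmoid(raw[..., 0]) + cx) * stride  # center x in pixels
    by = (torch.sigmoid(raw[..., 1]) + cy) * stride  # center y in pixels
    bw = anchors[:, 0] * torch.exp(raw[..., 2])      # width from prior
    bh = anchors[:, 1] * torch.exp(raw[..., 3])      # height from prior
    return torch.stack([bx, by, bw, bh], dim=-1)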

On a TITAN Xp, my yolo-v2 runs at 100+ FPS, so it's very fast. I don't have a TITAN X GPU, so I can't benchmark my model on one. Sorry, guys~

Before I tell you how to use this project, I must mention one important difference between the original yolo-v2 and mine:

  • For data augmentation, I copied the augmentation code from https://github.com/amdegroot/ssd.pytorch, a superb project that reproduces SSD. If anyone is interested in SSD, just clone it and learn ! (Don't forget to star it !)

So I didn't write the data augmentation myself. I'm a little lazy~~

My loss function and ground-truth creator are both in tools.py, and you can change their parameters to try to improve the model.
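
For orientation, the core idea of a YOLOv2-style ground-truth creator: each ground-truth box is assigned to the grid cell containing its center, and to the anchor whose shape matches it best. A toy sketch of that idea (not the actual tools.py code; names and layout are mine):

import numpy as np

def build_targets(gt_boxes, anchors, grid_size, img_size):
    # gt_boxes: (N, 4) [xc, yc, w, h] in pixels; anchors: (A, 2) [w, h] in pixels.
    # Returns (grid, grid, A, 5) targets: [tx, ty, tw, th, objectness].
    stride = img_size / grid_size
    target = np.zeros((grid_size, grid_size, len(anchors), 5), dtype=np.float32)
    for xc, yc, w, h in gt_boxes:
        gx, gy = xc / stride, yc / stride      # box center in grid units
        gi, gj = int(gx), int(gy)              # responsible cell
        # Shape-only IoU between the box and each anchor picks the best prior.
        inter = np.minimum(anchors[:, 0], w) * np.minimum(anchors[:, 1], h)
        union = anchors[:, 0] * anchors[:, 1] + w * h - inter
        best = int(np.argmax(inter / union))
        pw, ph = anchors[best]
        target[gj, gi, best] = [gx - gi, gy - gj, np.log(w / pw), np.log(h / ph), 1.0]
    return target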

Next, I plan to train my yolo-v2 on COCO.

YOLOv3

Besides YOLOv2, I also tried to reproduce YOLOv3. Before that, I rebuilt a darknet53 network in PyTorch and pretrained it on ImageNet, so I don't use the official darknet53 model file... Oh! I forgot to tell you guys that the darknet19 used in my YOLOv2 was also rebuilt by myself in PyTorch. The top-1 performance of my darknet19 and darknet53 is as follows:

Model      Size  Original (darknet)  Ours (pytorch)
darknet19  224   72.9                72.96
darknet19  448   76.5                75.52
darknet53  224   77.2                75.42
darknet53  448   -                   77.76

Looks good !
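
For reference, darknet19 and darknet53 are mostly stacks of the same Conv-BN-LeakyReLU building block (darknet53 adds residual connections). A minimal sketch of such a block (my naming; the actual model files may differ):

import torch.nn as nn

class ConvBlock(nn.Module):
    # Conv + BatchNorm + LeakyReLU(0.1): the basic darknet building block.
    def __init__(self, in_ch, out_ch, k=3, stride=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, stride, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))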

I have only one GPU, which means training YOLOv3 on COCO would cost me a lot of time (more than two weeks), so I only trained my YOLOv3 on VOC. The result is shown below:

Data        Size  Original (darknet)  Ours (pytorch, 250 epochs)
VOC07 test  416   80.25               81.4

I use the same training strategy as for my YOLOv2. My data-processing code is a little different from the official YOLOv3's. For more details, you can check my code files.

COCO:

Original YOLOv3:

Model       Data           AP    AP50  AP75  AP_S  AP_M  AP_L
YOLOv3-320  COCO test-dev  28.2  51.5  -     -     -     -
YOLOv3-416  COCO test-dev  31.0  55.3  -     -     -     -
YOLOv3-608  COCO test-dev  33.0  57.0  34.4  18.3  35.4  41.9

Our YOLOv3_PyTorch:

Model       Data           AP    AP50  AP75  AP_S  AP_M  AP_L
YOLOv3-320  COCO test-dev  33.1  54.1  34.5  12.1  34.5  49.6
YOLOv3-416  COCO test-dev  36.0  57.4  37.0  16.3  37.5  51.1
YOLOv3-608  COCO test-dev  37.6  59.4  39.9  20.4  39.9  48.2

My YOLOv3 is much stronger and better, right?

I also visualize some detection results whose scores are over 0.3 on COCO 2017 val:

[detection visualizations]

HAHAHAHA!

So, just have fun !

Slim YOLOv2

I built a very simple lightweight backbone: darknet_tiny.


I replaced the darknet19 used in YOLOv2 with darknet_tiny.

My SlimYOLOv2 is fast and strong. On VOC, it gets 70.7 mAP and runs at 100+ FPS on a GTX 1660 Ti GPU.

Just enjoy it.

And I'm still trying to make it faster without too much loss of precision.

Tiny YOLOv3

I evaluate my TinyYOLOv3 on COCO val with input size 608:

Model                   Data           AP    AP50  AP75  AP_S  AP_M  AP_L
YOLOv3-tiny (official)  COCO test-dev  -     33.1  -     -     -     -
YOLOv3-tiny (ours)      COCO val       15.9  33.8  12.8  7.6   17.7  22.4

Installation

  • PyTorch (GPU) 1.1.0 / 1.2.0 / 1.3.0
  • Tensorboard 1.14
  • opencv-python
  • Python 3.6 / 3.7

Dataset

For now, I only train and test on PASCAL VOC 2007 and 2012.

VOC Dataset

I copied the download scripts from the following excellent project: https://github.com/amdegroot/ssd.pytorch

I have uploaded VOC2007 and VOC2012 to BaiduYunDisk, so researchers in China can download them from there:

Link:https://pan.baidu.com/s/1tYPGCYGyC0wjpC97H-zzMQ

Password:4la9

You will get a VOCdevkit.zip; just unzip it and put it into data/. After that, the full paths to the VOC datasets are data/VOCdevkit/VOC2007 and data/VOCdevkit/VOC2012.
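
The resulting layout follows the standard VOC directory structure (only the detection-related folders are shown):

data/
└── VOCdevkit/
    ├── VOC2007/
    │   ├── Annotations/
    │   ├── ImageSets/
    │   └── JPEGImages/
    └── VOC2012/
        ├── Annotations/
        ├── ImageSets/
        └── JPEGImages/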

Download VOC2007 trainval & test

# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>

Download VOC2012 trainval

# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>

MSCOCO Dataset

I copied the download script from the following excellent project: https://github.com/DeNA/PyTorch_YOLOv3

Download MSCOCO 2017 dataset

Just run sh data/scripts/COCO2017.sh. You will get COCO train2017, val2017, test2017.

Train

VOC

python train_voc.py -v [select a model] -hr -ms --cuda

You can run python train_voc.py -h to check all optional arguments.

By default, I set num_workers in the pytorch dataloader to 0 to guarantee my multi-scale trick works. The trick breaks when I add more workers, and I know little about multithreading. So sad...

COCO

python train_coco.py -v [select a model] -hr -ms --cuda

Test

VOC

python test_voc.py -v [select a model] --trained_model [path to the trained model] --cuda

COCO

python test_coco.py -v [select a model] --trained_model [path to the trained model] --cuda

Evaluation

VOC

python eval_voc.py -v [select a model] --train_model [path to the trained model] --cuda

COCO

To run on COCO_val:

python eval_coco.py -v [select a model] --train_model [path to the trained model] --cuda

To run on COCO_test-dev (make sure you have downloaded test2017):

python eval_coco.py -v [select a model] --train_model [path to the trained model] --cuda -t

You will get a .json file which can be evaluated on the COCO test server.
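
For COCO_val, the .json detections file can also be scored locally with pycocotools; a minimal sketch (both file paths are assumptions, adjust them to your setup):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO('data/COCO/annotations/instances_val2017.json')  # ground truth
coco_dt = coco_gt.loadRes('detections.json')                    # model output
coco_eval = COCOeval(coco_gt, coco_dt, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints AP, AP50, AP75, AP_S/M/L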


Train yourself

You can give the path to a trained model via --resume. For example:

python train_voc.py -v yolo_v3 -ms --cuda --resume weights/coco/yolo_v3/yolo_v3_260epoch_416_57.6_36.0.pth

Remember, you need to change the names of the pred layers in my YOLOv3 model code file. For example:

self.pred_1 -> self.pred_1_1
self.pred_2 -> self.pred_2_1
self.pred_3 -> self.pred_3_1

Otherwise, there will be some errors~
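
If you'd rather not rename layers by hand, an alternative (my own sketch, not this repo's code) is to load only the checkpoint tensors whose names and shapes still match the model:

import torch

def load_matching_weights(model, ckpt_path):
    # Keep only checkpoint tensors whose name AND shape match the model.
    ckpt = torch.load(ckpt_path, map_location='cpu')
    model_state = model.state_dict()
    matched = {k: v for k, v in ckpt.items()
               if k in model_state and v.shape == model_state[k].shape}
    model_state.update(matched)
    model.load_state_dict(model_state)
    print('loaded %d / %d tensors' % (len(matched), len(model_state)))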

Remember again!

If you want to change the input size, you need to open data/config.py and give a new size to 'min_dim'. For example:

'min_dim': [416, 416], -> 'min_dim': [640, 640],