DetNet_pytorch

An implementation of DetNet: A Backbone network for Object Detection. Due to limited time, I initially trained and tested only on the PASCAL VOC dataset. The results indicate that detnet59 performs slightly better than FPN101 (75.9 vs. 75.7 mAP on VOC 2007, and 80.7 vs. 80.5 mAP on VOC 07+12).

Introduction

First, I spent about one week training detnet59 on the ImageNet dataset. Its classification performance is slightly better than that of the original resnet50. I then used the pretrained detnet59 to train and test on PASCAL VOC.

This code is based on FPN_Pytorch, with the FPN101 backbone replaced by detnet59.

Update 2019/01/01

Fixed bugs in demo.py, which now runs. Note that the default demo.py supports only the pascal_voc categories; to use your own dataset, change pascal_classes in demo.py. For more details, see the Usage section.

Update 2018/8/21

Trained and tested on COCO2017.

Update

Added soft_nms. It requires no retraining of existing models: enabling soft_nms at test time is enough to improve performance.
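To illustrate why soft NMS needs no retraining, here is a minimal pure-Python sketch of the Gaussian variant. It is not the implementation used in this repository; the function names, the `sigma` and `score_thresh` defaults, and the `(x1, y1, x2, y2)` box format are assumptions made for the example. Instead of discarding boxes that overlap the current best detection, it only lowers their scores, so it acts purely as a post-processing step.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (illustrative sketch, parameters are assumptions).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(s, i) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best = max(remaining, key=lambda pair: pair[0])
        remaining.remove(best)
        best_score, best_idx = best
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap
        # with the selected box, and drop those that fall below the threshold.
        decayed = []
        for score, idx in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((score, idx))
        remaining = decayed
    return kept
```

With hard NMS, a heavily overlapping box is removed outright; here it survives with a reduced score, which can help when two objects of the same class really do overlap.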

Benchmarking

I benchmarked this code on PASCAL VOC 2007 and 07+12, and also report ImageNet classification and COCO2017 results. Below are the results:

0). ImageNet (tested on the validation set)

| backbone | Top-1 error (%) |
| --- | --- |
| pytorch resnet50 | 23.9 |
| detnet59 in this code | 23.8 |
| detnet59 in the original paper | 23.5 |

1). PASCAL VOC 2007 (Train/Test: 07trainval/07test, scale=600, ROI Align)

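| model (FPN) | GPUs | Batch Size | lr | lr_decay | max_epoch | Speed/epoch | Memory/GPU | mAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-101 | 1 GTX 1080 (Ti) | 2 | 1e-3 | 10 | 12 | 1.44hr | 6137MB | 75.7 |
| DetNet59 | 1 GTX 1080 (Ti) | 2 | 1e-3 | 10 | 12 | 1.07hr | 5412MB | 75.9 |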

2). PASCAL VOC 07+12 (Train/Test: 07+12trainval/07test, scale=600, ROI Align)

| model (FPN) | GPUs | Batch Size | lr | lr_decay | max_epoch | Speed/epoch | Memory/GPU | mAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-101 | 1 GTX 1080 (Ti) | 1 | 1e-3 | 10 | 12 | 3.96hr | 9011MB | 80.5 |
| DetNet59 | 1 GTX 1080 (Ti) | 1 | 1e-3 | 10 | 12 | 2.33hr | 8015MB | 80.7 |
| ResNet-101 (using soft_nms when testing) | 1 GTX 1080 (Ti) | - | - | - | - | - | - | 81.2 |
| DetNet59 (using soft_nms when testing) | 1 GTX 1080 (Ti) | - | - | - | - | - | - | 81.6 |

3). COCO2017 (Train/Test: COCO2017train/COCO2017val, scale=800, max_size=1200, ROI Align)

| model | #GPUs | batch size | lr | lr_decay | max_epoch | time/epoch | mem/GPU | mAP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DetNet59 | 2 | 4 | 4e-3 | 4 | 11 | - | 9000 | 36.0 |

Preparation

First, clone the code:

git clone https://github.com/guoruoqian/DetNet_Pytorch.git

Then, create a folder:

cd DetNet_Pytorch && mkdir data

Prerequisites

  • Python 2.7 or 3.6
  • PyTorch 0.2.0 or higher (versions >= 0.4.0 are not supported)
  • CUDA 8.0 or higher
  • tensorboardX
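Because the supported PyTorch range is bounded on both sides, it can be worth checking the installed version before compiling anything. The helper below is a small sketch, not part of this repository; the function names are hypothetical, and it only inspects the leading numeric components of the version string.

```python
def parse_version(version):
    """Turn '0.3.1' or '0.3.0.post4' into a tuple of at least three integers."""
    parts = []
    # Ignore local build suffixes such as '+cu90'.
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at non-numeric components such as 'post4'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported_pytorch(version):
    """True if the version lies in the range stated above: >= 0.2.0 and < 0.4.0."""
    return (0, 2, 0) <= parse_version(version) < (0, 4, 0)
```

In practice you would pass `torch.__version__` to `is_supported_pytorch` after `import torch`.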

Data Preparation

  • VOC2007: Please follow the instructions in py-faster-rcnn to prepare the VOC datasets; any other VOC preparation guide works as well. After downloading the data, create symlinks to it in the data/ folder.
  • VOC 07 + 12: Please follow the instructions in YuwenXiong/py-R-FCN. I find these instructions more helpful for preparing the VOC datasets.
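The symlink step can also be scripted. The following is a minimal Python 3 sketch; `link_dataset` is a hypothetical helper, and the link name `VOCdevkit2007` used in the usage note is an assumption borrowed from the py-faster-rcnn convention, so check which name this code base actually expects.

```python
import os


def link_dataset(source_dir, data_dir, link_name):
    """Create data_dir/link_name as a symlink to source_dir and return its path."""
    os.makedirs(data_dir, exist_ok=True)
    link_path = os.path.join(data_dir, link_name)
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link from an earlier run
    os.symlink(os.path.abspath(source_dir), link_path)
    return link_path
```

For example, `link_dataset("/path/to/VOCdevkit", "data", "VOCdevkit2007")` would make the downloaded devkit visible under data/ without copying it (the source path here is a placeholder).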

Pretrained Model

You can download the detnet59 model that I trained on ImageNet from:

Download it and put it in data/pretrained_model/.

Compilation

As pointed out by ruotianluo/pytorch-faster-rcnn, choose the -arch value in make.sh that matches your GPU before compiling the CUDA code:

| GPU model | Architecture |
| --- | --- |
| TitanX (Maxwell/Pascal) | sm_52 |
| GTX 960M | sm_50 |
| GTX 1080 (Ti) | sm_61 |
| Grid K520 (AWS g2.2xlarge) | sm_30 |
| Tesla K80 (AWS p2.xlarge) | sm_37 |
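If you manage several machines, the table above can be kept as a small lookup. This is only an illustrative sketch: `GPU_ARCH` and `arch_flag` are hypothetical names, the dictionary contains just the entries listed here, and you still need to edit make.sh yourself with the resulting value.

```python
# GPU model -> compute architecture, copied from the table above.
GPU_ARCH = {
    "TitanX (Maxwell/Pascal)": "sm_52",
    "GTX 960M": "sm_50",
    "GTX 1080 (Ti)": "sm_61",
    "Grid K520 (AWS g2.2xlarge)": "sm_30",
    "Tesla K80 (AWS p2.xlarge)": "sm_37",
}


def arch_flag(gpu_model):
    """Return the -arch value to put in make.sh for a GPU listed in the table."""
    try:
        return "-arch=" + GPU_ARCH[gpu_model]
    except KeyError:
        raise ValueError(
            "GPU model %r is not in the table; look up its architecture manually"
            % gpu_model
        )
```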

Install all the Python dependencies using pip:

pip install -r requirements.txt

Compile the CUDA dependencies using the following commands:

cd lib
sh make.sh

This compiles all the modules you need, including NMS, ROI_Pooling, ROI_Align and ROI_Crop. The default version is compiled with Python 2.7; if you are using a different Python version, please compile it yourself.

Usage

train voc2007:

CUDA_VISIBLE_DEVICES=3 python3 trainval_net.py exp_name --dataset pascal_voc --net detnet59 --bs 2 --nw 4 --lr 1e-3 --epochs 12 --save_dir weights --cuda --use_tfboard True

test voc2007:

CUDA_VISIBLE_DEVICES=3 python3 test_net.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights

run demo.py:

Before running the demo, create a directory named 'demo_images' and put images (VOC images) in it. You can download the pretrained models listed in the tables above.

CUDA_VISIBLE_DEVICES=0 python3 demo.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights --image_dir demo_images --result_dir vis_results

using soft_nms when testing:

CUDA_VISIBLE_DEVICES=3 python3 test_net.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights --soft_nms
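The two test commands above differ only in the trailing --soft_nms flag, so when evaluating several checkpoints it can help to assemble them programmatically. The helper below is a sketch rather than part of the repository: `build_test_command` is a hypothetical name, and it uses only the flags that appear in the commands above.

```python
def build_test_command(exp_name, dataset, session, epoch, checkpoint,
                       load_dir="weights", soft_nms=False):
    """Assemble the test_net.py argument list shown in the commands above."""
    cmd = [
        "python3", "test_net.py", exp_name,
        "--dataset", dataset,
        "--net", "detnet59",
        "--checksession", str(session),
        "--checkepoch", str(epoch),
        "--checkpoint", str(checkpoint),
        "--cuda",
        "--load_dir", load_dir,
    ]
    if soft_nms:
        cmd.append("--soft_nms")  # enable soft NMS at test time only
    return cmd
```

The returned list can be passed to `subprocess.run`, with `CUDA_VISIBLE_DEVICES` set through its `env` argument to select the GPU.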

Before training on voc07+12, you must set ASPECT_CROPPING to False in detnet59.yml; otherwise you will encounter errors during training.
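The setting can be changed by hand in a text editor. If you prefer to script it, the sketch below rewrites the value with a line-based regular expression so that it does not depend on a YAML library or on where the key is nested. `disable_aspect_cropping` is a hypothetical helper, and the sample YAML in the usage note is made up for illustration; the real layout of detnet59.yml may differ.

```python
import re

# Matches a line such as "  ASPECT_CROPPING: True" and captures everything
# up to the value, so indentation and spacing are preserved.
_ASPECT_RE = re.compile(r"^([ \t]*ASPECT_CROPPING[ \t]*:[ \t]*)\S+", re.MULTILINE)


def disable_aspect_cropping(yaml_text):
    """Return yaml_text with every ASPECT_CROPPING value set to False."""
    new_text, count = _ASPECT_RE.subn(r"\g<1>False", yaml_text)
    if count == 0:
        raise ValueError("no ASPECT_CROPPING entry found in the config text")
    return new_text
```

For instance, the hypothetical input `"TRAIN:\n  ASPECT_CROPPING: True\n"` becomes `"TRAIN:\n  ASPECT_CROPPING: False\n"`. Read the file, pass its contents through the function, and write the result back.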

train voc07+12:

CUDA_VISIBLE_DEVICES=3 python3 trainval_net.py exp_name2 --dataset pascal_voc_0712 --net detnet59 --bs 1 --nw 4 --lr 1e-3 --epochs 12 --save_dir weights --cuda --use_tfboard True

train coco:

CUDA_VISIBLE_DEVICES=6,7 python3 trainval_net.py detnetv1.0 --dataset coco --net detnet59 --bs 4 --nw 4 --lr 4e-3 --epochs 12 --save_dir weights --cuda --lscale --mGPUs

test coco:

CUDA_VISIBLE_DEVICES=2 python3 test_net.py detnetv1.0 --dataset coco --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 58632 --cuda --load_dir weights --ls

TODO

  • Train and test on COCO (Done)