Platform: Ubuntu 18.04.4
1. pytorch==1.4.0
2. torchvision==0.5.0
3. python==3.6.9
4. numpy==1.17.0
5. opencv-python==4.1.1.26
6. tqdm==4.46.0
7. thop==0.0.31
8. Cython==0.29.19
9. matplotlib==3.2.1
10. pycocotools==2.0.0
11. apex==0.1
If you use Python 3.7, use the following commands to install apex:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
If the above command fails, you can install apex without the C++/CUDA extensions:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir ./
Training with apex reduces GPU memory usage by 25%-30% at the cost of slower training; the trained model performs the same as one trained without apex.
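For reference, here is a minimal sketch of how apex mixed-precision training is usually wired into a training loop; the tiny model, the random data, and opt_level="O1" are illustrative assumptions, not this repo's exact code:
```python
import torch
import torch.nn as nn
from apex import amp

# Tiny stand-in model and data; the repo's real models and loaders live in
# each experiment folder.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 10)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# opt_level="O1" patches common ops to run in FP16 (a typical default).
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

criterion = nn.CrossEntropyLoss()
images = torch.randn(4, 3, 32, 32).cuda()
labels = torch.randint(0, 10, (4,)).cuda()

optimizer.zero_grad()
loss = criterion(model(images), labels)
# Scale the loss so FP16 gradients don't underflow, then backprop as usual.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```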
You can download all my pretrained models here: https://drive.google.com/drive/folders/1rewWULfXsvE0voA-A_ooTWwadq9lsk3X?usp=sharing .
If you are in China, you can download them here:
Link: https://pan.baidu.com/s/1b6m70EQclE8aG-A2tkWrhQ
Extraction code: aieg
If you want to reproduce my ImageNet pretrained models, you need to download the ILSVRC2012 dataset and arrange the folders as follows:
ILSVRC2012
|
|-----train----1000 subclass folders
|
|-----val------1000 subclass folders
Make sure each class uses the same folder name in both the train and val folders.
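With this layout, torchvision's ImageFolder can load both splits directly. A minimal sketch (the transform values are the common ImageNet defaults; each experiment's config.py is authoritative):
```python
import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Common ImageNet preprocessing; see each experiment's config.py for the
# exact values used in this repo.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
val_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# ImageFolder maps each subclass folder name to a label index, which is why
# train and val must use identical class folder names.
train_set = datasets.ImageFolder("ILSVRC2012/train", transform=train_transform)
val_set = datasets.ImageFolder("ILSVRC2012/val", transform=val_transform)
print(len(train_set.classes))  # expect 1000
```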
If you want to reproduce my COCO pretrained models, you need to download the COCO2017 dataset and arrange the folders as follows:
COCO2017
|
|-----annotations----all label jsons
|
|                 |----train2017
|-----images------|----val2017
                  |----test2017
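With this layout, you can sanity-check the annotations with pycocotools. A minimal sketch, assuming the official COCO2017 annotation file names:
```python
from pycocotools.coco import COCO

# instances_train2017.json is one of the official annotation files expected
# under COCO2017/annotations/.
coco = COCO("COCO2017/annotations/instances_train2017.json")
print(len(coco.getImgIds()))  # number of training images
print(len(coco.getCatIds()))  # expect 80 detection categories

# Look up one image record and its box annotations.
img_id = coco.getImgIds()[0]
img_info = coco.loadImgs(img_id)[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
print(img_info["file_name"], len(anns))
```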
If you want to reproduce my VOC pretrained models, you need to download the VOC2007 and VOC2012 datasets and arrange the folders as follows:
VOCdataset
|                 |----Annotations
|                 |----ImageSets
|----VOC2007------|----JPEGImages
|                 |----SegmentationClass
|                 |----SegmentationObject
|
|                 |----Annotations
|                 |----ImageSets
|----VOC2012------|----JPEGImages
|                 |----SegmentationClass
|                 |----SegmentationObject
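With this layout, each file in Annotations/ is a per-image XML. A minimal sketch of reading one (the file name is illustrative):
```python
import xml.etree.ElementTree as ET

# Each file in Annotations/ describes one image in JPEGImages/.
tree = ET.parse("VOCdataset/VOC2007/Annotations/000001.xml")  # illustrative name
root = tree.getroot()

for obj in root.findall("object"):
    name = obj.find("name").text  # class label, e.g. "dog"
    bndbox = obj.find("bndbox")
    # VOC boxes are 1-based pixel coordinates: xmin, ymin, xmax, ymax.
    box = [int(float(bndbox.find(k).text)) for k in ("xmin", "ymin", "xmax", "ymax")]
    print(name, box)
```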
To reproduce an experiment result, enter the corresponding category folder under experiments, then enter the specific experiment folder. Each experiment folder has its own config.py and train.py.
If the experiment uses nn.DataParallel for training, add this line in train.py to specify the GPUs to use:
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
then run this command to train:
python train.py
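In this mode, train.py typically wraps the model in nn.DataParallel, which splits each batch across the visible GPUs. A minimal sketch with a stand-in model (not the repo's exact code):
```python
import os
import torch
import torch.nn as nn

# Must be set before the first CUDA call so only these GPUs are visible.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# Stand-in model; the real model comes from the experiment's own code.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).cuda()
# DataParallel replicates the model and splits each batch across the GPUs.
model = nn.DataParallel(model)
out = model(torch.randn(8, 3, 32, 32).cuda())
print(out.shape)
```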
If the experiment uses nn.DistributedDataParallel for training, add this line in train.py to specify the GPUs to use:
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
then run this command to train:
python -m torch.distributed.launch --nproc_per_node=2 --master_addr 127.0.0.1 --master_port 20001 train.py
Make sure nproc_per_node matches the number of GPUs you use, and that master_addr/master_port differ from those of any other running experiment.
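For reference, a minimal sketch of the pieces a train.py needs in this mode (the stand-in model is illustrative; each experiment's own train.py is authoritative):
```python
import argparse
import os
import torch
import torch.nn as nn
import torch.distributed as dist

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# torch.distributed.launch passes --local_rank to each spawned process.
parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
args = parser.parse_args()

# launch also sets MASTER_ADDR/MASTER_PORT in the environment, so the
# default env:// init method works here.
dist.init_process_group(backend="nccl")
torch.cuda.set_device(args.local_rank)

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).cuda()
model = nn.parallel.DistributedDataParallel(model, device_ids=[args.local_rank])

# The DataLoader should use torch.utils.data.distributed.DistributedSampler
# so each process sees a distinct shard of the dataset.
```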
Trained on COCO2017_train, tested on COCO2017_val.
mAP is AP at IoU=0.5:0.95, area=all, maxDets=100 (COCOeval stats[0]); mAR is AR at IoU=0.5:0.95, area=all, maxDets=100 (COCOeval stats[8]).
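Both numbers come straight from COCOeval's summary vector. A minimal sketch (detections.json is an illustrative name for your model's result file):
```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("COCO2017/annotations/instances_val2017.json")
coco_dt = coco_gt.loadRes("detections.json")  # illustrative result file name

coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()

mAP = coco_eval.stats[0]  # AP @ IoU=0.5:0.95, area=all, maxDets=100
mAR = coco_eval.stats[8]  # AR @ IoU=0.5:0.95, area=all, maxDets=100
print(mAP, mAR)
```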
My resize=667 is equivalent to resize=400 in the RetinaNet paper (https://arxiv.org/pdf/1708.02002.pdf), and my resize=1000 is equivalent to resize=600 in that paper.
Network | resize | batch | gpu-num | apex | syncbn | epoch5-mAP-mAR-loss | epoch10-mAP-mAR-loss | epoch12-mAP-mAR-loss |
---|---|---|---|---|---|---|---|---|
ResNet50-RetinaNet | 667 | 24 | 2 | yes | no | 0.253,0.361,0.61 | 0.287,0.398,0.51 | 0.293,0.401,0.49 |
ResNet101-RetinaNet | 667 | 16 | 2 | yes | no | 0.254,0.362,0.60 | 0.290,0.398,0.51 | 0.296,0.402,0.48 |
ResNet50-RetinaNet | 1000 | 16 | 4 | yes | no | 0.305,0.425,0.55 | 0.306,0.429,0.55 | 0.333,0.456,0.46 |
For ResNet50-RetinaNet-resize1000 training, I initialize the model with the trained ResNet50-RetinaNet-resize667 weights.
For ResNet50-RetinaNet-resize667, the per-image inference time is 116 ms (batch=1, on one GTX 1070 Max-Q).
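For reference, a minimal sketch of how such a per-image latency can be measured; the stand-in torchvision model and the iteration counts are illustrative:
```python
import time
import torch
import torchvision

# Stand-in model; the repo's RetinaNet would be built from its experiment code.
model = torchvision.models.resnet50().cuda().eval()
image = torch.randn(1, 3, 667, 667).cuda()

with torch.no_grad():
    for _ in range(10):       # warm-up iterations
        model(image)
    torch.cuda.synchronize()  # CUDA is asynchronous; sync before timing
    start = time.time()
    for _ in range(100):
        model(image)
    torch.cuda.synchronize()
print((time.time() - start) / 100 * 1000, "ms per image")
```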
Network | resize | batch | gpu-num | apex | syncbn | epoch5-mAP-mAR-loss | epoch10-mAP-mAR-loss | epoch12-mAP-mAR-loss | epoch15-mAP-mAR-loss | epoch20-mAP-mAR-loss | epoch24-mAP-mAR-loss |
---|---|---|---|---|---|---|---|---|---|---|---|
ResNet50-FCOS | 667 | 32 | 2 | yes | no | 0.162,0.289,1.31 | 0.226,0.342,1.21 | 0.248,0.370,1.20 | 0.217,0.343,1.17 | 0.282,0.409,1.14 | 0.286,0.409,1.12 |
ResNet101-FCOS | 667 | 24 | 2 | yes | no | 0.206,0.325,1.29 | 0.237,0.359,1.20 | 0.263,0.380,1.18 | 0.277,0.400,1.15 | 0.260,0.385,1.13 | 0.291,0.416,1.10 |
ResNet50-FCOS | 1000 | 32 | 4 | yes | no | 0.305,0.443,1.15 | 0.315,0.451,1.14 | / | / | / | / |
My resize=667 is equivalent to resize=400 in the FCOS paper (https://arxiv.org/pdf/1904.01355.pdf), and my resize=1000 is equivalent to resize=600 in that paper.
This FCOS implementation doesn't include GroupNorm or center sampling.
For ResNet50-FCOS-resize1000 training, I initialize the model with the trained ResNet50-FCOS-resize667 weights.
For ResNet50-FCOS-resize667, the per-image inference time is 103 ms (batch=1, on one GTX 1070 Max-Q).
You can see more model training details in detection_experiments/experiment_folder/.
Trained on VOC2007 trainval + VOC2012 trainval, tested on VOC2007 test, using 11-point interpolated AP.
Network | resize | batch | gpu-num | apex | syncbn | epoch5-mAP-loss | epoch10-mAP-loss | epoch15-mAP-loss | epoch20-mAP-loss |
---|---|---|---|---|---|---|---|---|---|
ResNet50-RetinaNet | 667 | 24 | 2 | yes | no | 0.660,0.62 | 0.705,0.44 | 0.723,0.35 | 0.732,0.30 |
ResNet50-RetinaNet-usecocopre | 667 | 24 | 2 | yes | no | 0.789,0.34 | 0.780,0.26 | 0.776,0.22 | 0.770,0.19 |
You can see more model training details in detection_experiments/experiment_folder/.
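For reference, 11-point interpolated AP averages the maximum precision at the 11 recall thresholds 0, 0.1, ..., 1.0. A minimal sketch, assuming you already have per-class precision/recall arrays:
```python
import numpy as np

def voc_ap_11point(recall, precision):
    """11-point interpolated AP as defined for VOC2007.

    recall/precision: arrays for one class, computed over detections
    sorted by descending confidence score.
    """
    ap = 0.0
    for t in np.arange(0.0, 1.1, 0.1):
        # Interpolated precision: max precision at any recall >= t (0 if none).
        mask = recall >= t
        p = precision[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap

# Toy example:
recall = np.array([0.1, 0.2, 0.4, 0.6])
precision = np.array([1.0, 0.8, 0.6, 0.5])
print(voc_ap_11point(recall, precision))
```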
CIFAR-100 results, trained in nn.DataParallel mode:
Network | warm up | lr decay | total epochs | Top-1 error |
---|---|---|---|---|
ResNet-18 | no | multistep | 200 | 21.59 |
ResNet-34 | no | multistep | 200 | 21.16 |
ResNet-50 | no | multistep | 200 | 22.12 |
ResNet-101 | no | multistep | 200 | 19.84 |
ResNet-152 | no | multistep | 200 | 19.01 |
You can see more model training details in cifar100_experiments/resnet50cifar/.
Network | warm up | lr decay | total epochs | Top-1 error |
---|---|---|---|---|
ResNet-18 | no | multistep | 100 | 29.684 |
ResNet-34-half | no | multistep | 100 | 32.528 |
ResNet-34 | no | multistep | 100 | 26.264 |
ResNet-50-half | no | multistep | 100 | 27.934 |
ResNet-50 | no | multistep | 100 | 23.488 |
ResNet-101 | no | multistep | 100 | 22.276 |
ResNet-152 | no | multistep | 100 | 21.436 |
EfficientNet-b0 | yes,5 epochs | cosine | 100 | 24.492 |
EfficientNet-b1 | yes,5 epochs | cosine | 100 | 23.092 |
EfficientNet-b2 | yes,5 epochs | cosine | 100 | 22.224 |
EfficientNet-b3 | yes,5 epochs | cosine | 100 | 21.884 |
DarkNet-19 | no | multistep | 100 | 26.132 |
DarkNet-53 | no | multistep | 100 | 22.992 |
VovNet-19-slim-depthwise-se | no | multistep | 100 | 33.276 |
VovNet-19-slim-se | no | multistep | 100 | 30.646 |
VovNet-19-se | no | multistep | 100 | 25.364 |
VovNet-39-se | no | multistep | 100 | 22.662 |
VovNet-57-se | no | multistep | 100 | 22.014 |
VovNet-99-se | no | multistep | 100 | 21.608 |
RegNetY-200MF | yes,5 epochs | cosine | 100 | 29.904 |
RegNetY-400MF | yes,5 epochs | cosine | 100 | 26.210 |
RegNetY-600MF | yes,5 epochs | cosine | 100 | 25.276 |
RegNetY-800MF | yes,5 epochs | cosine | 100 | 24.006 |
RegNetY-1.6GF | yes,5 epochs | cosine | 100 | 22.692 |
RegNetY-3.2GF | yes,5 epochs | cosine | 100 | 21.092 |
RegNetY-4.0GF | yes,5 epochs | cosine | 100 | 21.684 |
RegNetY-6.4GF | yes,5 epochs | cosine | 100 | 21.230 |
All nets are trained with input size 224x224, except DarkNet (input size 256x256) and EfficientNet (which uses its own per-model input sizes).
Training ResNet-50 with batch_size=256 requires at least four 2080 Ti GPUs and takes about three to four days.
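The EfficientNet and RegNetY rows above use a 5-epoch warm up with cosine lr decay. A minimal sketch of such a schedule, assuming per-epoch stepping and illustrative hyperparameters (not this repo's exact scheduler):
```python
import math
import torch

def warmup_cosine_lr(epoch, warmup_epochs=5, total_epochs=100, base_lr=0.1):
    """Linear warm up for warmup_epochs, then cosine decay to zero."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# Applying it per epoch to a hypothetical optimizer:
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
for epoch in range(100):
    lr = warmup_cosine_lr(epoch)
    for group in optimizer.param_groups:
        group["lr"] = lr
    # ... one epoch of training at this lr ...
```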
Network | sync-BN | warm up | lr decay | total epochs | Top-1 error |
---|---|---|---|---|---|
ResNet-50 | no | no | multistep | 100 | 23.72 |
ResNet-50 | yes | no | multistep | 100 | 25.44 |
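Sync-BN computes BatchNorm statistics across all GPUs instead of per GPU. A minimal sketch of enabling it with PyTorch's built-in converter (apex's apex.parallel.convert_syncbn_model is an alternative); the stand-in model is illustrative:
```python
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
# Replace every BatchNorm layer with SyncBatchNorm; the statistics are then
# reduced across all processes in the default DDP process group, so this
# only has an effect after torch.distributed.init_process_group.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
# model = nn.parallel.DistributedDataParallel(model.cuda(), device_ids=[local_rank])
```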
You can see more model training details in imagenet_experiments/experiment_folder/.
If you find my work useful in your research, please consider citing:
@inproceedings{zgcr,
title={pytorch-ImageNet-CIFAR-COCO-VOC-training},
author={Chaoran Zhuge},
year={2020}
}