This is a PyTorch re-implementation of the YOLOv4 architecture, based on the official darknet implementation AlexeyAB/darknet, with support for PASCAL VOC, COCO and custom datasets.
Name | Train dataset | Test dataset | Test size | mAP | Inference time (ms) | Params (M) | Model link |
---|---|---|---|---|---|---|---|
Mobilenetv2-YOLOv4 | VOC trainval (07+12) | VOC test (07) | 416 | 0.851 | 11.29 | 46.34 | args |
Mobilenetv3-YOLOv4 is arriving! (You only need to change MODEL_TYPE in config/yolov4_config.py.)
This repo adds some useful attention methods to the backbone. The following pictures illustrate them:
- SENet (CVPR 2018)
- CBAM (ECCV 2018)
This repo is simple to use, easy to read and straightforward to extend compared with others.
- Nvidia GeForce RTX 2080 Ti
- CUDA 10.0
- cuDNN 7.0
- Windows or Linux
- Python 3.6
- DO-Conv (arXiv 2020) (torch>=1.2)
- Attention
- FP16 training
- Mish
- Custom data
- Data augmentation (RandomHorizontalFlip, RandomCrop, RandomAffine, Resize)
- Multi-scale training (320 to 640)
- Focal loss
- CIoU
- Label smoothing
- Mixup
- Cosine lr
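Several of the components listed above are small formulas, so a sketch may help when reading the training code. The following is a minimal pure-Python illustration of Mish, label smoothing, a cosine learning-rate schedule and CIoU, written from the published definitions. The function names, default values and box layout `(xmin, ymin, xmax, ymax)` are assumptions for illustration and are not taken from this repo's modules.

```python
import math


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    # softplus(x) is approximately x for large x; this avoids overflow in exp().
    softplus = x if x > 20 else math.log1p(math.exp(x))
    return x * math.tanh(softplus)


def smooth_labels(one_hot, eps=0.01):
    """Label smoothing: move eps of the probability mass to a uniform prior."""
    k = len(one_hot)
    return [y * (1.0 - eps) + eps / k for y in one_hot]


def cosine_lr(step, total_steps, lr_max=1e-4, lr_min=1e-6):
    """Cosine annealing from lr_max (step 0) down to lr_min (last step)."""
    t = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))


def ciou(a, b):
    """Complete IoU of two boxes (xmin, ymin, xmax, ymax); the loss is 1 - CIoU."""
    aw, ah = a[2] - a[0], a[3] - a[1]
    bw, bh = b[2] - b[0], b[3] - b[1]
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    iou = inter / (aw * ah + bw * bh - inter)
    # Squared distance between the two box centers.
    rho2 = ((a[0] + a[2] - b[0] - b[2]) ** 2 + (a[1] + a[3] - b[1] - b[3]) ** 2) / 4.0
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (math.atan(bw / bh) - math.atan(aw / ah)) ** 2
    alpha = v / (1.0 - iou + v) if v > 0 else 0.0
    return iou - rho2 / c2 - alpha * v


print(mish(1.0))
print(smooth_labels([0, 1, 0, 0], eps=0.1))
print(cosine_lr(0, 100), cosine_lr(100, 100))
print(ciou((0, 0, 2, 2), (0, 0, 2, 2)), ciou((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike plain IoU, CIoU stays informative for non-overlapping boxes: the center-distance term still gives a gradient direction when the intersection is zero.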
Run the installation script to install all the dependencies. You need to provide the conda install path (e.g. ~/anaconda3) and the name for the created conda environment (here YOLOv4-pytorch).
pip3 install -r requirements.txt --user
Note: The install script has been tested on Ubuntu 18.04 and Windows 10 systems. In case of issues, check the detailed installation instructions.
git clone https://github.com/argusswift/YOLOv4-pytorch.git
Update the "PROJECT_PATH" in config/yolov4_config.py.
# Download the data.
cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
# Extract the data.
tar -xvf VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
- Download links: VOC 2012_trainval, VOC 2007_trainval, VOC2007_test
# Step 1: download the following data and annotations
2017 Train images [118K/18GB]
2017 Val images [5K/1GB]
2017 Test images [41K/6GB]
2017 Train/Val annotations [241MB]
# Step 2: arrange the data in the following structure
COCO
---train
---test
---val
---annotations
- Download links: train2017_img, train2017_ann, val2017_img, val2017_ann, test2017_img, test2017_list
- Put them in the data directory, and update "DATA_PATH" in config/yolov4_config.py.
- (For COCO) Use coco_to_voc.py to convert the COCO data format to the VOC data format.
- Convert the data format: use utils/voc.py or utils/coco.py to convert the Pascal VOC *.xml format (or the COCO *.json format) to the *.txt format (Image_path xmin0,ymin0,xmax0,ymax0,class0 xmin1,ymin1,xmax1,ymax1,class1 ...).
- Darknet pre-trained weights: yolov4
- Mobilenet pre-trained weights: mobilenetv2 (code: args), mobilenetv3 (code: args)
- Create the directory weight/ in the YOLOv4 directory and put the weight file in it.
- Set MODEL_TYPE in config/yolov4_config.py before you run the training program.
- Put the pictures of your dataset into the JPEGImages folder, and the annotation files into the Annotations folder.
- Use the xml_to_txt.py file to write the list of training and test files to ImageSets/Main/*.txt.
- Convert the data format: use utils/voc.py or utils/coco.py to convert the Pascal VOC *.xml format (or the COCO *.json format) to the *.txt format (Image_path xmin0,ymin0,xmax0,ymax0,class0 xmin1,ymin1,xmax1,ymax1,class1 ...).
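To make the *.txt line format above concrete, here is a minimal sketch that turns one Pascal VOC annotation into such a line and parses it back. It is not the code in utils/voc.py: the helper names, the two-class list and the sample annotation are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical class list; the real order comes from your own configuration.
CLASSES = ["person", "dog"]

# Hypothetical annotation in the usual Pascal VOC layout.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>"""


def voc_xml_to_line(xml_text, image_path, classes=CLASSES):
    """Turn one VOC annotation into 'path xmin,ymin,xmax,ymax,class ...'."""
    root = ET.fromstring(xml_text)
    fields = [image_path]
    for obj in root.iter("object"):
        name = obj.findtext("name", default="").strip()
        if name not in classes:
            continue  # skip classes that are not being trained on
        box = obj.find("bndbox")
        coords = [int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        fields.append(",".join(str(v) for v in coords + [classes.index(name)]))
    return " ".join(fields)


def parse_line(line):
    """Inverse direction: split a line into the image path and a list of boxes."""
    path, *boxes = line.split()
    return path, [tuple(int(v) for v in b.split(",")) for b in boxes]


line = voc_xml_to_line(SAMPLE_XML, "JPEGImages/000001.jpg")
print(line)
print(parse_line(line))
```

Because fields are separated by spaces, an image path that itself contains a space would break this simple parser.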
Run the following command to start training; see config/yolov4_config.py for the details. Set DATA_TYPE to VOC or COCO before you run the training program.
CUDA_VISIBLE_DEVICES=0 nohup python -u train.py --weight_path weight/yolov4.weights --gpu_id 0 > nohup.log 2>&1 &
Resuming training is also supported: add --resume and last.pt will be loaded automatically, using the command
CUDA_VISIBLE_DEVICES=0 nohup python -u train.py --weight_path weight/last.pt --gpu_id 0 > nohup.log 2>&1 &
Modify your detection image path: DATA_TEST=/path/to/your/test_data # your own images
For the VOC dataset:
CUDA_VISIBLE_DEVICES=0 python3 eval_voc.py --weight_path weight/best.pt --gpu_id 0 --visiual $DATA_TEST --eval --mode det
For the COCO dataset:
CUDA_VISIBLE_DEVICES=0 python3 eval_coco.py --weight_path weight/best.pt --gpu_id 0 --visiual $DATA_TEST --eval --mode det
The images can be seen in output/. You should see pictures like the following:
Modify:
- video_path:/path/to/your/video
- weight_path:/path/to/your/weight
- output_dir:/path/to/save/dir
CUDA_VISIBLE_DEVICES=0 python3 video_test.py --weight_path best.pt --gpu_id 0 --video_path video.mp4 --output_dir /path/to/save/dir
Modify your evaluation dataset path: DATA_PATH=/path/to/your/test_data # your own images
For the VOC dataset:
CUDA_VISIBLE_DEVICES=0 python3 eval_voc.py --weight_path weight/best.pt --gpu_id 0 --visiual $DATA_TEST --eval --mode val
If you want to see the picture above, use the following commands:
# To get ground truths of your dataset
python3 utils/get_gt_txt.py
# To plot P-R curve and calculate mean average precision
python3 utils/get_map.py
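For readers unfamiliar with how a P-R curve becomes an average precision value, the following is a minimal sketch of the generic all-point interpolated AP used in VOC-style evaluation. It is an illustration under stated assumptions, not the code in utils/get_map.py: the function names are hypothetical, and detections are assumed to be already matched to ground truth (each one flagged as a true or false positive).

```python
def pr_curve(detections, num_gt):
    """Precision/recall after each detection, in order of descending score.

    detections: list of (score, is_true_positive) pairs for one class.
    num_gt: number of ground-truth boxes of that class.
    """
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    return recalls, precisions


def average_precision(recalls, precisions):
    """Area under the precision envelope (all-point interpolation)."""
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# Hypothetical detections for one class with two ground-truth boxes.
dets = [(0.9, True), (0.8, False), (0.7, True)]
rec, prec = pr_curve(dets, num_gt=2)
ap = average_precision(rec, prec)
print(rec, prec, ap)
```

The mean average precision (mAP) is then the mean of this value over all classes.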
Modify your evaluation dataset path: DATA_PATH=/path/to/your/test_data # your own images
CUDA_VISIBLE_DEVICES=0 python3 eval_coco.py --weight_path weight/best.pt --gpu_id 0 --visiual $DATA_TEST --eval --mode val
type=bbox
Running per image evaluation... DONE (t=0.34s).
Accumulating evaluation results... DONE (t=0.08s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.438
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.607
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.469
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.253
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.486
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.567
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.342
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.571
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.632
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.458
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.691
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.790
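The IoU ranges in the summary above follow the COCO convention: IoU=0.50:0.95 means the metric is averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, while IoU=0.50 and IoU=0.75 use a single threshold. A minimal sketch of what passing a threshold means for one predicted box is shown below; the helper names are illustrative and are not part of this repo or of the COCO evaluation API.

```python
def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.50:0.95".
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, gt):
    """Thresholds at which this prediction would count as a match for gt."""
    overlap = iou(pred, gt)
    return [t for t in COCO_IOU_THRESHOLDS if overlap >= t]


pred, gt = (0, 0, 10, 10), (1, 1, 10, 10)
print(iou(pred, gt))
print(thresholds_passed(pred, gt))
```

A loosely localized box therefore contributes to AP at IoU=0.50 but not at the stricter thresholds, which is why the 0.50:0.95 figure is always the lowest of the three.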
python3 utils/modelsize.py
Set showatt=True in val_voc.py and you will see the heatmaps generated from the network's output.
For the VOC dataset:
CUDA_VISIBLE_DEVICES=0 python3 eval_voc.py --weight_path weight/best.pt --gpu_id 0 --visiual $DATA_TEST --eval
For the COCO dataset:
CUDA_VISIBLE_DEVICES=0 python3 eval_coco.py --weight_path weight/best.pt --gpu_id 0 --visiual $DATA_TEST --eval
The heatmaps can be seen in output/, like this: