I have always used the batch size to scale the loss:
loss = loss.sum() / batch_size
However, I recently realized that this is not optimal. To address this, I now scale the loss by the total number of positive samples:
loss = loss.sum() / num_pos
After this change, some tricks that I had tried before without success started to work, so I am applying them to make my YOLO models better. Once these optimizations are complete, I will upload the latest weight files immediately.
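A minimal sketch of the difference between the two normalization schemes (the tensor names here are illustrative, not the exact variables in my loss code):

```python
import torch

def normalize_loss(per_sample_loss, pos_mask, batch_size):
    # per_sample_loss: un-reduced loss values, one per prediction
    # pos_mask: boolean tensor marking the positive samples
    total = per_sample_loss.sum()

    # old scheme: normalize by the batch size
    loss_by_batch = total / batch_size

    # new scheme: normalize by the number of positive samples,
    # so images with many objects do not dominate the gradient scale
    num_pos = pos_mask.float().sum().clamp(min=1.0)  # avoid division by zero
    loss_by_pos = total / num_pos

    return loss_by_batch, loss_by_pos
```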
Recently, I rebuilt my YOLO-Family project!
- We recommend using Anaconda to create a conda environment:
conda create -n yolo python=3.6
- Then, activate the environment:
conda activate yolo
- Requirements:
pip install -r requirements.txt
PyTorch >= 1.1.0 and Torchvision >= 0.3.0
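You can quickly verify that the installed versions satisfy the requirement (a trivial sanity check, not part of the repo):

```python
import torch
import torchvision

# PyTorch >= 1.1.0 and Torchvision >= 0.3.0 are expected
print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```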
My better YOLO family
In this project, you can enjoy:
- a new and stronger YOLOv1
- a new and stronger YOLOv2
- YOLOv3 with DilatedEncoder
- YOLOv4 (I'm trying to make it better)
- YOLO-Tiny
- YOLO-Nano
- Try to make my YOLOv4 better.
- Train my YOLOv1/YOLOv2 with ViT-Base (pretrained by Masked Autoencoder).
You can download all weights, including my DarkNet-53, CSPDarkNet-53, MAE-ViT, and YOLO weights, from the following links.
Link: Hold on ...
Link:https://pan.baidu.com/s/1Cin9R52wfubD4xZUHHCRjg
Password:aigz
Tricks in this project:
- Augmentations: Flip + Color jitter + RandomCrop + Multi-scale
- Model EMA
- GIoU (see the sketch after this list)
- Mosaic Augmentation for my YOLOv4
- Multiple positive samples for my YOLOv4
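Below is a minimal, self-contained sketch of the GIoU computation that is typically used as a regression loss (1 - GIoU); it illustrates the idea and is not necessarily identical to the implementation in this repo:

```python
import torch

def giou(boxes1, boxes2):
    """Generalized IoU between paired boxes in (x1, y1, x2, y2) format.

    boxes1 and boxes2 must have the same shape [N, 4].
    """
    # areas of each box
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])

    # intersection
    lt = torch.max(boxes1[:, :2], boxes2[:, :2])
    rb = torch.min(boxes1[:, 2:], boxes2[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]

    union = area1 + area2 - inter
    iou = inter / union.clamp(min=1e-7)

    # smallest enclosing box
    lt_c = torch.min(boxes1[:, :2], boxes2[:, :2])
    rb_c = torch.max(boxes1[:, 2:], boxes2[:, 2:])
    wh_c = (rb_c - lt_c).clamp(min=0)
    area_c = wh_c[:, 0] * wh_c[:, 1]

    # GIoU = IoU - (C - union) / C; the loss is then 1 - GIoU
    return iou - (area_c - union) / area_c.clamp(min=1e-7)
```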
Results on the COCO validation set:
| Model | Backbone | Size | FPS | AP | AP50 | AP75 | APs | APm | APl | GFLOPs | Params |
|-------|----------|------|-----|----|------|------|-----|-----|-----|--------|--------|
| YOLO-Nano | ShuffleNetv2-1.0x | 512 | | 21.6 | 40.0 | 20.5 | 7.4 | 22.7 | 32.3 | 1.65 | 1.86M |
| YOLO-Tiny | CSPDarkNet-Tiny | 512 | | 26.6 | 46.1 | 26.9 | 13.5 | 30.0 | 35.0 | 5.52 | 7.66M |
| YOLO-TR | ViT-B | 384 | | | | | | | | | |
| YOLOv1 | ResNet50 | 640 | | 35.2 | 54.7 | 37.1 | 14.3 | 39.5 | 53.4 | 41.96 | 44.54M |
| YOLOv2 | ResNet50 | 640 | | 36.3 | 56.6 | 37.7 | 15.1 | 41.1 | 54.0 | 42.10 | 44.89M |
| YOLOv3-DE | DarkNet53 | 640 | | 38.7 | 60.2 | 40.7 | 21.3 | 41.7 | 51.7 | 76.41 | 57.25M |
| YOLOv4 | CSPDarkNet53 | 640 | | 40.5 | 60.4 | 43.5 | 24.2 | 44.8 | 52.0 | 60.55 | 52.00M |
The FPS of all YOLO detectors is measured on a single 2080Ti GPU with 640 × 640 input size.
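For reference, FPS can be measured roughly as follows (a sketch with a dummy input and warm-up; the exact measurement script in this repo may differ):

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, img_size=640, num_iters=100, device="cuda"):
    model = model.to(device).eval()
    x = torch.randn(1, 3, img_size, img_size, device=device)
    for _ in range(10):                 # warm-up iterations
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(num_iters):
        model(x)
    torch.cuda.synchronize()
    return num_iters / (time.time() - start)
```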
I will upload some visualization results later.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl | GFLOPs | Params |
|-------|-----|----|------|------|-----|-----|-----|--------|--------|
| YOLO-Nano-320 | | 17.2 | 32.9 | 15.8 | 3.5 | 15.7 | 31.4 | 0.64 | 1.86M |
| YOLO-Nano-416 | | 20.2 | 37.7 | 19.3 | 5.5 | 19.7 | 33.5 | 1.09 | 1.86M |
| YOLO-Nano-512 | | 21.6 | 40.0 | 20.5 | 7.4 | 22.7 | 32.3 | 1.65 | 1.86M |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl | GFLOPs | Params |
|-------|-----|----|------|------|-----|-----|-----|--------|--------|
| YOLO-Tiny-320 | | 24.5 | 42.4 | 24.5 | 8.9 | 26.1 | 38.8 | 2.16 | 7.66M |
| YOLO-Tiny-416 | | 25.7 | 44.4 | 25.9 | 11.7 | 27.8 | 36.7 | 3.64 | 7.66M |
| YOLO-Tiny-512 | | 26.6 | 46.1 | 26.9 | 13.5 | 30.0 | 35.0 | 5.52 | 7.66M |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLO-TR-224 | | | | | | | |
| YOLO-TR-320 | | | | | | | |
| YOLO-TR-384 | | | | | | | |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv1-320 | | 25.4 | 41.5 | 26.0 | 4.2 | 25.0 | 49.8 |
| YOLOv1-416 | | 30.1 | 47.8 | 30.9 | 7.8 | 31.9 | 53.3 |
| YOLOv1-512 | | 33.1 | 52.2 | 34.0 | 10.8 | 35.9 | 54.9 |
| YOLOv1-640 | | 35.2 | 54.7 | 37.1 | 14.3 | 39.5 | 53.4 |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv2-320 | | 26.8 | 44.1 | 27.1 | 4.7 | 27.6 | 50.8 |
| YOLOv2-416 | | 31.6 | 50.3 | 32.4 | 9.1 | 33.8 | 54.0 |
| YOLOv2-512 | | 34.3 | 54.0 | 35.4 | 12.3 | 37.8 | 55.2 |
| YOLOv2-640 | | 36.3 | 56.6 | 37.7 | 15.1 | 41.1 | 54.0 |
Coming soon.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv3-320 | | | | | | | |
| YOLOv3-416 | | | | | | | |
| YOLOv3-512 | | | | | | | |
| YOLOv3-608 | | | | | | | |
| YOLOv3-640 | | | | | | | |
Coming soon.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv3-SPP-320 | | | | | | | |
| YOLOv3-SPP-416 | | | | | | | |
| YOLOv3-SPP-512 | | | | | | | |
| YOLOv3-SPP-608 | | | | | | | |
| YOLOv3-SPP-640 | | | | | | | |
The DilatedEncoder was proposed by YOLOF.
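A rough sketch of the YOLOF-style DilatedEncoder (a projection layer followed by residual bottlenecks with increasing dilation). The channel widths and dilation rates below are the commonly used defaults and are assumptions, not necessarily the exact settings in this repo:

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """Residual bottleneck with a dilated 3x3 conv in the middle."""
    def __init__(self, channels, mid_channels, dilation):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid_channels, 1),
            nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, 3,
                      padding=dilation, dilation=dilation),
            nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, channels, 1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.block(x)

class DilatedEncoder(nn.Module):
    """Project backbone features, then stack dilated residual blocks."""
    def __init__(self, in_channels=2048, out_channels=512, dilations=(2, 4, 6, 8)):
        super().__init__()
        self.projector = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 1),
            nn.BatchNorm2d(out_channels),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels),
        )
        self.blocks = nn.Sequential(
            *[Bottleneck(out_channels, out_channels // 4, d) for d in dilations]
        )

    def forward(self, x):
        return self.blocks(self.projector(x))
```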
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv3-320 | | 31.1 | 51.1 | 31.7 | 10.2 | 32.6 | 51.2 |
| YOLOv3-416 | | 35.0 | 56.1 | 36.3 | 14.6 | 37.4 | 53.7 |
| YOLOv3-512 | | 37.7 | 59.3 | 39.6 | 17.9 | 40.4 | 54.4 |
| YOLOv3-640 | | 38.7 | 60.2 | 40.7 | 21.3 | 41.7 | 51.7 |
Coming soon.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv4-SPP-320 | | | | | | | |
| YOLOv4-SPP-416 | | | | | | | |
| YOLOv4-SPP-512 | | | | | | | |
| YOLOv4-SPP-608 | | | | | | | |
| YOLOv4-SPP-640 | | | | | | | |
This is an experimental model. I am currently optimizing my YOLOv4 further with a better CSPDarkNet and better training strategies.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv4-320 | | 36.7 | 55.4 | 38.2 | 15.7 | 39.9 | 57.5 |
| YOLOv4-416 | | 39.2 | 58.6 | 41.4 | 20.1 | 43.3 | 56.8 |
| YOLOv4-512 | | 40.5 | 60.1 | 43.1 | 22.8 | 44.5 | 56.1 |
| YOLOv4-640 | | 40.5 | 60.4 | 43.5 | 24.2 | 44.8 | 52.0 |
I copied the download scripts from the following excellent project: https://github.com/amdegroot/ssd.pytorch
I have uploaded VOC2007 and VOC2012 to BaiduYunDisk, so researchers in China can download them from there:
Link: https://pan.baidu.com/s/1tYPGCYGyC0wjpC97H-zzMQ
Password: 4la9
You will get a `VOCdevkit.zip`; just unzip it and put it into `data/`. After that, the paths to the VOC datasets are `data/VOCdevkit/VOC2007` and `data/VOCdevkit/VOC2012`.
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>
Just run `sh data/scripts/COCO2017.sh`. You will get COCO train2017, val2017, and test2017.
For example:
python train.py --cuda \
-d coco \
-v yolov1 \
-ms \
--ema \
--batch_size 16 \
--root path/to/dataset/
You can run `python train.py -h` to check all optional arguments, or just run the shell file, for example:
sh train_yolov1.sh
If you have multiple GPUs, for example 8, and put 4 images on each GPU:
python -m torch.distributed.launch --nproc_per_node=8 train.py -d coco \
--cuda \
-v yolov1 \
-ms \
--ema \
-dist \
--sybn \
--num_gpu 8 \
--batch_size 4 \
--root path/to/dataset/
Note that `--batch_size` is the batch size per GPU, not the total across all GPUs. With 8 GPUs and `--batch_size 4`, the effective batch size is 32.
I have uploaded all training log files. For example, `1-v1.txt` contains all the output printed during the training of YOLOv1.
It is strongly recommended that you open the training shell file to check how I train each YOLO detector.
For example:
python test.py -d coco \
--cuda \
-v yolov1 \
--weight path/to/weight \
--img_size 640 \
--root path/to/dataset/ \
--show
For example:
python eval.py -d coco-val \
--cuda \
-v yolov1 \
--weight path/to/weight \
--img_size 640 \
--root path/to/dataset/
To run on COCO test-dev (make sure you have downloaded test2017 first):
python eval.py -d coco-test \
--cuda \
-v yolov1 \
--weight path/to/weight \
--img_size 640 \
--root path/to/dataset/
You will get a `coco_test-dev.json` file. Then, following the official requirements, compress it into zip format and upload it to the official evaluation server.
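The compression step itself is simple; for instance, with Python's zipfile (the final file names the server expects are defined by the official submission instructions, so treat the names below as placeholders):

```python
import zipfile

# pack coco_test-dev.json into a zip archive for upload to the COCO server;
# rename the files to match the format required by the submission guidelines
with zipfile.ZipFile("coco_test-dev.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("coco_test-dev.json")
```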