I have always used the batch size to scale the loss:
loss = loss.sum() / batch_size
However, I recently realized that this is not optimal. To address this, I now scale the loss by the total number of positive samples:
loss = loss.sum() / num_pos
After this change, some tricks that I had tried before without success started to work, so I am applying them to make my YOLO models better. Once these optimizations are complete, I will upload the latest weight files immediately.
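A minimal sketch of the difference between the two normalization schemes (the tensor names here are illustrative, not the exact variables in my loss code):

```python
import torch

def normalize_loss(per_sample_loss, pos_mask, batch_size):
    # per_sample_loss: un-reduced loss values, one per prediction
    # pos_mask: boolean tensor marking the positive samples
    total = per_sample_loss.sum()

    # old scheme: normalize by the batch size
    loss_by_batch = total / batch_size

    # new scheme: normalize by the number of positive samples,
    # so images with many objects do not dominate the gradient scale
    num_pos = pos_mask.float().sum().clamp(min=1.0)  # avoid division by zero
    loss_by_pos = total / num_pos

    return loss_by_batch, loss_by_pos
```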
Recently, I rebuilt my YOLO-Family project!
- We recommend using Anaconda to create a conda environment:
conda create -n yolo python=3.6
- Then, activate the environment:
conda activate yolo
- Requirements:
pip install -r requirements.txt
PyTorch >= 1.1.0 and Torchvision >= 0.3.0
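You can quickly verify that the installed versions satisfy the requirement (a trivial sanity check, not part of the repo):

```python
import torch
import torchvision

# PyTorch >= 1.1.0 and Torchvision >= 0.3.0 are expected
print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```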
My better YOLO family
In this project, you can enjoy:
- a new and stronger YOLOv1
- a new and stronger YOLOv2
- YOLOv3 with DilatedEncoder
- YOLOv4 (I'm trying to make it better)
- YOLO-Tiny
- YOLO-Nano
- Try to make my YOLOv4 better.
- Train my YOLOv1/YOLOv2 with ViT-Base (pretrained by Masked Autoencoder).
You can download all weights, including my DarkNet-53, CSPDarkNet-53, MAE-ViT, and YOLO weights, from the following links.
Link: Hold on ...
Link:https://pan.baidu.com/s/1Cin9R52wfubD4xZUHHCRjg
Password:aigz
Tricks in this project:
- Augmentations: Flip + Color jitter + RandomCrop + Multi-scale
- Model EMA
- GIoU (see the sketch after this list)
- Mosaic Augmentation for my YOLOv4
- Multiple positive samples for my YOLOv4
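Below is a minimal, self-contained sketch of the GIoU computation that is typically used as a regression loss (1 - GIoU); it illustrates the idea and is not necessarily identical to the implementation in this repo:

```python
import torch

def giou(boxes1, boxes2):
    """Generalized IoU between paired boxes in (x1, y1, x2, y2) format.

    boxes1 and boxes2 must have the same shape [N, 4].
    """
    # areas of each box
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])

    # intersection
    lt = torch.max(boxes1[:, :2], boxes2[:, :2])
    rb = torch.min(boxes1[:, 2:], boxes2[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]

    union = area1 + area2 - inter
    iou = inter / union.clamp(min=1e-7)

    # smallest enclosing box
    lt_c = torch.min(boxes1[:, :2], boxes2[:, :2])
    rb_c = torch.max(boxes1[:, 2:], boxes2[:, 2:])
    wh_c = (rb_c - lt_c).clamp(min=0)
    area_c = wh_c[:, 0] * wh_c[:, 1]

    # GIoU = IoU - (C - union) / C; the loss is then 1 - GIoU
    return iou - (area_c - union) / area_c.clamp(min=1e-7)
```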
Results on the COCO validation set:
| Model | Backbone | Size | FPS | AP | AP50 | AP75 | APs | APm | APl | GFLOPs | Params |
|-------|----------|------|-----|----|------|------|-----|-----|-----|--------|--------|
| YOLO-Nano | ShuffleNetv2-1.0x | 512 | | 21.6 | 40.0 | 20.5 | 7.4 | 22.7 | 32.3 | 1.65 | 1.86M |
| YOLO-Tiny | CSPDarkNet-Tiny | 512 | | 26.6 | 46.1 | 26.9 | 13.5 | 30.0 | 35.0 | 5.52 | 7.66M |
| YOLO-TR | ViT-B | 384 | | | | | | | | | |
| YOLOv1 | ResNet50 | 640 | | 35.2 | 54.7 | 37.1 | 14.3 | 39.5 | 53.4 | 41.96 | 44.54M |
| YOLOv2 | ResNet50 | 640 | | 36.3 | 56.6 | 37.7 | 15.1 | 41.1 | 54.0 | 42.10 | 44.89M |
| YOLOv3-DE | DarkNet53 | 640 | | 38.7 | 60.2 | 40.7 | 21.3 | 41.7 | 51.7 | 76.41 | 57.25M |
| YOLOv4 | CSPDarkNet53 | 640 | | 40.5 | 60.4 | 43.5 | 24.2 | 44.8 | 52.0 | 60.55 | 52.00M |
The FPS of all YOLO detectors is measured on a single 2080Ti GPU with 640 × 640 input size.
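For reference, FPS can be measured roughly as follows (a sketch with a dummy input and warm-up; the exact measurement script in this repo may differ):

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, img_size=640, num_iters=100, device="cuda"):
    model = model.to(device).eval()
    x = torch.randn(1, 3, img_size, img_size, device=device)
    for _ in range(10):                 # warm-up iterations
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(num_iters):
        model(x)
    torch.cuda.synchronize()
    return num_iters / (time.time() - start)
```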
I will upload some visualization results later.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl | GFLOPs | Params |
|-------|-----|----|------|------|-----|-----|-----|--------|--------|
| YOLO-Nano-320 | | 17.2 | 32.9 | 15.8 | 3.5 | 15.7 | 31.4 | 0.64 | 1.86M |
| YOLO-Nano-416 | | 20.2 | 37.7 | 19.3 | 5.5 | 19.7 | 33.5 | 1.09 | 1.86M |
| YOLO-Nano-512 | | 21.6 | 40.0 | 20.5 | 7.4 | 22.7 | 32.3 | 1.65 | 1.86M |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl | GFLOPs | Params |
|-------|-----|----|------|------|-----|-----|-----|--------|--------|
| YOLO-Tiny-320 | | 24.5 | 42.4 | 24.5 | 8.9 | 26.1 | 38.8 | 2.16 | 7.66M |
| YOLO-Tiny-416 | | 25.7 | 44.4 | 25.9 | 11.7 | 27.8 | 36.7 | 3.64 | 7.66M |
| YOLO-Tiny-512 | | 26.6 | 46.1 | 26.9 | 13.5 | 30.0 | 35.0 | 5.52 | 7.66M |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLO-TR-224 | | | | | | | |
| YOLO-TR-320 | | | | | | | |
| YOLO-TR-384 | | | | | | | |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv1-320 | | 25.4 | 41.5 | 26.0 | 4.2 | 25.0 | 49.8 |
| YOLOv1-416 | | 30.1 | 47.8 | 30.9 | 7.8 | 31.9 | 53.3 |
| YOLOv1-512 | | 33.1 | 52.2 | 34.0 | 10.8 | 35.9 | 54.9 |
| YOLOv1-640 | | 35.2 | 54.7 | 37.1 | 14.3 | 39.5 | 53.4 |
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv2-320 | | 26.8 | 44.1 | 27.1 | 4.7 | 27.6 | 50.8 |
| YOLOv2-416 | | 31.6 | 50.3 | 32.4 | 9.1 | 33.8 | 54.0 |
| YOLOv2-512 | | 34.3 | 54.0 | 35.4 | 12.3 | 37.8 | 55.2 |
| YOLOv2-640 | | 36.3 | 56.6 | 37.7 | 15.1 | 41.1 | 54.0 |
Coming soon.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv3-320 | | | | | | | |
| YOLOv3-416 | | | | | | | |
| YOLOv3-512 | | | | | | | |
| YOLOv3-608 | | | | | | | |
| YOLOv3-640 | | | | | | | |
Coming soon.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv3-SPP-320 | | | | | | | |
| YOLOv3-SPP-416 | | | | | | | |
| YOLOv3-SPP-512 | | | | | | | |
| YOLOv3-SPP-608 | | | | | | | |
| YOLOv3-SPP-640 | | | | | | | |
The DilatedEncoder was proposed by YOLOF.
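A rough sketch of the YOLOF-style DilatedEncoder (a projection layer followed by residual bottlenecks with increasing dilation). The channel widths and dilation rates below are the commonly used defaults and are assumptions, not necessarily the exact settings in this repo:

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """Residual bottleneck with a dilated 3x3 conv in the middle."""
    def __init__(self, channels, mid_channels, dilation):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid_channels, 1),
            nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, 3,
                      padding=dilation, dilation=dilation),
            nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, channels, 1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.block(x)

class DilatedEncoder(nn.Module):
    """Project backbone features, then stack dilated residual blocks."""
    def __init__(self, in_channels=2048, out_channels=512, dilations=(2, 4, 6, 8)):
        super().__init__()
        self.projector = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 1),
            nn.BatchNorm2d(out_channels),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels),
        )
        self.blocks = nn.Sequential(
            *[Bottleneck(out_channels, out_channels // 4, d) for d in dilations]
        )

    def forward(self, x):
        return self.blocks(self.projector(x))
```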
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv3-320 | | 31.1 | 51.1 | 31.7 | 10.2 | 32.6 | 51.2 |
| YOLOv3-416 | | 35.0 | 56.1 | 36.3 | 14.6 | 37.4 | 53.7 |
| YOLOv3-512 | | 37.7 | 59.3 | 39.6 | 17.9 | 40.4 | 54.4 |
| YOLOv3-640 | | 38.7 | 60.2 | 40.7 | 21.3 | 41.7 | 51.7 |
Coming soon.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv4-SPP-320 | | | | | | | |
| YOLOv4-SPP-416 | | | | | | | |
| YOLOv4-SPP-512 | | | | | | | |
| YOLOv4-SPP-608 | | | | | | | |
| YOLOv4-SPP-640 | | | | | | | |
This is an experimental model. I am currently optimizing my YOLOv4 further with a better CSPDarkNet and better training strategies.
| Model | FPS | AP | AP50 | AP75 | APs | APm | APl |
|-------|-----|----|------|------|-----|-----|-----|
| YOLOv4-320 | | 36.7 | 55.4 | 38.2 | 15.7 | 39.9 | 57.5 |
| YOLOv4-416 | | 39.2 | 58.6 | 41.4 | 20.1 | 43.3 | 56.8 |
| YOLOv4-512 | | 40.5 | 60.1 | 43.1 | 22.8 | 44.5 | 56.1 |
| YOLOv4-640 | | 40.5 | 60.4 | 43.5 | 24.2 | 44.8 | 52.0 |
I copied the download scripts from the following excellent project: https://github.com/amdegroot/ssd.pytorch
I have uploaded VOC2007 and VOC2012 to BaiduYunDisk, so researchers in China can download them from there:
Link: https://pan.baidu.com/s/1tYPGCYGyC0wjpC97H-zzMQ
Password: 4la9
You will get a `VOCdevkit.zip`; just unzip it and put it into `data/`. After that, the paths to the VOC datasets are `data/VOCdevkit/VOC2007` and `data/VOCdevkit/VOC2012`.
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>
Just run `sh data/scripts/COCO2017.sh`. You will get COCO train2017, val2017, and test2017.
For example:
python train.py --cuda \
-d coco \
-v yolov1 \
-ms \
--ema \
--batch_size 16 \
--root path/to/dataset/
You can run `python train.py -h` to check all optional arguments, or just run the shell file, for example:
sh train_yolov1.sh
If you have multiple GPUs, for example 8, and put 4 images on each GPU:
python -m torch.distributed.launch --nproc_per_node=8 train.py -d coco \
--cuda \
-v yolov1 \
-ms \
--ema \
-dist \
--sybn \
--num_gpu 8 \
--batch_size 4 \
--root path/to/dataset/
Note that `--batch_size` is the batch size per GPU, not the total across all GPUs. With 8 GPUs and `--batch_size 4`, the effective batch size is 32.
I have uploaded all training log files. For example, `1-v1.txt` contains all the output printed during the training of YOLOv1.
It is strongly recommended that you open the training shell file to check how I train each YOLO detector.
For example:
python test.py -d coco \
--cuda \
-v yolov1 \
--weight path/to/weight \
--img_size 640 \
--root path/to/dataset/ \
--show
For example:
python eval.py -d coco-val \
--cuda \
-v yolov1 \
--weight path/to/weight \
--img_size 640 \
--root path/to/dataset/
To run on COCO test-dev (make sure you have downloaded test2017 first):
python eval.py -d coco-test \
--cuda \
-v yolov1 \
--weight path/to/weight \
--img_size 640 \
--root path/to/dataset/
You will get a `coco_test-dev.json` file. Then, following the official requirements, compress it into zip format and upload it to the official evaluation server.
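The compression step itself is simple; for instance, with Python's zipfile (the final file names the server expects are defined by the official submission instructions, so treat the names below as placeholders):

```python
import zipfile

# pack coco_test-dev.json into a zip archive for upload to the COCO server;
# rename the files to match the format required by the submission guidelines
with zipfile.ZipFile("coco_test-dev.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("coco_test-dev.json")
```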