
FreeYOLOv2

English | 简体中文

Requirements

  • We recommend using Anaconda to create a conda environment:
conda create -n yolo python=3.6
  • Then, activate the environment:
conda activate yolo
  • Install the required packages:
pip install -r requirements.txt

My environment:

  • PyTorch = 1.9.1
  • Torchvision = 0.10.1

At a minimum, please make sure your PyTorch version is 1.x.
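
To quickly confirm the installed versions, a minimal Python check (assuming a standard PyTorch install) is:

import torch, torchvision  # should print versions matching the environment above
print(torch.__version__, torchvision.__version__)
print(torch.cuda.is_available())  # True if a usable GPU is visible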

Tricks

Training Configuration

| Configuration      | Value                      |
|--------------------|----------------------------|
| Per-GPU Batch Size | 16 (8 for FreeYOLOv2-Huge) |
| Init Lr            | 0.01                       |
| Warmup Scheduler   | Linear                     |
| Lr Scheduler       | Linear                     |
| Optimizer          | SGD                        |
| Multi-Scale Train  | True                       |
| Mosaic             | True                       |
| Mixup              | True                       |
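
As a rough illustration of the schedule in the table (linear warmup followed by linear decay), the learning rate at a given iteration can be computed as sketched below. This is only a sketch with an assumed final-LR ratio, not the repo's exact schedule:

# Sketch of linear warmup + linear decay (illustrative; final_ratio is an assumption).
def linear_warmup_then_decay(cur_iter, total_iters, warmup_iters,
                             init_lr=0.01, final_ratio=0.01):
    if cur_iter < warmup_iters:
        # linear warmup from 0 up to init_lr
        return init_lr * cur_iter / warmup_iters
    # linear decay from init_lr down to final_ratio * init_lr
    progress = (cur_iter - warmup_iters) / (total_iters - warmup_iters)
    return init_lr * (1.0 - progress * (1.0 - final_ratio))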

Experiments

COCO

  • Download COCO.
cd <FreeYOLOv2_HOME>
cd dataset/scripts/
sh COCO2017.sh
  • Check COCO
cd <FreeYOLOv2_HOME>
python dataset/coco.py
  • Train on COCO

For example:

python train.py --cuda -d coco -m yolo_free_v2_nano -bs 16 --max_epoch 300 --wp_epoch 3 --eval_epoch 10 --fp16 --ema --root path/to/COCO

P5-Model on COCO:

  • FreeYOLOv2
| Model             | Scale | Epoch | FPS (3090, FP32, bs=1) | APval (0.5:0.95) | FLOPs (G) | Params (M) | Weight |
|-------------------|-------|-------|------------------------|------------------|-----------|------------|--------|
| FreeYOLOv2-Pico   | 640   | 300   | -                      | 32.1             | 4.5       | 1.4        | ckpt   |
| FreeYOLOv2-Nano   | 640   | 300   | 100                    | 35.3             | 9.0       | 2.8        | ckpt   |
| FreeYOLOv2-Small  | 640   | 300   | 86                     | 43.3             | 33.3      | 8.8        | ckpt   |
| FreeYOLOv2-Medium | 640   | 300   | 71                     | 47.5             | 86.7      | 23.0       | ckpt   |
| FreeYOLOv2-Large  | 640   | 300   | 66                     | 49.4             | 175.4     | 46.5       | ckpt   |
  • All FLOPs are measured with a 640x640 input on COCO val2017. FPS is measured with batch size 1 on a 3090 GPU, timed from model inference through the NMS operation; see the measurement sketch after this list.

  • A larger batch size, such as 128 or 256 (I had to use 16), may further improve the performance of FreeYOLOv2-Medium and FreeYOLOv2-Large.

  • Increasing the number of training epochs to 500 may further improve the performance of FreeYOLOv2.

  • Due to my hardware limitations, I am unable to train the FreeYOLOv2-Huge model.
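
A minimal sketch of how such numbers can be measured, using the thop package for FLOPs/params and simple CUDA timing for FPS (the repo's own measurement code may differ; the stand-in model below is only for illustration):

import time
import torch
import torchvision
from thop import profile  # pip install thop

model = torchvision.models.resnet18().cuda().eval()  # stand-in; substitute a loaded FreeYOLOv2 model
x = torch.randn(1, 3, 640, 640).cuda()

# FLOPs / Params at 640x640, batch size 1
flops, params = profile(model, inputs=(x,))
print(f"FLOPs: {flops / 1e9:.1f} G, Params: {params / 1e6:.1f} M")

# FPS: average over repeated forward passes
torch.cuda.synchronize()
t0 = time.time()
with torch.no_grad():
    for _ in range(100):
        model(x)
torch.cuda.synchronize()
print(f"FPS: {100 / (time.time() - t0):.1f}")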

WiderFace

  • Download WiderFace and organize it as follows:
WiderFace
|_ WIDER_train
|  |_ images
|     |_ 0--Parade
|     |_ ...
|_ WIDER_val
|  |_ images
|     |_ 0--Parade
|     |_ ...
|_ wider_face_split
|_ eval_tools
  • Convert WiderFace to COCO format.
cd <FreeYOLOv2_HOME>
python tools/convert_widerface_to_coco.py --root path/to/WiderFace
  • Check WiderFace
cd <FreeYOLOv2_HOME>
python dataset/widerface.py
  • Train on WiderFace. For example:
python train.py --cuda -d widerface --root path/to/WiderFace -m yolo_free_v2_nano -bs 16 --max_epoch 100 --wp_epoch 1 --eval_epoch 10 --fp16 --ema --pretrained path/to/coco/yolo_free_v2_nano_coco.pth --mosaic 0.5 --mixup 0.0 --min_box_size 1

Main results on WiderFace-val:

| Model             | Scale | AP   | AP50 | Weight |
|-------------------|-------|------|------|--------|
| FreeYOLOv2-Pico   | 640   | 29.9 | 55.1 | ckpt   |
| FreeYOLOv2-Nano   | 640   | 31.3 | 56.9 | ckpt   |
| FreeYOLOv2-Small  | 640   | 33.5 | 60.1 | ckpt   |
| FreeYOLOv2-Medium | 640   | 34.8 | 61.7 | ckpt   |
| FreeYOLOv2-Large  | 640   | 35.5 | 62.6 | ckpt   |

CrowdHuman

  • Download CrowdHuman. The downloaded files are organized as follows:
CrowdHuman
|_ CrowdHuman_train01.zip
|_ CrowdHuman_train02.zip
|_ CrowdHuman_train03.zip
|_ CrowdHuman_val.zip
|_ annotation_train.odgt
|_ annotation_val.odgt
  • Unzip the images and prepare CrowdHuman as follows:
CrowdHuman
|_ CrowdHuman_train
|  |_ Images
|     |_ 273271,1a0d6000b9e1f5b7.jpg
|     |_ ...
|_ CrowdHuman_val
|  |_ Images
|     |_ 273271,1b9330008da38cd6.jpg
|     |_ ...
|_ annotation_train.odgt
|_ annotation_val.odgt
  • Convert CrowdHuman to COCO format (see the .odgt note after this list).
cd <FreeYOLOv2_HOME>
python tools/convert_crowdhuman_to_coco.py --root path/to/CrowdHuman
  • Check CrowdHuman
cd <FreeYOLOv2_HOME>
python dataset/crowdhuman.py
  • Train on CrowdHuman

For example:

python train.py --cuda -d crowdhuman -m yolo_free_v2_nano -bs 16 --max_epoch 100 --wp_epoch 1 --eval_epoch 10 --fp16 --ema --root path/to/CrowdHuman --pretrained path/to/coco/yolo_free_v2_nano.pth
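
For reference, the .odgt annotation files are JSON-lines: one JSON object per image. A minimal reader might look like the sketch below (field names follow the public CrowdHuman annotation format; verify them against your download):

import json

# Each line of an .odgt file is a standalone JSON object describing one image.
with open("annotation_val.odgt") as f:
    for line in f:
        record = json.loads(line)
        image_id = record["ID"]           # image file stem
        for box in record["gtboxes"]:
            if box["tag"] != "person":    # CrowdHuman also marks "mask" regions
                continue
            x, y, w, h = box["fbox"]      # full-body box; "vbox"/"hbox" are visible/head boxes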

Main results on CrowdHuman-val:

| Model             | Scale | AP   | MR   | JI   | Weight |
|-------------------|-------|------|------|------|--------|
| FreeYOLOv2-Pico   | 640   | 82.4 | 53.1 | 69.8 | ckpt   |
| FreeYOLOv2-Nano   | 640   | 84.1 | 50.3 | 71.8 | ckpt   |
| FreeYOLOv2-Small  | 640   | 86.5 | 45.6 | 75.0 | ckpt   |
| FreeYOLOv2-Medium | 640   | 87.5 | 43.4 | 76.2 | ckpt   |
| FreeYOLOv2-Large  | 640   | 88.2 | 42.1 | 76.8 | ckpt   |

Train

Single GPU

sh train.sh

You can modify the configuration in train.sh to suit your own setup.

You can also add --vis_tgt to visualize the images and targets during training. For example:

python train.py --cuda -d coco --root path/to/coco -m yolo_free_v2_large --vis_tgt


Multiple GPUs

sh train_ddp.sh

You can modify the configuration in train_ddp.sh to suit your own setup.

If training is interrupted, you can pass the path of the latest checkpoint to --resume (None by default) to resume training. For example:

python train.py \
        --cuda \
        -d coco \
        -m yolo_free_v2_large \
        -bs 16 \
        --max_epoch 300 \
        --wp_epoch 3 \
        --eval_epoch 10 \
        --ema \
        --fp16 \
        --resume weights/coco/yolo_free_v2_large/yolo_free_v2_large_epoch_151_39.24.pth

Training will then continue from epoch 151.
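
Under the hood, resuming typically restores the model weights, the optimizer state, and the last finished epoch from the checkpoint. A generic sketch (the key names are assumptions for illustration, not necessarily what this repo's train.py stores):

import torch

# Generic resume logic; "model", "optimizer" and "epoch" keys are assumed.
checkpoint = torch.load("weights/coco/yolo_free_v2_large/checkpoint.pth", map_location="cpu")
model.load_state_dict(checkpoint["model"])          # model must already be built
optimizer.load_state_dict(checkpoint["optimizer"])  # optimizer must already be built
start_epoch = checkpoint["epoch"] + 1               # continue from the next epoch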

Test

python test.py -d coco \
               --cuda \
               -m yolo_free_v2_large \
               --img_size 640 \
               --weight path/to/weight \
               --root path/to/dataset/ \
               --show

Evaluation

python eval.py -d coco-val \
               --cuda \
               -m yolo_free_v2_large \
               --img_size 640 \
               --weight path/to/weight \
               --root path/to/dataset/ \
               --show

Demo

I have provided some images in data/demo/images/, so you can run the following command to try a demo (-nc sets the number of classes):

python demo.py --mode image \
               --path_to_img data/demo/images/ \
               -m yolo_free_v2_large \
               --img_size 640 \
               -nc 80 \
               --cuda \
               --weight path/to/weight

If you want to run a demo of streaming video detection, set --mode to video and give the path to the video via --path_to_vid:

python demo.py --mode video \
               --path_to_img data/demo/videos/your_video \
               -m yolo_free_v2_large \
               --img_size 640 \
               -nc 80 \
               --cuda \
               --weight path/to/weight

If you want to run video detection with your camera, set --mode to camera:

python demo.py --mode camera \
               -m yolo_free_v2_large \
               --img_size 640 \
               -nc 80 \
               --cuda \
               --weight path/to/weight

Train on Our Own Dataset

Besides the popular benchmark datasets, we can also train the model on our own dataset. To achieve this, you should follow these steps:

  • Step-1: Prepare the images (JPG/JPEG/PNG ...) and use labelImg to create XML-format annotation files, organized as follows (a minimal XML parsing sketch follows the tree):
OurDataset
|_ train
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|_ val
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|  ...
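
labelImg writes Pascal VOC-style XML annotations; a minimal sketch for reading one file (standard VOC fields) looks like:

import xml.etree.ElementTree as ET

# Parse one labelImg (Pascal VOC style) annotation file.
tree = ET.parse("OurDataset/train/annotations/0.xml")
for obj in tree.getroot().iter("object"):
    name = obj.find("name").text  # class label, e.g. 'cat'
    box = obj.find("bndbox")
    xmin, ymin, xmax, ymax = (int(float(box.find(k).text))
                              for k in ("xmin", "ymin", "xmax", "ymax"))
    print(name, (xmin, ymin, xmax, ymax))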
  • Step-2: Convert our dataset to the COCO format.
cd <FreeYOLOv2_HOME>
cd tools
# convert train split
python convert_ours_to_coco.py --root path/to/Dataset/ --split train
# convert val split
python convert_ours_to_coco.py --root path/to/Dataset/ --split val

Then, we can get a train.json file and a val.json file, as shown below; their layout is sketched after the tree.

Dataset
|_ train
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|     |_ train.json
|_ val
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|     |_ val.json
|  ...
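
The generated train.json and val.json follow the standard COCO annotation layout; schematically (all values below are illustrative):

# Standard COCO-format layout; the values are illustrative only.
coco_layout = {
    "images": [
        {"id": 0, "file_name": "0.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 1,
         "bbox": [48.0, 96.0, 120.0, 80.0],  # [x, y, width, height], top-left origin
         "area": 9600.0, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "cat"},
    ],
}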
  • Step-3: Define our class labels.

Please open the dataset/ourdataset.py file and change our_class_labels = ('cat',) according to our own category definitions.
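
For example, a three-class dataset (hypothetical classes) would be defined as:

# in dataset/ourdataset.py
our_class_labels = ('cat', 'dog', 'person')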

  • Step-4: Check the converted dataset.
cd <FreeYOLOv2_HOME>
cd dataset
# check train split
python ourdataset.py --root path/to/Dataset/ --split train
# check val split
python ourdataset.py --root path/to/Dataset/ --split val
  • Step-5: Train.

For example:

cd <FreeYOLOv2_HOME>
python train.py --root path/to/Dataset/ -d ourdataset -m yolo_free_v2_nano -bs 16 --max_epoch 100 --wp_epoch 1 --eval_epoch 5 -p path/to/yolo_free_v2_nano_coco.pth
  • Step-6: Test.

For example:

cd <FreeYOLOv2_HOME>
python test.py --root path/to/Dataset/ -d ourdataset -m yolo_free_v2_nano --weight path/to/checkpoint --show
  • Step-7: Eval.

For example:

cd <FreeYOLOv2_HOME>
python eval.py --root path/to/Dataset/ -d ourdataset -m yolo_free_v2_nano --weight path/to/checkpoint

Tracking

Our project also supports multi-object tracking. Following the "tracking-by-detection" framework, we use this project's YOLO as the detector and the simple, efficient ByteTrack as the tracker.
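
Conceptually, each frame is run through the detector and the resulting boxes are handed to the tracker, which associates them across frames and assigns persistent IDs. A schematic loop (the detector/tracker API below is an assumption for illustration, not this repo's exact interface):

# Schematic tracking-by-detection loop; detector/tracker APIs are illustrative.
for frame in video_frames:
    detections = detector(frame)         # e.g. [[x1, y1, x2, y2, score], ...]
    tracks = tracker.update(detections)  # ByteTrack matches boxes to existing tracks
    for t in tracks:
        draw_box(frame, t.box, track_id=t.id)  # persistent ID per object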

  • Image tracking
python track.py --mode image \
                --path_to_img path/to/images/ \
                --cuda \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif
  • Video tracking
python track.py --mode video \
                --path_to_img path/to/video/ \
                --cuda \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif
  • Camera tracking
python track.py --mode camera \
                --cuda \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif

Tracking visualization

  • Detector: FreeYOLOv2-Nano (pretrained on COCO)
  • Tracker: ByteTracker
  • Device: i5-12500H CPU

Command:

python track.py --mode video \
                --path_to_img ./dataset/demo/videos/000006.mp4 \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif

If a GPU is available, add the --cuda flag to the command above to run detection on the GPU.


Deployment

  1. ONNX export and ONNXRuntime inference (a generic sketch follows)
  2. OpenVINO in C++ and Python
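
A generic PyTorch-to-ONNX export plus ONNXRuntime inference sketch (the repo ships its own deployment scripts; the model variable and file names below are placeholders):

import numpy as np
import torch
import onnxruntime as ort

# Export: "model" stands for a loaded FreeYOLOv2 in eval mode (placeholder here).
dummy = torch.randn(1, 3, 640, 640)
torch.onnx.export(model, dummy, "yolo_free_v2.onnx",
                  input_names=["input"], output_names=["output"], opset_version=11)

# Inference with ONNXRuntime on CPU
session = ort.InferenceSession("yolo_free_v2.onnx", providers=["CPUExecutionProvider"])
outputs = session.run(None, {"input": np.random.rand(1, 3, 640, 640).astype(np.float32)})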