
FreeYOLOv2

English | 简体中文

Requirements

  • We recommend using Anaconda to create a conda environment:
conda create -n yolo python=3.6
  • Then, activate the environment:
conda activate yolo
  • Install the required packages:
pip install -r requirements.txt

My environment:

  • PyTorch = 1.9.1
  • Torchvision = 0.10.1

At a minimum, please make sure your PyTorch version is 1.x.
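
To quickly confirm the installed versions, a minimal Python check (assuming a standard PyTorch install) is:

import torch, torchvision  # should print versions matching the environment above
print(torch.__version__, torchvision.__version__)
print(torch.cuda.is_available())  # True if a usable GPU is visible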

Tricks

Training Configuration

| Configuration      | Value                      |
|--------------------|----------------------------|
| Per-GPU Batch Size | 16 (8 for FreeYOLOv2-Huge) |
| Init Lr            | 0.01                       |
| Warmup Scheduler   | Linear                     |
| Lr Scheduler       | Linear                     |
| Optimizer          | SGD                        |
| Multi-Scale Train  | True                       |
| Mosaic             | True                       |
| Mixup              | True                       |
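
As a rough illustration of the schedule in the table (linear warmup followed by linear decay), the learning rate at a given iteration can be computed as sketched below. This is only a sketch with an assumed final-LR ratio, not the repo's exact schedule:

# Sketch of linear warmup + linear decay (illustrative; final_ratio is an assumption).
def linear_warmup_then_decay(cur_iter, total_iters, warmup_iters,
                             init_lr=0.01, final_ratio=0.01):
    if cur_iter < warmup_iters:
        # linear warmup from 0 up to init_lr
        return init_lr * cur_iter / warmup_iters
    # linear decay from init_lr down to final_ratio * init_lr
    progress = (cur_iter - warmup_iters) / (total_iters - warmup_iters)
    return init_lr * (1.0 - progress * (1.0 - final_ratio))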

Experiments

COCO

  • Download COCO.
cd <FreeYOLOv2_HOME>
cd dataset/scripts/
sh COCO2017.sh
  • Check COCO
cd <FreeYOLOv2_HOME>
python dataset/coco.py
  • Train on COCO

For example:

python train.py --cuda -d coco -m yolo_free_v2_nano -bs 16 --max_epoch 300 --wp_epoch 3 --eval_epoch 10 --fp16 --ema --root path/to/COCO

P5-Model on COCO:

  • FreeYOLOv2
| Model             | Scale | Epoch | FPS (3090, FP32, bs=1) | APval (0.5:0.95) | FLOPs (G) | Params (M) | Weight |
|-------------------|-------|-------|------------------------|------------------|-----------|------------|--------|
| FreeYOLOv2-Pico   | 640   | 300   | -                      | 32.1             | 4.5       | 1.4        | ckpt   |
| FreeYOLOv2-Nano   | 640   | 300   | 100                    | 35.3             | 9.0       | 2.8        | ckpt   |
| FreeYOLOv2-Small  | 640   | 300   | 86                     | 43.3             | 33.3      | 8.8        | ckpt   |
| FreeYOLOv2-Medium | 640   | 300   | 71                     | 47.5             | 86.7      | 23.0       | ckpt   |
| FreeYOLOv2-Large  | 640   | 300   | 66                     | 49.4             | 175.4     | 46.5       | ckpt   |
  • All FLOPs are measured with a 640x640 input on COCO val2017. FPS is measured with batch size 1 on a 3090 GPU, timed from model inference through the NMS operation; see the measurement sketch after this list.

  • A larger batch size, such as 128 or 256 (I had to use 16), may further improve the performance of FreeYOLOv2-Medium and FreeYOLOv2-Large.

  • Increasing the number of training epochs to 500 may further improve the performance of FreeYOLOv2.

  • Due to my hardware limitations, I am unable to train the FreeYOLOv2-Huge model.
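
A minimal sketch of how such numbers can be measured, using the thop package for FLOPs/params and simple CUDA timing for FPS (the repo's own measurement code may differ; the stand-in model below is only for illustration):

import time
import torch
import torchvision
from thop import profile  # pip install thop

model = torchvision.models.resnet18().cuda().eval()  # stand-in; substitute a loaded FreeYOLOv2 model
x = torch.randn(1, 3, 640, 640).cuda()

# FLOPs / Params at 640x640, batch size 1
flops, params = profile(model, inputs=(x,))
print(f"FLOPs: {flops / 1e9:.1f} G, Params: {params / 1e6:.1f} M")

# FPS: average over repeated forward passes
torch.cuda.synchronize()
t0 = time.time()
with torch.no_grad():
    for _ in range(100):
        model(x)
torch.cuda.synchronize()
print(f"FPS: {100 / (time.time() - t0):.1f}")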

WiderFace

  • Download WiderFace and organize it as follows:
WiderFace
|_ WIDER_train
|  |_ images
|     |_ 0--Parade
|     |_ ...
|_ WIDER_val
|  |_ images
|     |_ 0--Parade
|     |_ ...
|_ wider_face_split
|_ eval_tools
  • Convert WiderFace to COCO format.
cd <FreeYOLOv2_HOME>
python tools/convert_widerface_to_coco.py --root path/to/WiderFace
  • Check WiderFace
cd <FreeYOLOv2_HOME>
python dataset/widerface.py
  • Train on WiderFace. For example:
python train.py --cuda -d widerface --root path/to/WiderFace -m yolo_free_v2_nano -bs 16 --max_epoch 100 --wp_epoch 1 --eval_epoch 10 --fp16 --ema --pretrained path/to/coco/yolo_free_v2_nano_coco.pth --mosaic 0.5 --mixup 0.0 --min_box_size 1

Main results on WiderFace-val:

| Model             | Scale | AP   | AP50 | Weight |
|-------------------|-------|------|------|--------|
| FreeYOLOv2-Pico   | 640   | 29.9 | 55.1 | ckpt   |
| FreeYOLOv2-Nano   | 640   | 31.3 | 56.9 | ckpt   |
| FreeYOLOv2-Small  | 640   | 33.5 | 60.1 | ckpt   |
| FreeYOLOv2-Medium | 640   | 34.8 | 61.7 | ckpt   |
| FreeYOLOv2-Large  | 640   | 35.5 | 62.6 | ckpt   |

CrowdHuman

  • Download CrowdHuman. The downloaded files are organized as follows:
CrowdHuman
|_ CrowdHuman_train01.zip
|_ CrowdHuman_train02.zip
|_ CrowdHuman_train03.zip
|_ CrowdHuman_val.zip
|_ annotation_train.odgt
|_ annotation_val.odgt
  • Unzip the images and prepare CrowdHuman as follows:
CrowdHuman
|_ CrowdHuman_train
|  |_ Images
|     |_ 273271,1a0d6000b9e1f5b7.jpg
|     |_ ...
|_ CrowdHuman_val
|  |_ Images
|     |_ 273271,1b9330008da38cd6.jpg
|     |_ ...
|_ annotation_train.odgt
|_ annotation_val.odgt
  • Convert CrowdHuman to COCO format (see the .odgt note after this list).
cd <FreeYOLOv2_HOME>
python tools/convert_crowdhuman_to_coco.py --root path/to/CrowdHuman
  • Check CrowdHuman
cd <FreeYOLOv2_HOME>
python dataset/crowdhuman.py
  • Train on CrowdHuman

For example:

python train.py --cuda -d crowdhuman -m yolo_free_v2_nano -bs 16 --max_epoch 100 --wp_epoch 1 --eval_epoch 10 --fp16 --ema --root path/to/CrowdHuman --pretrained path/to/coco/yolo_free_v2_nano.pth
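
For reference, the .odgt annotation files are JSON-lines: one JSON object per image. A minimal reader might look like the sketch below (field names follow the public CrowdHuman annotation format; verify them against your download):

import json

# Each line of an .odgt file is a standalone JSON object describing one image.
with open("annotation_val.odgt") as f:
    for line in f:
        record = json.loads(line)
        image_id = record["ID"]           # image file stem
        for box in record["gtboxes"]:
            if box["tag"] != "person":    # CrowdHuman also marks "mask" regions
                continue
            x, y, w, h = box["fbox"]      # full-body box; "vbox"/"hbox" are visible/head boxes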

Main results on CrowdHuman-val:

| Model             | Scale | AP   | MR   | JI   | Weight |
|-------------------|-------|------|------|------|--------|
| FreeYOLOv2-Pico   | 640   | 82.4 | 53.1 | 69.8 | ckpt   |
| FreeYOLOv2-Nano   | 640   | 84.1 | 50.3 | 71.8 | ckpt   |
| FreeYOLOv2-Small  | 640   | 86.5 | 45.6 | 75.0 | ckpt   |
| FreeYOLOv2-Medium | 640   | 87.5 | 43.4 | 76.2 | ckpt   |
| FreeYOLOv2-Large  | 640   | 88.2 | 42.1 | 76.8 | ckpt   |

Train

Single GPU

sh train.sh

You can modify the configuration in train.sh to suit your own setup.

You can also add --vis_tgt to visualize the images and targets during training. For example:

python train.py --cuda -d coco --root path/to/coco -m yolo_free_v2_large --vis_tgt


Multiple GPUs

sh train_ddp.sh

You can modify the configuration in train_ddp.sh to suit your own setup.

If training is interrupted, you can pass the path of the latest checkpoint to --resume (None by default) to resume training. For example:

python train.py \
        --cuda \
        -d coco \
        -m yolo_free_v2_large \
        -bs 16 \
        --max_epoch 300 \
        --wp_epoch 3 \
        --eval_epoch 10 \
        --ema \
        --fp16 \
        --resume weights/coco/yolo_free_v2_large/yolo_free_v2_large_epoch_151_39.24.pth

Training will then continue from epoch 151.
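
Under the hood, resuming typically restores the model weights, the optimizer state, and the last finished epoch from the checkpoint. A generic sketch (the key names are assumptions for illustration, not necessarily what this repo's train.py stores):

import torch

# Generic resume logic; "model", "optimizer" and "epoch" keys are assumed.
checkpoint = torch.load("weights/coco/yolo_free_v2_large/checkpoint.pth", map_location="cpu")
model.load_state_dict(checkpoint["model"])          # model must already be built
optimizer.load_state_dict(checkpoint["optimizer"])  # optimizer must already be built
start_epoch = checkpoint["epoch"] + 1               # continue from the next epoch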

Test

python test.py -d coco \
               --cuda \
               -m yolo_free_v2_large \
               --img_size 640 \
               --weight path/to/weight \
               --root path/to/dataset/ \
               --show

Evaluation

python eval.py -d coco-val \
               --cuda \
               -m yolo_free_v2_large \
               --img_size 640 \
               --weight path/to/weight \
               --root path/to/dataset/ \
               --show

Demo

I have provided some images in data/demo/images/, so you can run the following command to try a demo (-nc sets the number of classes):

python demo.py --mode image \
               --path_to_img data/demo/images/ \
               -m yolo_free_v2_large \
               --img_size 640 \
               -nc 80 \
               --cuda \
               --weight path/to/weight

If you want to run a demo of streaming video detection, set --mode to video and give the path to the video via --path_to_vid:

python demo.py --mode video \
               --path_to_img data/demo/videos/your_video \
               -m yolo_free_v2_large \
               --img_size 640 \
               -nc 80 \
               --cuda \
               --weight path/to/weight

If you want to run video detection with your camera, set --mode to camera:

python demo.py --mode camera \
               -m yolo_free_v2_large \
               --img_size 640 \
               -nc 80 \
               --cuda \
               --weight path/to/weight

Train on Our Own Dataset

Besides the popular benchmark datasets, we can also train the model on our own dataset. To achieve this, you should follow these steps:

  • Step-1: Prepare the images (JPG/JPEG/PNG ...) and use labelImg to create XML-format annotation files, organized as follows (a minimal XML parsing sketch follows the tree):
OurDataset
|_ train
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|_ val
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|  ...
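
labelImg writes Pascal VOC-style XML annotations; a minimal sketch for reading one file (standard VOC fields) looks like:

import xml.etree.ElementTree as ET

# Parse one labelImg (Pascal VOC style) annotation file.
tree = ET.parse("OurDataset/train/annotations/0.xml")
for obj in tree.getroot().iter("object"):
    name = obj.find("name").text  # class label, e.g. 'cat'
    box = obj.find("bndbox")
    xmin, ymin, xmax, ymax = (int(float(box.find(k).text))
                              for k in ("xmin", "ymin", "xmax", "ymax"))
    print(name, (xmin, ymin, xmax, ymax))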
  • Step-2: Convert our dataset to the COCO format.
cd <FreeYOLOv2_HOME>
cd tools
# convert train split
python convert_ours_to_coco.py --root path/to/Dataset/ --split train
# convert val split
python convert_ours_to_coco.py --root path/to/Dataset/ --split val

Then, we can get a train.json file and a val.json file, as shown below; their layout is sketched after the tree.

Dataset
|_ train
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|     |_ train.json
|_ val
|  |_ images     
|     |_ 0.jpg
|     |_ 1.jpg
|     |_ ...
|  |_ annotations
|     |_ 0.xml
|     |_ 1.xml
|     |_ ...
|     |_ val.json
|  ...
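
The generated train.json and val.json follow the standard COCO annotation layout; schematically (all values below are illustrative):

# Standard COCO-format layout; the values are illustrative only.
coco_layout = {
    "images": [
        {"id": 0, "file_name": "0.jpg", "height": 480, "width": 640},
    ],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 1,
         "bbox": [48.0, 96.0, 120.0, 80.0],  # [x, y, width, height], top-left origin
         "area": 9600.0, "iscrowd": 0},
    ],
    "categories": [
        {"id": 1, "name": "cat"},
    ],
}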
  • Step-3: Define our class labels.

Please open the dataset/ourdataset.py file and change our_class_labels = ('cat',) according to our own category definitions.
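
For example, a three-class dataset (hypothetical classes) would be defined as:

# in dataset/ourdataset.py
our_class_labels = ('cat', 'dog', 'person')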

  • Step-4: Check the converted dataset.
cd <FreeYOLOv2_HOME>
cd dataset
# check train split
python ourdataset.py --root path/to/Dataset/ --split train
# check val split
python ourdataset.py --root path/to/Dataset/ --split val
  • Step-5: Train.

For example:

cd <FreeYOLOv2_HOME>
python train.py --root path/to/Dataset/ -d ourdataset -m yolo_free_v2_nano -bs 16 --max_epoch 100 --wp_epoch 1 --eval_epoch 5 -p path/to/yolo_free_v2_nano_coco.pth
  • Step-6: Test.

For example:

cd <FreeYOLOv2_HOME>
python test.py --root path/to/Dataset/ -d ourdataset -m yolo_free_v2_nano --weight path/to/checkpoint --show
  • Step-7: Eval.

For example:

cd <FreeYOLOv2_HOME>
python eval.py --root path/to/Dataset/ -d ourdataset -m yolo_free_v2_nano --weight path/to/checkpoint

Tracking

Our project also supports multi-object tracking. Following the "tracking-by-detection" framework, we use this project's YOLO as the detector and the simple, efficient ByteTrack as the tracker.
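
Conceptually, each frame is run through the detector and the resulting boxes are handed to the tracker, which associates them across frames and assigns persistent IDs. A schematic loop (the detector/tracker API below is an assumption for illustration, not this repo's exact interface):

# Schematic tracking-by-detection loop; detector/tracker APIs are illustrative.
for frame in video_frames:
    detections = detector(frame)         # e.g. [[x1, y1, x2, y2, score], ...]
    tracks = tracker.update(detections)  # ByteTrack matches boxes to existing tracks
    for t in tracks:
        draw_box(frame, t.box, track_id=t.id)  # persistent ID per object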

  • Image tracking
python track.py --mode image \
                --path_to_img path/to/images/ \
                --cuda \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif
  • Video tracking
python track.py --mode video \
                --path_to_img path/to/video/ \
                --cuda \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif
  • Camera tracking
python track.py --mode camera \
                --cuda \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif

Tracking visualization

  • Detector: FreeYOLOv2-Nano (pretrained on COCO)
  • Tracker: ByteTracker
  • Device: i5-12500H CPU

Command:

python track.py --mode video \
                --path_to_img ./dataset/demo/videos/000006.mp4 \
                -size 640 \
                -dt yolo_free_v2_nano \
                -tk byte_tracker \
                --weight path/to/coco_pretrained/ \
                --show \
                --gif

If a GPU is available, add the --cuda flag to the command above to run detection on the GPU.


Deployment

  1. ONNX export and ONNXRuntime inference (a generic sketch follows)
  2. OpenVINO in C++ and Python
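
A generic PyTorch-to-ONNX export plus ONNXRuntime inference sketch (the repo ships its own deployment scripts; the model variable and file names below are placeholders):

import numpy as np
import torch
import onnxruntime as ort

# Export: "model" stands for a loaded FreeYOLOv2 in eval mode (placeholder here).
dummy = torch.randn(1, 3, 640, 640)
torch.onnx.export(model, dummy, "yolo_free_v2.onnx",
                  input_names=["input"], output_names=["output"], opset_version=11)

# Inference with ONNXRuntime on CPU
session = ort.InferenceSession("yolo_free_v2.onnx", providers=["CPUExecutionProvider"])
outputs = session.run(None, {"input": np.random.rand(1, 3, 640, 640).astype(np.float32)})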