Self-driving training with YOLO

Demo
Dataset
AP
mAP
Classes
Training Log
Weights
How To Use
Environment
Speed
YouTube Link of Video Demo
Google Colab
Tutorial
- Run your custom object detection
Citation
References

Demo

YOLOv4	YOLOv3

Dataset

AP

Class	AP (YOLOv4)	AP (YOLOv3)	TP / FP (YOLOv4)	TP / FP (YOLOv3)
car	73.09%	69.30%	TP = 15977, FP = 5767	TP = 15037, FP = 6829
truck	61.61%	51.89%	TP = 573, FP = 232	TP = 469, FP = 244
pedestrian	42.53%	24.20%	TP = 2192, FP = 1392	TP = 1213, FP = 1242
bicyclist	41.32%	15.66%	TP = 93, FP = 63	TP = 51, FP = 94
light	51.58%	42.93%	TP = 2298, FP = 739	TP = 1793, FP = 706

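Conclusion: YOLOv4 improves AP for every class, and the gain is largest for the classes where YOLOv3 scored lowest (pedestrian: +18.33 points, bicyclist: +25.66 points).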
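The TP and FP counts in the table can also be read as precision, TP / (TP + FP). The following is a minimal sketch that recomputes it per class from the numbers above. The dictionary layout and the function name `precision` are illustrative choices, not part of Darknet, and precision at a single operating point is not the same quantity as AP.

```python
# TP/FP counts copied from the AP table above: (TP, FP) per class and model.
COUNTS = {
    "car":        {"YOLOv4": (15977, 5767), "YOLOv3": (15037, 6829)},
    "truck":      {"YOLOv4": (573, 232),    "YOLOv3": (469, 244)},
    "pedestrian": {"YOLOv4": (2192, 1392),  "YOLOv3": (1213, 1242)},
    "bicyclist":  {"YOLOv4": (93, 63),      "YOLOv3": (51, 94)},
    "light":      {"YOLOv4": (2298, 739),   "YOLOv3": (1793, 706)},
}


def precision(tp: int, fp: int) -> float:
    """Fraction of reported detections that were correct."""
    return tp / (tp + fp)


if __name__ == "__main__":
    for cls, models in COUNTS.items():
        p4 = precision(*models["YOLOv4"])
        p3 = precision(*models["YOLOv3"])
        print(f"{cls:<11} YOLOv4 precision = {p4:.3f}   YOLOv3 precision = {p3:.3f}")
```

With these counts, YOLOv4 has the higher precision in every class, which is consistent with the AP comparison.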

mAP

Results after 10,000 training iterations:

YOLOv4	YOLOv3
mean average precision (mAP@0.50) = 54.02 %	mean average precision (mAP@0.50) = 40.80 %
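mAP is the unweighted mean of the per-class AP values. As a quick consistency check, the sketch below averages the per-class APs from the AP section; the variable and function names are illustrative only. The results agree with the reported mAP figures to within rounding.

```python
# Per-class AP (%) copied from the AP table.
AP_YOLOV4 = {"car": 73.09, "truck": 61.61, "pedestrian": 42.53,
             "bicyclist": 41.32, "light": 51.58}
AP_YOLOV3 = {"car": 69.30, "truck": 51.89, "pedestrian": 24.20,
             "bicyclist": 15.66, "light": 42.93}


def mean_ap(per_class_ap: dict) -> float:
    """Unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


if __name__ == "__main__":
    print(f"YOLOv4 mAP@0.50 ~ {mean_ap(AP_YOLOV4):.2f} %")
    print(f"YOLOv3 mAP@0.50 ~ {mean_ap(AP_YOLOV3):.2f} %")
```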

Classes

car: 101,314 labels
truck: 6,313 labels
pedestrian: 10,637 labels
bicyclist: 1,442 labels
light: 12,700 labels

Training Log

YOLOv4	YOLOv3

Conclusion: YOLOv4 converges much faster than YOLOv3.

Weights

YOLOv4	YOLOv3
yolov4-obj_10000.weights	yolov3-obj_10000.weights

How To Use

Use the weights with the AlexeyAB/darknet implementation of YOLOv4.

Environment

VM: Google Colaboratory
GPU: NVIDIA T4 Tensor Core GPU
NVIDIA-SMI 470.42.01 Driver Version: 460.32.03 CUDA Version: 11.2
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0

Speed

	YOLOv4	YOLOv3
Quality: 1080p	AVG FPS = 14.6	AVG FPS = 16.1
Quality: 720p	AVG FPS = 33.1	AVG FPS = 33.0
Quality: 360p	AVG FPS = 45.2	AVG FPS = 43.4
mAP	54.02 %	40.80 %

YouTube Link of Video Demo

	YOLOv4	YOLOv3
Quality: 1080p	Click Me	Click Me
Quality: 720p	Click Me	Click Me
Quality: 360p	Click Me	Click Me
mAP	54.02 %	40.80 %

Video of comparison between YOLOv4 & YOLOv3

Google Colab

Link

Tutorial

Commands for checking the environment

Verify the CUDA version: /usr/local/cuda/bin/nvcc --version
Check GPU info: nvidia-smi

Set up the Darknet environment

Clone the AlexeyAB/darknet repository and enter it

git clone https://github.com/AlexeyAB/darknet
cd darknet

Edit the Makefile to enable GPU, cuDNN, and OpenCV

sed -i 's/GPU=0/GPU=1/' Makefile
sed -i 's/CUDNN=0/CUDNN=1/' Makefile
sed -i 's/OPENCV=0/OPENCV=1/' Makefile

Build darknet environment

make

Configure the cfg file

Set batch=64.
Set subdivisions=16.
Set max_batches to classes*2000, but not less than the number of training images and not less than 6000, e.g. max_batches=6000 if you train for 3 classes.
Set steps to 80% and 90% of max_batches, e.g. steps=4800,5400.
Set the network size to width=416 height=416, or any other multiple of 32.
Change classes=80 to your number of classes in each of the 3 [yolo] layers.
Change filters=255 to filters=(classes + 5)x3 in the 3 [convolutional] layers that come immediately before each [yolo] layer. Only the last [convolutional] layer before each [yolo] layer needs this change.
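The arithmetic in the rules above can be sketched as a small helper. This is only an illustration: the function name `darknet_training_params` and its parameters are hypothetical, and the training-image count passed in the example is a placeholder, not the size of this dataset.

```python
def darknet_training_params(num_classes: int, num_train_images: int) -> dict:
    """Derive cfg values from the class count, following the rules listed above."""
    # classes*2000, but not less than the number of training images or 6000.
    max_batches = max(num_classes * 2000, num_train_images, 6000)
    # steps are 80% and 90% of max_batches (integer arithmetic avoids float rounding).
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    # filters for the last [convolutional] layer before each [yolo] layer.
    filters = (num_classes + 5) * 3
    return {"max_batches": max_batches, "steps": steps, "filters": filters}


if __name__ == "__main__":
    # 5 classes as in this project; 8000 training images is a placeholder value.
    params = darknet_training_params(num_classes=5, num_train_images=8000)
    print(f"max_batches={params['max_batches']}")
    print(f"steps={params['steps'][0]},{params['steps'][1]}")
    print(f"filters={params['filters']}")
```

For the five classes used here the rule gives filters=30, and max_batches=10000 as long as the training set has no more than 10,000 images, which matches the 10,000 iterations reported in the mAP section.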

Prepare the following files

train.txt
test.txt
obj.data
obj.names
pre-trained.weights
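As a rough guide to what obj.names and obj.data contain, the sketch below builds both as strings. All paths are placeholders, the helper names are hypothetical, and the class order is assumed to follow the Classes section; it must match the class IDs used in your label files. train.txt and test.txt are expected to list one image path per line.

```python
# Class order is an assumption; it must match the class IDs in the label files.
CLASS_NAMES = ["car", "truck", "pedestrian", "bicyclist", "light"]


def build_obj_names(class_names: list) -> str:
    """obj.names: one class name per line, in class-ID order."""
    return "\n".join(class_names) + "\n"


def build_obj_data(num_classes: int, train_list: str, valid_list: str,
                   names_file: str, backup_dir: str) -> str:
    """obj.data: key = value pairs that the darknet detector commands read."""
    lines = [
        f"classes = {num_classes}",
        f"train = {train_list}",
        f"valid = {valid_list}",
        f"names = {names_file}",
        f"backup = {backup_dir}",  # directory where trained weights are saved
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder paths; replace them with your own locations.
    print(build_obj_names(CLASS_NAMES), end="")
    print(build_obj_data(len(CLASS_NAMES), "data/train.txt", "data/test.txt",
                         "data/obj.names", "backup/"), end="")
```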

Start training

./darknet detector train <your_path_of_obj.data> <your_path_of_cfg> <your_path_of_weights> -chart chart.png

-chart chart.png: saves a chart of the training progress to chart.png

Run your custom object detection

Modify the cfg

sed -i 's/batch=64/batch=1/' <your_path_of_cfg>
sed -i 's/subdivisions=16/subdivisions=1/' <your_path_of_cfg>

Detect

./darknet detector test <your_path_of_obj.data> <your_path_of_cfg> <your_path_of_weights> <your_path_of_input_picture>

Calculate mAP

./darknet detector map <your_path_of_obj.data> <your_path_of_cfg> <your_path_of_weights>

Citation

@misc{bochkovskiy2020yolov4,
      title={YOLOv4: Optimal Speed and Accuracy of Object Detection}, 
      author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
      year={2020},
      eprint={2004.10934},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@InProceedings{Wang_2021_CVPR,
    author    = {Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
    title     = {{Scaled-YOLOv4}: Scaling Cross Stage Partial Network},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {13029-13038}
}

References

YOLOv4: Optimal Speed and Accuracy of Object Detection: https://arxiv.org/pdf/2004.10934
Training data from: https://www.kaggle.com/alincijov/self-driving-cars
Testing data from: https://youtu.be/z1obnaqPgMA

dec880126/Self-driving-with-YOLO