Most DNN object detection algorithms can:
- classify objects (assign a class label)
- localize objects (find the coordinates of the bounding box enclosing the object)
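A single detection is typically a (class label, confidence score, bounding box) triple. YOLO-family models predict boxes in center format (cx, cy, w, h), which is usually converted to corner format for drawing and overlap computations. A minimal sketch (the names and sample values are illustrative, not from any specific model):

```python
def xywh_to_xyxy(cx, cy, w, h):
    """Convert a center-format box (cx, cy, w, h) to corner format (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# A detection: (class label, confidence, bounding box in corner format)
detection = ("car", 0.91, xywh_to_xyxy(50, 40, 20, 10))
print(detection)  # ('car', 0.91, (40.0, 35.0, 60.0, 45.0))
```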
The key metrics for modern object detection algorithms are:
- Accuracy (AP - Average Precision)
- Speed (FPS - Frames Per Second)
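AP is computed by matching predictions to ground-truth boxes at an Intersection-over-Union (IoU) threshold (e.g. COCO averages over thresholds 0.5 to 0.95). A minimal IoU sketch for corner-format boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two corner-format boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 0.3333333333333333
```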
- Yolo v1 (2016) Joseph Redmon ‘You Only Look Once: Unified, Real-Time Object Detection’
- Yolo v2 (2017) Joseph Redmon ‘YOLO9000: Better, Faster, Stronger’
- Yolo v3 (2018) Joseph Redmon ‘YOLOv3: An Incremental Improvement’
- Yolo v4 (2020) Alexey Bochkovskiy ‘YOLOv4: Optimal Speed and Accuracy of Object Detection’. AP increased by 10% and FPS by 12% compared to v3
- Yolo v5 (2020) Glenn Jocher - PyTorch implementation (v1 to v4 were Darknet implementations). The major improvements include mosaic data augmentation and auto-learning bounding box anchors
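Mosaic augmentation stitches four training images into one, so the model sees objects at varied scales and contexts in a single sample. A simplified sketch of the idea (the real YOLOv5 mosaic randomizes the split point and remaps box labels, which is omitted here):

```python
import numpy as np

def mosaic(imgs, size=64):
    """Tile four images into one 2x2 canvas (simplified mosaic augmentation)."""
    assert len(imgs) == 4
    canvas = np.zeros((2 * size, 2 * size, 3), dtype=np.uint8)
    for i, img in enumerate(imgs):
        r, c = divmod(i, 2)  # place image i in quadrant (r, c)
        canvas[r * size:(r + 1) * size, c * size:(c + 1) * size] = img[:size, :size]
    return canvas

four = [np.full((64, 64, 3), v, dtype=np.uint8) for v in (10, 20, 30, 40)]
m = mosaic(four)
print(m.shape)  # (128, 128, 3)
```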
- PP-Yolo (2020) Xiang Long et al. (Baidu) ‘PP-YOLO: An Effective and Efficient Implementation of Object Detector’. PP-YOLO is based on YOLOv3, replacing its Darknet-53 backbone with a ResNet backbone and increasing the training batch size from 64 to 192. It improved mAP to 45.2% (from 43.5% for YOLOv4) and FPS from 65 to 73 on a Tesla V100 (batch size = 1). Built on the PaddlePaddle DL framework
- Yolo Z (2021) Aduen Benjumea et al. ‘YOLO-Z: Improving small object detection in YOLOv5 for autonomous vehicles’
- Yolo-ReT (2021) Prakhar Ganesh et al. ‘YOLO-ReT: Towards High Accuracy Real-time Object Detection on Edge GPUs’
- Scaled-Yolo v4 (2021) Chien-Yao Wang et al. 'Scaled-YOLOv4: Scaling Cross Stage Partial Network'
- YoloX (2021) Zheng Ge et al. ‘YOLOX: Exceeding YOLO Series in 2021’. Well suited for edge devices: YOLOX-Tiny and YOLOX-Nano outperform YOLOv4-Tiny and NanoDet by 10.1% and 1.8% AP respectively
- YoloR (You Only Learn One Representation) (2021) Chien-Yao Wang et al. ‘You Only Learn One Representation: Unified Network for Multiple Tasks’
- YoloS (2021) Yuxin Fang et al. 'You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection'
- YoloF (2021) Qiang Chen et al. 'You Only Look One-level Feature'
- YoloP (2022) Dong Wu et al. ‘YOLOP: You Only Look Once for Panoptic Driving Perception’. YoloP was designed to perform three visual perception tasks simultaneously in real time on an embedded device (Jetson TX2, 23 FPS): traffic object detection, drivable area segmentation and lane detection. It uses one encoder for feature extraction and three decoders to handle the specific tasks
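The one-encoder/three-decoders layout can be sketched as a dataflow. The sketch below is purely illustrative: the encoder stand-in is an average-pooling downsample and the heads return zero-filled placeholders, whereas the real YOLOP uses a CSPDarknet backbone and convolutional heads. Its point is only that the shared features are computed once and consumed by all three task heads:

```python
import numpy as np

def encoder(image):
    """Stand-in feature extractor: 8x downsample by average pooling (illustrative)."""
    h, w, c = image.shape
    return image.reshape(h // 8, 8, w // 8, 8, c).mean(axis=(1, 3))

def detect_head(feat):    # traffic object detection: one box per grid cell
    return np.zeros(feat.shape[:2] + (5,))  # (x, y, w, h, confidence)

def drivable_head(feat):  # drivable area segmentation: per-cell mask logits
    return np.zeros(feat.shape[:2])

def lane_head(feat):      # lane detection: per-cell lane mask logits
    return np.zeros(feat.shape[:2])

image = np.zeros((64, 64, 3))
feat = encoder(image)  # shared features, computed once
outputs = (detect_head(feat), drivable_head(feat), lane_head(feat))
print([o.shape for o in outputs])  # [(8, 8, 5), (8, 8), (8, 8)]
```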
- Yolov6 (2022) Meituan. Hardware-friendly design for the backbone and neck, and an efficient decoupled head with SIoU loss
- Yolov7, unofficial (2022). A simple and standard training framework with Transformers for detection and instance segmentation tasks, based on detectron2
- Yolov7, official (2022) Chien-Yao Wang et al. 'YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors'. YOLOv7 currently outperforms all known real-time object detectors running at 30 FPS or higher on a V100 GPU; the YOLOv7-E6 object detector reaches 56 FPS on a V100 with 55.9% AP
- Real-time object detection on COCO, by mean Average Precision (mAP): YOLOv7-E6E
- Real-time object detection on COCO, by speed (FPS): YOLOv7-tiny-SiLU