Awesome-YOLO-Object-Detection

🚀🚀🚀 YOLO is a family of real-time, one-stage object detectors. This repository lists public projects in the YOLO series and related object detection datasets.

Contents

Summary

  • Famous YOLO

    • YOLOv1 (Darknet) : "You Only Look Once: Unified, Real-Time Object Detection". (CVPR 2016)

    • YOLOv2 (Darknet) : "YOLO9000: Better, Faster, Stronger". (CVPR 2017)

    • YOLOv3 (Darknet) : "YOLOv3: An Incremental Improvement". (arXiv 2018)

    • YOLOv4 (WongKinYiu/PyTorch_YOLOv4) : "YOLOv4: Optimal Speed and Accuracy of Object Detection". (arXiv 2020)

    • Scaled-YOLOv4 (WongKinYiu/ScaledYOLOv4) : "Scaled-YOLOv4: Scaling Cross Stage Partial Network". (CVPR 2021)

    • YOLOv5 : YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite. docs.ultralytics.com. YOLOv5 🚀 is the world's most loved vision AI, representing Ultralytics' open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.

    • YOLOv6 : "YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications". (arXiv 2022).

    • YOLOv7 : "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors". (CVPR 2023).

    • YOLOv8 : NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite. docs.ultralytics.com

    • YOLOv9 : "YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information". (arXiv 2024)

    • MultimediaTechLab/YOLO : YOLO: Official implementation of YOLOv9, YOLOv7, and YOLO-RD. This repository will contain the complete codebase, pre-trained models, and detailed instructions for training and deploying YOLOv9.

    • YOLOv10 : "YOLOv10: Real-Time End-to-End Object Detection". (arXiv 2024)

    • YOLOv11 : Ultralytics YOLO11 is a cutting-edge, state-of-the-art (SOTA) model that builds upon the success of previous YOLO versions and introduces new features and improvements to further boost performance and flexibility. YOLO11 is designed to be fast, accurate, and easy to use, making it an excellent choice for a wide range of object detection and tracking, instance segmentation, image classification, and pose estimation tasks. docs.ultralytics.com

    • YOLOv12 : "YOLOv12: Attention-Centric Real-Time Object Detectors". (arXiv 2025)

    • YOLO-World : "YOLO-World: Real-Time Open-Vocabulary Object Detection". (CVPR 2024). www.yoloworld.cc

  • Extensional Frameworks

    • VLM-R1 : VLM-R1: A stable and generalizable R1-style Large Vision-Language Model. Solve Visual Understanding with Reinforced VLMs.

    • Florence-2 : "Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks". (CVPR 2024).

    • maestro : VLM fine-tuning for everyone. maestro is a streamlined tool to accelerate the fine-tuning of multimodal models. By encapsulating best practices from our core modules, maestro handles configuration, data loading, reproducibility, and training loop setup. It currently offers ready-to-use recipes for popular vision-language models such as Florence-2, PaliGemma 2, and Qwen2.5-VL. maestro.roboflow.com

    • Autodistill : Images to inference with no labeling (use foundation models to train supervised models). Autodistill uses big, slower foundation models to train small, faster supervised models. Using autodistill, you can go from unlabeled images to inference on a custom model running at the edge with no human intervention in between. docs.autodistill.com

    • EdgeYOLO : an edge-real-time anchor-free object detector with decent performance. "Edge YOLO: Real-time intelligent object detection system based on edge-cloud cooperation in autonomous vehicles". (IEEE Transactions on Intelligent Transportation Systems, 2022). "EdgeYOLO: An Edge-Real-Time Object Detector". (arXiv 2023)

    • YOLOX : "YOLOX: Exceeding YOLO Series in 2021". (arXiv 2021)

    • YOLOR : "You Only Learn One Representation: Unified Network for Multiple Tasks". (arXiv 2021)

    • YOLOF : "You Only Look One-level Feature". (CVPR 2021).

    • YOLOS : "You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection". (NeurIPS 2021)

    • DAMO-YOLO : DAMO-YOLO: a fast and accurate object detection method with several new techniques, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement. "DAMO-YOLO : A Report on Real-Time Object Detection Design". (arXiv 2022)

    • YOLO-NAS : Easily train or fine-tune SOTA computer vision models with one open source training library. The home of YOLO-NAS. www.supergradients.com. The YOLO-NAS and YOLO-NAS-POSE architectures are out! The new YOLO-NAS delivers a state-of-the-art accuracy-speed trade-off, outperforming other models such as YOLOv5, YOLOv6, YOLOv7, and YOLOv8.

    • LeYOLO : "LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection". (arXiv 2024)

    • DynamicDet : "DynamicDet: A Unified Dynamic Architecture for Object Detection". (CVPR 2023)

    • DINO : "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection". (ICLR 2023).

    • GroundingDINO : "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection". (ECCV 2024).

    • RT-DETR | RT-DETRv2 : "DETRs Beat YOLOs on Real-time Object Detection". (CVPR 2024). "RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer". (arXiv 2024).

    • EasyCV : An all-in-one toolkit for computer vision. "YOLOX-PAI: An Improved YOLOX, Stronger and Faster than YOLOv6". (arXiv 2022).

    • YOLACT & YOLACT++ : You Only Look At CoefficienTs. (ICCV 2019, IEEE TPAMI 2020)

    • Alpha-IoU : "Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression". (NeurIPS 2021)

    • CIoU : Complete-IoU (CIoU) Loss and Cluster-NMS for Object Detection and Instance Segmentation (YOLACT). (AAAI 2020, IEEE TCYB 2021)

    • Albumentations : Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to increase the quality of trained models. The purpose of image augmentation is to create new training samples from the existing data. "Albumentations: Fast and Flexible Image Augmentations". (Information 2020)

    • doubleZ0108/Data-Augmentation : General Data Augmentation Algorithms for Object Detection (esp. YOLO).

  • Awesome List

  • Paper and Code Overview

    • Paper Review

    • Code Review

      • iscyy/ultralyticsPro : 🔥🔥🔥 Focused on improved models for YOLO11, YOLOv8, YOLOv10, RT-DETR, YOLOv7, and YOLOv5. Supports improving the backbone, neck, head, loss, IoU, NMS, and other modules 🚀

      • MMDetection : OpenMMLab Detection Toolbox and Benchmark. mmdetection.readthedocs.io. (arXiv 2019)

      • MMYOLO : OpenMMLab YOLO series toolbox and benchmark. Implements RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc. mmyolo.readthedocs.io/zh_CN/dev/

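      • iscyy/yoloair : 🔥🔥🔥 Focused on improved YOLO models. Supports improving the backbone, neck, head, loss, IoU, NMS, and other modules 🚀. YOLOAir is a PyTorch-based YOLO algorithm library. It provides a unified model code framework, unified application, and unified improvement, makes modules easy to combine, and helps build more powerful network models.

      • iscyy/yoloair2 : ☁️💡🎈 Focused on improving YOLOv7. Supports improving the Backbone, Neck, Head, Loss, IoU, NMS, and other modules.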

      • jizhishutong/YOLOU : YOLOU: United, Study and easier to Deploy. The purpose of creating YOLOU is to better learn the algorithms of the YOLO series and pay tribute to our predecessors. Covers YOLOv3, YOLOv4, YOLOv5, YOLOv5-Lite, YOLOv6-v1, YOLOv6-v2, YOLOv7, YOLOX, YOLOX-Lite, PP-YOLOE, PP-PicoDet-Plus, YOLO-Fastest v2, FastestDet, YOLOv5-SPD, TensorRT, NCNN, Tengine, and OpenVINO. WeChat official account "集智书童": "YOLOU Open Source | All YOLO-series algorithms in one place for algorithm learning, research improvement, and deployment!"

      • WangQvQ/Yolov5_Magic : YOLO Magic🪄 is an extension based on Ultralytics' YOLOv5, designed to provide more powerful functionality and simpler operations for visual tasks.

      • positive666/yolo_research : 🚀 yolo_research PLUS High-level, based on the yolo-high-level project (detect, pose, classify, segment). Includes yolov5, yolov7, and yolov8 cores, improvement research, Swin Transformer V2 and the attention series, training skills, business customization, and engineering deployment.

      • augmentedstartups/AS-One : Easy & Modular Computer Vision Detectors and Trackers - Run YOLO-NAS,v8,v7,v6,v5,R,X in under 20 lines of code. www.augmentedstartups.com

      • Oneflow-Inc/one-yolov5 : A more efficient yolov5 with oneflow backend 🎉🎉🎉. WeChat official account "GiantPandaCV": "One-YOLOv5 Released: A YOLOv5 That Trains Faster"

      • PaddlePaddle/PaddleYOLO : 🚀🚀🚀 YOLO series of PaddlePaddle implementation, PP-YOLOE+, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, YOLOv5u, YOLOv7u, RTMDet and so on. 🚀🚀🚀

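      • WangRongsheng/BestYOLO : 🌟Change the world, it will become a better place. | The best YOLO practice framework, oriented toward research and competitions!

      • KangChou/Cver4s : Cver4s: Computer vision algorithm code base.

      • chaizwj/yolov8-tricks : Object detection using YOLOv8 as the baseline model and VisDrone2019 as the dataset, with the author's own improvement strategies.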

  • Learning Resources
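
Several entries in the summary above (Alpha-IoU, CIoU) are bounding-box regression losses built on intersection over union. As a rough illustration only, the sketch below computes plain IoU for two axis-aligned boxes and the basic power form `1 - IoU**alpha` described in the Alpha-IoU paper. The function names, the `(x1, y1, x2, y2)` box layout, and the `alpha=3.0` default are assumptions made for this example; none of the listed repositories is being quoted.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic power IoU loss: 1 - IoU**alpha (alpha value is an assumption)."""
    return 1.0 - box_iou(pred, target) ** alpha


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print(box_iou(pred, target))         # overlap 1, union 7
    print(alpha_iou_loss(pred, target))  # loss with the assumed alpha
```

With `alpha=1.0` the loss reduces to the ordinary `1 - IoU` loss; larger values weight high-IoU boxes more heavily.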

Other Versions of YOLO

Lighter and Deployment Frameworks

Applications

Datasets

  • Datasets Share Platform

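    • OpenDataLab : OpenDataLab is an open data platform built by the large-model data foundation team at Shanghai AI Laboratory. It is now the designated open-source data service platform of the China Large Model Corpus Data Alliance, providing developers with full-chain AI data support, addressing the risks and challenges of data processing, and advancing AI research and applications.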

    • Science Data Bank (ScienceDB) : Make your research data citable, discoverable, and persistently accessible; satisfy flexible data sharing requirements; facilitate data dissemination and reuse. Science Data Bank (ScienceDB) is a public, general-purpose data repository that provides data services (e.g. data acquisition, long-term preservation, publishing, sharing, and access) for researchers, research projects/teams, journals, institutions, universities, etc. It supports a variety of data acquisition methods and data licenses. ScienceDB is dedicated to making data findable, citable, and reusable while protecting the rights and interests of data owners. It is built and operated by the Computer Network Information Center, Chinese Academy of Sciences.

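    • China Scientific Data : China Scientific Data (Chinese and English online edition) (CN11-6035/N, ISSN 2096-2223) is currently the only academic journal in China dedicated to publishing scientific data across multiple disciplines. As one of the first national pilot online serial publications, it is supervised by the Chinese Academy of Sciences, co-sponsored by the Computer Network Information Center of the Chinese Academy of Sciences and the Chinese National Committee for ISC CODATA, and guided by the National Science and Technology Infrastructure Center and the Office of the Leading Group for Cybersecurity and Informatization of the Chinese Academy of Sciences. It is a quarterly published in Chinese and English and distributed in China and abroad. It is a source journal of the Chinese Science Citation Database (CSCD), a Chinese science and technology core journal, and is included in the China Association for Science and Technology's graded catalogue of high-quality scientific journals.

    • 飞桨AI Studio (PaddlePaddle AI Studio) : Open datasets on PaddlePaddle AI Studio.

    • 极市开发者平台 (Extreme Mart developer platform) : Open datasets on the Extreme Mart developer platform.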

    • openvinotoolkit/datumaro : Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.

  • Datasets Tools

    • Data Annotation

      • Label Studio : Label Studio is a multi-type data labeling and annotation tool with standardized output format. labelstud.io

      • X-AnyLabeling : Effortless data labeling with AI support from Segment Anything and other awesome models.

      • AnyLabeling : Effortless AI-assisted data labeling with AI support from YOLO, Segment Anything (SAM+SAM2), MobileSAM!! AnyLabeling = LabelImg + Labelme + Improved UI + Auto-labeling. anylabeling.nrl.ai

      • SAMLabelerPro : Label your images with Segment Anything Model or MobileSAM, with support for remote labeling by multiple people.

      • LabelImg : 🖍️ LabelImg is a graphical image annotation tool for labeling object bounding boxes in images.

      • labelme : Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

      • DarkLabel : Video/Image Labeling and Annotation Tool.

      • AlexeyAB/Yolo_mark : GUI for marking bounding boxes of objects in images for training neural network Yolo v3 and v2.

      • Cartucho/OpenLabeling : Label images and video for Computer Vision applications.

      • CVAT : Computer Vision Annotation Tool (CVAT). Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.

      • VoTT : Visual Object Tagging Tool: An Electron app for building end to end Object Detection Models from Images and Videos.

      • WangRongsheng/KDAT : A set of annotation tools covering the full object detection workflow in computer vision. Full name: Kill Object Detection Annotation Tools.

      • Rectlabel-support : RectLabel - An image annotation tool to label images for bounding box object detection and segmentation.

      • cnyvfang/labelGo-Yolov5AutoLabelImg : 💕 A graphical semi-automatic annotation tool based on labelImg and YOLOv5 💕

      • CVUsers/Auto_maker : Open-source automatic data annotator for deep learning, covering object detection and image classification (high accuracy and high efficiency).

      • MyVision : Computer vision based ML training data generation tool 🚀

      • wufan-tb/AutoLabelImg : auto-labelimg based on yolov5, with many other useful tools. AutoLabelImg is a multi-function automatic annotation tool.

      • MrZander/YoloMarkNet : Darknet YOLOv2/3 annotation tool written in C#/WPF.

      • mahxn0/Yolov3_ForTextLabel : Automatic annotation tool for objects and natural-scene text, based on YOLOv3.

      • MNConnor/YoloV5-AI-Label : YoloV5 AI Assisted Labeling.

      • LILINOpenGitHub/Labeling-Tool : Free YOLO AI labeling tool. YOLO AI labeling tool is a Windows app for labeling YOLO dataset.

      • whs0523003/YOLOv5_6.1_autolabel : Automatic labeling of object bounding boxes with YOLOv5_6.1.

      • 2vin/PyYAT : Semi-Automatic Yolo Annotation Tool In Python.

      • AlturosDestinations/Alturos.ImageAnnotation : A collaborative tool for labeling image data for yolo.

      • stephanecharette/DarkMark : Marking up images for use with Darknet.

      • 2vin/yolo_annotation_tool : Annotation tool for YOLO in opencv.

      • sanfooh/quick_yolo2_label_tool : Quick yolo2 label tool for fast YOLO annotation.

      • folkien/yaya : YAYA - Yet another YOLO annotator for images (in QT5). Supports the YOLO format, image modifications, labeling, and detection with a previously trained detector.

      • pylabel-project/pylabel : Python library for computer vision labeling tasks. The core functionality is to translate bounding box annotations between different formats, for example from COCO to YOLO.

      • opendatalab/labelU : Uniform, Unlimited, Universal and Unbelievable Annotation Toolbox.

    • Data Augmentation

      • Albumentations : Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to increase the quality of trained models. The purpose of image augmentation is to create new training samples from the existing data. "Albumentations: Fast and Flexible Image Augmentations". (Information 2020). albumentations.ai

      • doubleZ0108/Data-Augmentation : General Data Augmentation Algorithms for Object Detection (esp. YOLO).

    • Data Management

      • YOLOExplorer : Iterate on your YOLO / CV datasets using SQL, vector semantic search, and more within seconds. Explore, manipulate, and iterate on Computer Vision datasets with precision using simple APIs. Supports SQL filters, vector similarity search, a native interface with Pandas, and more.
  • General Detection and Recognition Datasets

    • Object Detection Datasets

    • Object Recognition Datasets

  • Autonomous Driving Datasets

    • Diverse Autonomous Driving Datasets

      • BDD100K : "BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning". (CVPR 2020)

      • CODA : "CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving". (ECCV 2022)

    • Traffic Sign Detection Datasets

      • TT100K : "Traffic-Sign Detection and Classification in the Wild". (CVPR 2016)

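      • CCTSDB : CSUST Chinese Traffic Sign Detection Benchmark. This Chinese traffic sign dataset was produced by the team of Zhang Jianming at the Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology. "A Real-Time Chinese Traffic Sign Detection Algorithm Based on Modified YOLOv2". (Algorithms, 2017)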

      • CCTSDB2021 : "CCTSDB 2021: a more comprehensive traffic sign detection benchmark". (Human-centric Computing and Information Sciences, 2022)

    • License Plate Detection and Recognition Datasets

      • CCPD : "Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline". (ECCV 2018)
  • Adverse Weather Datasets

  • Person Detection Datasets

  • Anti-UAV Datasets

    • Anti-UAV : 🔥🔥Official Repository for Anti-UAV🔥🔥. "Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System". (arXiv 2023)
  • Optical Aerial Imagery Datasets

    • COWC : "A large contextual dataset for classification, detection and counting of cars with deep learning". (ECCV 2016)

    • RSOD : "Accurate object localization in remote sensing images based on convolutional neural networks". (IEEE TGRS 2017)

    • LEVIR : "Random access memories: A new paradigm for target detection in high resolution aerial remote sensing images". (IEEE Transactions on Image Processing 2017)

    • LEVIR-Ship : "A Degraded Reconstruction Enhancement-based Method for Tiny Ship Detection in Remote Sensing Images with A New Large-scale Dataset". (IEEE TGRS 2022)

    • MASATI : "Automatic ship classification from optical aerial images with convolutional neural networks". (Remote Sensing 2018)

    • xView : "xView: Objects in Context in Overhead Imagery". (arXiv 2018)

    • DOTA : "DOTA: A Large-Scale Dataset for Object Detection in Aerial Images". (CVPR 2018). "Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges". (IEEE TPAMI 2021).

    • ITCVD : "Deep Learning for Vehicle Detection in Aerial Images". (IEEE ICIP 2018)

    • Bridge Dataset : "A Tool for Bridge Detection in Major Infrastructure Works Using Satellite Images". (IEEE ICIP 2018)

    • DIOR : "Object detection in optical remote sensing images: A survey and a new benchmark". (ISPRS 2020)

    • PESMOD : "UAV Images Dataset for Moving Object Detection from Moving Cameras". (arXiv 2021)

    • AI-TOD : "Tiny Object Detection in Aerial Images". (IEEE ICPR 2021)

    • RsCarData : "DSFNet: Dynamic and Static Fusion Network for Moving Object Detection in Satellite Videos". (IEEE GRSL 2021)

    • VISO : "Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark". (IEEE TGRS 2021)

    • VisDrone : "Detection and Tracking Meet Drones Challenge". (IEEE TPAMI 2021)

    • FAIR1M : "FAIR1M: A benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery". (ISPRS 2021)

    • SeaDronesSee : "SeaDronesSee: A Maritime Benchmark for Detecting Humans in Open Water". (WACV 2022)

  • Low-light Image Datasets

  • Infrared Image Datasets

  • SAR Image Datasets

  • Multispectral Image Datasets

    • FLIR_ADAS : Teledyne FLIR Free ADAS Thermal Dataset v2.

    • VEDAI : "Vehicle Detection in Aerial Imagery: A small target detection benchmark". (Journal of Visual Communication and Image Representation 2015)

    • KAIST_rgbt : "Multispectral Pedestrian Detection: Benchmark Dataset and Baseline". (CVPR 2015)

    • TNO : "The TNO multiband image data collection". (Data in brief, 2017)

    • MFNet : MFNet-pytorch, image semantic segmentation using RGB-Thermal images. "MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes". (IROS 2017). (MFNet Dataset : Multi-spectral Object Detection and Semantic Segmentation Datasets)

    • LLVIP : "LLVIP: A Visible-Infrared Paired Dataset for Low-Light Vision". (ICCV 2021)

    • MSRS : MSRS: Multi-Spectral Road Scenarios for Practical Infrared and Visible Image Fusion. "PIAFusion : A progressive infrared and visible image fusion network based on illumination aware". (Information Fusion, 2022)

    • TarDAL : "Target-Aware Dual Adversarial Learning and a Multi-Scenario Multi-Modality Benchmark To Fuse Infrared and Visible for Object Detection". (CVPR 2022). (M3FD Dataset)

    • DroneVehicle : "Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning". (IEEE TCSVT 2022)

  • 3D Object Detection Datasets

    • Objectron : "Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations". (CVPR, 2021)
  • Vehicle-to-Everything Field Datasets

  • Super-Resolution Field Datasets

    • VideoLQ : "Investigating Tradeoffs in Real-World Video Super-Resolution". (CVPR, 2022)
  • Face Detection and Recognition Datasets
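
Several of the annotation tools listed under Datasets Tools convert between label formats, for example from COCO to YOLO. As a hedged, minimal sketch of what that conversion involves: COCO stores a box as `[x_min, y_min, width, height]` in pixels, while a YOLO label line stores `class x_center y_center width height` normalized by the image size. The function names below are hypothetical and written for this example; they are not the API of pylabel or any other listed tool.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, w, h] in pixels to a
    normalized YOLO box (x_center, y_center, w, h)."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2.0) / img_w
    y_center = (y_min + h / 2.0) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


def yolo_label_line(class_id, bbox, img_w, img_h):
    """Format one object as a line of a YOLO .txt label file."""
    xc, yc, w, h = coco_to_yolo(bbox, img_w, img_h)
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 200x100 px box whose top-left corner is at (100, 50)
    # in a 400x200 px image.
    print(yolo_label_line(0, [100, 50, 200, 100], img_w=400, img_h=200))
```

Note that COCO category ids are not guaranteed to be contiguous or zero-based, so a real converter also needs a mapping from category id to YOLO class index; that step is left out here.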

Blogs

Videos

Star History
