This GitHub repository contains Jupyter notebooks that showcase simple object detection using YOLOv3 and Tiny YOLOv3 models. The notebooks demonstrate how to apply these models to both images and video files, and provide step-by-step instructions for implementing the object detection algorithm. Whether you're new to deep learning or just want to learn more about YOLOv3, this repository provides a great starting point for experimenting with object detection.
Table of Contents
- YOLOv3_img_simple_object_detection.ipynb : Jupyter notebook for object detection on image files with YOLOv3 and YOLOv3-tiny.
- YOLOv3_video_simple_object_detection.ipynb : Jupyter notebook for object detection on video files with YOLOv3 and YOLOv3-tiny.
- configs : contains cfg files for yolov3 and tiny yolov3.
- result imgs : contains results of object detection on image files.
- result vids : contains results of object detection on video files in .avi format.
- gifs : contains results of object detection on video files in GIF format.
- test imgs : contains images of random scenes.
- test vids : contains videos of random scenes.
- coco.names : plain text file listing the names of the 80 object classes in the Microsoft Common Objects in Context (COCO) dataset, one name per line.
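As a minimal sketch of how the class names in `coco.names` can be used, the snippet below reads the file into a list so that a numeric class ID from the detector maps to a readable label. The helper names `load_class_names` and `label_for` are hypothetical and are not taken from the notebooks.

```python
from pathlib import Path


def load_class_names(path):
    """Return the class names from a .names file (one name per line)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Skip blank lines so that list indices line up with class IDs.
    return [line.strip() for line in lines if line.strip()]


def label_for(class_names, class_id):
    """Map a numeric class ID to its name (hypothetical helper)."""
    if 0 <= class_id < len(class_names):
        return class_names[class_id]
    # Fall back to a generic label for IDs outside the list.
    return f"class_{class_id}"
```

Typical usage would be `names = load_class_names("coco.names")` followed by `label_for(names, class_id)` for each detection.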
The mAP (mean average precision) is a metric used to evaluate the performance of object detection models. It combines precision (the fraction of true positives out of all positive predictions) and recall (the fraction of true positives out of all actual positives): the average precision (AP) of a class is the area under its precision-recall curve, and the mAP is the mean of the AP values over all classes.
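The definitions above can be sketched in a few lines of Python. This is an illustrative, simplified computation, not the official COCO evaluation code: it assumes each detection has already been marked as a true or false positive (for example by an IoU threshold), and the function names are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from counts of true/false positives and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(detections, num_ground_truth):
    """Area under the precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of real objects of this class.
    """
    if num_ground_truth == 0 or not detections:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class) if ap_per_class else 0.0
```

Because the mAP averages over the whole precision-recall curve and over all classes, it is a summary score rather than a direct detection rate.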
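GFlops here refers to the number of floating-point operations, in billions, needed to process one image in a single forward pass of the network (Darknet reports this figure as BFLOPs). It should not be confused with GFLOPS as a hardware rating, which counts the floating-point operations per second that a computer or device can perform. In the context of deep learning models, the operation count is used as a measure of the computational complexity of the model.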
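To make the operation count concrete, here is a rough sketch of how the cost of a single convolutional layer can be estimated. It assumes the common convention of counting one multiply-accumulate as two floating-point operations and ignores biases and activations; the function name and the example layer shape are illustrative assumptions.

```python
def conv_layer_flops(in_channels, out_channels, kernel_size, out_h, out_w):
    """Approximate floating-point operations for one convolutional layer.

    Each output value needs kernel_size * kernel_size * in_channels
    multiply-accumulates, and each multiply-accumulate counts as 2 operations.
    """
    macs_per_output = kernel_size * kernel_size * in_channels
    num_outputs = out_channels * out_h * out_w
    return 2 * macs_per_output * num_outputs


def to_gflops(flops):
    """Convert a raw operation count to billions of operations."""
    return flops / 1e9


# Example (assumed shape): a 3x3 convolution with 32 filters applied to a
# 3-channel input, producing a 416x416 output map.
example = conv_layer_flops(in_channels=3, out_channels=32,
                           kernel_size=3, out_h=416, out_w=416)
```

Summing this estimate over every layer gives the total per-image cost of the network, which grows with the input resolution because `out_h` and `out_w` grow with it.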
- YOLOv3-320 has a mAP of 55.3 for 65.86 GFlops. The mAP is a score averaged over classes and over the precision-recall curve, so it does not mean that exactly 55.3% of the objects in a given image are detected; a higher value simply indicates better detection overall. Each forward pass requires about 65.86 billion floating-point operations. We can conclude that the object detection model is relatively accurate and moderately complex. It may be suitable for some applications but may not be efficient enough for applications with strict real-time performance requirements if we don't have access to high-performing GPUs, which is further confirmed by the results in the above section.
- YOLOv3-tiny has a mAP of 33.1 for 5.56 GFlops, so it scores noticeably lower on accuracy but needs only about 5.56 billion floating-point operations per forward pass. We can conclude that the object detection model is moderately accurate and relatively simple compared to YOLOv3-320. Therefore it is h
In summary, YOLOv3 and YOLOv3-tiny are both object detection models, but YOLOv3-tiny is a smaller and faster version of YOLOv3, designed for scenarios where real-time object detection is required on lower-end hardware. YOLOv3-tiny sacrifices some accuracy compared to YOLOv3 (a mAP of 33.1 versus 55.3) but uses far fewer computational resources (5.56 versus 65.86 GFlops), which can still be adequate for many real-world applications.
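The trade-off can be put into numbers with a short calculation based only on the mAP and GFlops figures quoted above. The dictionary layout and the `compare` helper are hypothetical and only serve to illustrate the arithmetic.

```python
# Figures as quoted in this README.
MODELS = {
    "YOLOv3-320": {"map": 55.3, "gflops": 65.86},
    "YOLOv3-tiny": {"map": 33.1, "gflops": 5.56},
}


def compare(full, tiny):
    """Summarize the accuracy/compute trade-off between two models."""
    return {
        # How many times more operations the larger model needs per image.
        "compute_ratio": full["gflops"] / tiny["gflops"],
        # Absolute difference in mAP points.
        "map_gap": full["map"] - tiny["map"],
        # Fraction of the larger model's mAP kept by the smaller one.
        "map_retained": tiny["map"] / full["map"],
    }


summary = compare(MODELS["YOLOv3-320"], MODELS["YOLOv3-tiny"])
```

With these figures, YOLOv3-tiny needs roughly 12 times fewer operations per image while keeping about 60% of the mAP, a gap of 22.2 points.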