A wide range of custom functions for YOLOv4, YOLOv4-tiny, YOLOv3, and YOLOv3-tiny implemented in TensorFlow, TFLite and TensorRT.
DISCLAIMER: This repository is very similar to my repository: tensorflow-yolov4-tflite. I created this repository to explore coding custom functions to be implemented with YOLOv4, and they may worsen the overal speed of the application and make it not optimized in respect to time complexity. So if you want to run the most optimal YOLOv4 code with TensorFlow than head over to my other repository. This one is to explore cool customizations and applications that can be created using YOLOv4!
- Counting Objects (total objects and per class)
- Print Info About Each Detection (class, confidence, bounding box coordinates)
- Crop Detections and Save as New Image
- License Plate Recognition Using Tesseract OCR
- Apply Tesseract OCR to Detections to Extract Text
If there is a custom function you want to see created then create an issue in the issues tab and suggest it! If enough people suggest the same custom function I will add it quickly!
# Tensorflow CPU
conda env create -f conda-cpu.yml
conda activate yolov4-cpu
# Tensorflow GPU
conda env create -f conda-gpu.yml
conda activate yolov4-gpu
# TensorFlow CPU
pip install -r requirements.txt
# TensorFlow GPU
pip install -r requirements-gpu.txt
Make sure to use CUDA Toolkit version 10.1 as it is the proper version for the TensorFlow version used in this repository. https://developer.nvidia.com/cuda-10.1-download-archive-update2
YOLOv4 comes pre-trained and able to detect 80 classes. For easy demo purposes we will use the pre-trained weights. Download pre-trained yolov4.weights file: https://drive.google.com/open?id=1cewMfusmPjYWbrnuJRuKhPMwRe_b9PaT
Copy and paste yolov4.weights from your downloads folder into the 'data' folder of this repository.
If you want to use yolov4-tiny.weights, a smaller model that is faster at running detections but less accurate, download file here: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
Learn How To Train Custom YOLOv4 Weights here: https://www.youtube.com/watch?v=mmj3nxGT2YQ
Watch me Walk-Through using Custom Model in TensorFlow :https://www.youtube.com/watch?v=nOIVxi5yurE
USE MY LICENSE PLATE TRAINED CUSTOM WEIGHTS: https://drive.google.com/file/d/1EUPtbtdF0bjRtNjGv436vDY28EN5DXDH/view?usp=sharing
Copy and paste your custom .weights file into the 'data' folder and copy and paste your custom .names into the 'data/classes/' folder.
The only change within the code you need to make in order for your custom model to work is on line 14 of 'core/config.py' file. Update the code to point at your custom .names file as seen below. (my custom .names file is called custom.names but yours might be named differently)
Note: If you are using the pre-trained yolov4 then make sure that line 14 remains coco.names.
To implement YOLOv4 using TensorFlow, first we convert the .weights into the corresponding TensorFlow model files and then run the model.
# Convert darknet weights to tensorflow
## yolov4
python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416 --input_size 416 --model yolov4
# Run yolov4 tensorflow model
python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --images ./data/images/kite.jpg
# Run yolov4 on video
python detect_video.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --video ./data/video/video.mp4 --output ./detections/results.avi
# Run yolov4 on webcam
python detect_video.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --video 0 --output ./detections/results.avi
If you want to run yolov3 or yolov3-tiny change --model yolov3
and .weights file in above commands.
Note: You can also run the detector on multiple images at once by changing the --images flag like such --images "./data/images/kite.jpg, ./data/images/dog.jpg"
You can find the outputted image(s) showing the detections saved within the 'detections' folder.
Video saves wherever you point --output flag to. If you don't set the flag then your video will not be saved with detections on it.
The following commands will allow you to run yolov4-tiny model.
# yolov4-tiny
python save_model.py --weights ./data/yolov4-tiny.weights --output ./checkpoints/yolov4-tiny-416 --input_size 416 --model yolov4 --tiny
# Run yolov4-tiny tensorflow model
python detect.py --weights ./checkpoints/yolov4-tiny-416 --size 416 --model yolov4 --images ./data/images/kite.jpg --tiny
The following commands will allow you to run your custom yolov4 model. (video and webcam commands work as well)
# custom yolov4
python save_model.py --weights ./data/custom.weights --output ./checkpoints/custom-416 --input_size 416 --model yolov4
# Run custom yolov4 tensorflow model
python detect.py --weights ./checkpoints/custom-416 --size 416 --model yolov4 --images ./data/images/car.jpg
Here is how to use all the currently supported custom functions and flags that I have created.
I have created a custom function within the file core/functions.py that can be used to count and keep track of the number of objects detected at a given moment within each image or video. It can be used to count total objects found or can count number of objects detected per class.
To count total objects all that is needed is to add the custom flag "--count" to your detect.py or detect_video.py command.
# Run yolov4 model while counting total objects detected
python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --images ./data/images/dog.jpg --count
Running the above command will count the total number of objects detected and output it to your command prompt or shell as well as on the saved detection as so:
To count the number of objects for each individual class of your object detector you need to add the custom flag "--count" as well as change one line in the detect.py or detect_video.py script. By default the count_objects function has a parameter called by_class that is set to False. If you change this parameter to True it will count per class instead.
To count per class make detect.py or detect_video.py look like this:
Then run the same command as above:
# Run yolov4 model while counting objects per class
python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --images ./data/images/dog.jpg --count
Running the above command will count the number of objects detected per class and output it to your command prompt or shell as well as on the saved detection as so:
Note: You can add the --count flag to detect_video.py commands as well!
I have created a custom flag called INFO that can be added to any detect.py or detect_video.py commands in order to print detailed information about each detection made by the object detector. To print the detailed information to your command prompt just add the flag --info
to any of your commands. The information on each detection includes the class, confidence in the detection and the bounding box coordinates of the detection in xmin, ymin, xmax, ymax format.
If you want to edit what information gets printed you can edit the draw_bbox function found within the core/utils.py file. The line that prints the information looks as follows:
Example of info flag added to command:
python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --images ./data/images/dog.jpg --info
Resulting output within your shell or terminal:
Note: You can add the --info flag to detect_video.py commands as well!
I have created a custom function within the file core/functions.py that can be applied to any detect.py or detect_video.py commands in order to crop the YOLOv4 detections and save them each as their own new image. To crop detections all you need to do is add the --crop
flag to any command. The resulting cropped images will be saved within the detections/crop/ folder.
Example of crop flag added to command:
python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --images ./data/images/dog.jpg --crop
Here is an example of one of the resulting cropped detections from the above command.
I have created a custom function to feed Tesseract OCR the bounding box regions of license plates found by my custom YOLOv4 model in order to read and extract the license plate numbers. Thorough preprocessing is done on the license plate in order to correctly extract the license plate number from the image. The function that is in charge of doing the preprocessing and text extraction is called recognize_plate and can be found in the file core/utils.py.
Disclaimer: In order to run tesseract OCR you must first download the binary files and set them up on your local machine. Please do so before proceeding or commands will not run as expected!
Official Tesseract OCR Github Repo: tesseract-ocr/tessdoc
Great Article for How To Install Tesseract on Mac or Linux Machines: PyImageSearch Article
For Windows I recommend: Windows Install
Once you have Tesseract properly installed you can move onwards. If you don't have a trained YOLOv4 model to detect license plates feel free to use one that I have trained. It is not perfect but it works well. Download license plate detector model and learn how to save and run it with TensorFlow here
The license plate recognition works wonders on images. All you need to do is add the --plate
flag on top of the command to run the custom YOLOv4 model.
Try it out on this image in the repository!
# Run License Plate Recognition
python detect.py --weights ./checkpoints/custom-416 --size 416 --model yolov4 --images ./data/images/car2.jpg --plate
The output from the above command should print any license plate numbers found to your command terminal as well as output and save the following image to the detections
folder.
You should be able to see the license plate number printed on the screen above the bounding box found by YOLOv4.
This section will highlight the steps I took in order to implement the License Plate Recognition with YOLOv4 and potential areas to be worked on further.
This demo will be showing the step-by-step workflow on the following original image.
First step of the process is taking the bounding box coordinates from YOLOv4 and simply taking the subimage region within the bounds of the box. Since this image is super small the majority of the time we use cv2.resize() to blow the image up 3x its original size.
Then we convert the image to grayscale and apply a small Gaussian blur to smooth it out.
Following this, the image is thresholded to white text with black background and has Otsu's method also applied. This white text on black background helps to find contours of image.
The image is then dilated using opencv in order to make contours more visible and be picked up in future step.
Next we use opencv to find all the rectangular shaped contours on the image and sort them left to right.
As you can see this causes many contours to be found other than just the contours of each character within the license plate number. In order to filter out the unwanted regions we apply a couple parameters to be met in order to accept a contour. These parameters are just height and width ratios (i.e. the height of region must be at least 1/6th of the total height of the image). A couple other parameters on area of region etc are also placed. Check out code to see exact details. This filtering leaves us with.
The individual characters of the license plate number are now the only regions of interest left. We segment each subimage and apply a bitwise_not mask to flip the image to black text on white background which Tesseract is more accurate with. The final step is applying a small median blur on the image and then it is passed to Tesseract to get the letter or number from it. Example of how letters look like when going to tesseract.
Each letter or number is then just appended together into a string and at the end you get the full license plate that is recognized! BOOM!
Running the license plate recognition straight on video at the same time that YOLOv4 object detections causes a few issues. Tesseract OCR is fairly expensive in terms of time complexity and slows down the processing of the video to a snail's pace. It can still be accomplished by adding the --plate
command line flag to any detect_video.py commands.
However, I believe the best route to go is to run video detections without the plate flag and instead run them with --crop
flag which crops the objects found on screen and saves them as new images. See how it works here Once the video is done processing at a higher FPS all the license plate images will be cropped and saved within detections/crop folder. I have added an easy script within the repository called license_plate_recognizer.py that you can run in order to recognize license plates. Plus this allows you to easily customize the script to further enhance any recognitions. I will be working on linking this functionality automatically in future commits to the repository.
Running License Plate Recognition with detect_video.py is done with the following command.
python detect_video.py --weights ./checkpoints/custom-416 --size 416 --model yolov4 --video ./data/video/license_plate.mp4 --output ./detections/recognition.avi --plate
The recommended route I think is more efficient is using this command. Customize the rate at which detections are cropped within the code itself.
python detect_video.py --weights ./checkpoints/custom-416 --size 416 --model yolov4 --video ./data/video/license_plate.mp4 --output ./detections/recognition.avi --crop
Now play around with license_plate_recognizer.py and have some fun!
I have also implemented a generic use of Tesseract OCR with YOLOv4. By enabling the flag --ocr
with any detect.py image command you can search detections for text and extract what is found. Generic preprocessing is applied on the subimage that makes up the inside of the detection bounding box. However, so many lighting or color issues require advanced preprocessing so this function is by no means perfect. You will also need to install tesseract on your local machine prior to running this flag (see links and suggestions in above section)
Example command (note this image doesn't have text so will not output anything, just meant to show how command is structured):
python detect.py --weights ./checkpoints/yolov4-416 --size 416 --model yolov4 --images ./data/images/dog.jpg --ocr
Can also implement YOLOv4 using TensorFlow Lite. TensorFlow Lite is a much smaller model and perfect for mobile or edge devices (raspberry pi, etc).
# Save tf model for tflite converting
python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416 --input_size 416 --model yolov4 --framework tflite
# yolov4
python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416.tflite
# yolov4 quantize float16
python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-fp16.tflite --quantize_mode float16
# yolov4 quantize int8
python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-int8.tflite --quantize_mode int8 --dataset ./coco_dataset/coco/val207.txt
# Run tflite model
python detect.py --weights ./checkpoints/yolov4-416.tflite --size 416 --model yolov4 --images ./data/images/kite.jpg --framework tflite
You can find the outputted image(s) showing the detections saved within the 'detections' folder.
Yolov4 and Yolov4-tiny int8 quantization have some issues. I will try to fix that. You can try Yolov3 and Yolov3-tiny int8 quantization
Can also implement YOLOv4 using TensorFlow's TensorRT. TensorRT is a high-performance inference optimizer and runtime that can be used to perform inference in lower precision (FP16 and INT8) on GPUs. TensorRT can allow up to 8x higher performance than regular TensorFlow.
python save_model.py --weights ./data/yolov3.weights --output ./checkpoints/yolov3.tf --input_size 416 --model yolov3
python convert_trt.py --weights ./checkpoints/yolov3.tf --quantize_mode float16 --output ./checkpoints/yolov3-trt-fp16-416
# yolov3-tiny
python save_model.py --weights ./data/yolov3-tiny.weights --output ./checkpoints/yolov3-tiny.tf --input_size 416 --tiny
python convert_trt.py --weights ./checkpoints/yolov3-tiny.tf --quantize_mode float16 --output ./checkpoints/yolov3-tiny-trt-fp16-416
# yolov4
python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4.tf --input_size 416 --model yolov4
python convert_trt.py --weights ./checkpoints/yolov4.tf --quantize_mode float16 --output ./checkpoints/yolov4-trt-fp16-416
python detect.py --weights ./checkpoints/yolov4-trt-fp16-416 --model yolov4 --images ./data/images/kite.jpg --framework trt
save_model.py:
--weights: path to weights file
(default: './data/yolov4.weights')
--output: path to output
(default: './checkpoints/yolov4-416')
--[no]tiny: yolov4 or yolov4-tiny
(default: 'False')
--input_size: define input size of export model
(default: 416)
--framework: what framework to use (tf, trt, tflite)
(default: tf)
--model: yolov3 or yolov4
(default: yolov4)
detect.py:
--images: path to input images as a string with images separated by ","
(default: './data/images/kite.jpg')
--output: path to output folder
(default: './detections/')
--[no]tiny: yolov4 or yolov4-tiny
(default: 'False')
--weights: path to weights file
(default: './checkpoints/yolov4-416')
--framework: what framework to use (tf, trt, tflite)
(default: tf)
--model: yolov3 or yolov4
(default: yolov4)
--size: resize images to
(default: 416)
--iou: iou threshold
(default: 0.45)
--score: confidence threshold
(default: 0.25)
--count: count objects within images
(default: False)
--dont_show: dont show image output
(default: False)
--info: print info on detections
(default: False)
--crop: crop detections and save as new images
(default: False)
detect_video.py:
--video: path to input video (use 0 for webcam)
(default: './data/video/video.mp4')
--output: path to output video (remember to set right codec for given format. e.g. XVID for .avi)
(default: None)
--output_format: codec used in VideoWriter when saving video to file
(default: 'XVID)
--[no]tiny: yolov4 or yolov4-tiny
(default: 'false')
--weights: path to weights file
(default: './checkpoints/yolov4-416')
--framework: what framework to use (tf, trt, tflite)
(default: tf)
--model: yolov3 or yolov4
(default: yolov4)
--size: resize images to
(default: 416)
--iou: iou threshold
(default: 0.45)
--score: confidence threshold
(default: 0.25)
--count: count objects within video
(default: False)
--dont_show: dont show video output
(default: False)
--info: print info on detections
(default: False)
--crop: crop detections and save as new images
(default: False)
Huge shoutout goes to hunglc007 for creating the backbone of this repository: