This repo works with TensorFlow 2.3 and Keras 2.4. It builds on the very successful TrainYourOwnYOLO repo maintained by Anton Mu, which lets you train a custom image detector using the state-of-the-art YOLOv3 computer vision algorithm. For a short write-up, check out this Medium post. This repo brings everything TrainYourOwnYOLO does and, on top of that, allows you to detect objects in multiple streams, with multiple GPUs, and with multiple models, all at the same time, all in parallel, in multiple independent Python processes.
The number of streams depends on the amount of memory available on the GPU and in your computer. A YOLO process demands around one gigabyte of GPU memory; eleven streams can therefore, ideally, be squeezed into a GeForce GTX 1080 Ti with 11 GB (it will be a very tight fit). This is achieved with a modified YOLO object. A more in-depth description is here.
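To illustrate how several processes can share one card: TensorFlow 2.x lets each process cap its GPU memory footprint instead of grabbing the whole GPU. The sketch below shows the general technique only and is not code from this repo's yolo.py; the 1 GB limit is an assumption based on the figure above.

```python
import tensorflow as tf

# Minimal sketch: cap this process's GPU memory so several YOLO
# processes can share one card. The 1024 MB limit is an assumption,
# not a value taken from this repo.
gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)],
    )
```

Each of the independent Python processes would apply a cap like this before building its model.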
This repo comes with a very early version of MultiDetect.py, an application that makes use of the multi-stream, multi-GPU YOLO object. MultiDetect.py allows you to manage multiple streams and GPUs, to display the output on one or many monitors, and to automatically record video and attendant data files. A detailed description of MultiDetect.py is here.
Both the modified YOLO object and MultiDetect.py are written in pure Python 3.7. They integrate with TrainYourOwnYOLO and use the same models, workflows, and file and directory structures. This version targets the Linux platform only. I have not yet tested it on Windows, and I do not have access to a macOS machine, so please help. Let's work on it a bit, shake the bugs out, and then offer it as a merge.
You can create as many independent YOLO video streams as your GPU can stomach
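To make the "independent Python processes" idea concrete, here is a hedged sketch, not the repo's actual code, of one process per stream using the standard multiprocessing module; the stream URLs are hypothetical.

```python
import multiprocessing as mp

import cv2  # OpenCV; assumed available since the repo already does video I/O


def run_stream(stream_url):
    # Hypothetical worker: one process per video stream, each with its
    # own YOLO instance (model loading omitted for brevity).
    cap = cv2.VideoCapture(stream_url)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # ... run detection on `frame` with this process's YOLO object ...
    cap.release()


if __name__ == "__main__":
    urls = ["rtsp://cam1/stream", "rtsp://cam2/stream"]  # hypothetical URLs
    procs = [mp.Process(target=run_stream, args=(url,)) for url in urls]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

MultiDetect.py manages all of this for you; the sketch only shows the underlying pattern.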
Changes made to TrainYourOwnYOLO
- Modified yolo.py to allow for multiple YOLO instances on one, or multiple, GPUs (see the GPU-pinning sketch after this list). More here.
- Modified Train_YOLO.py and Train_utils.py to allow for a changed repo name, like the one used here
- MultiDetect.py w/ sundry support files to demo the new capabilities. More here.
- Facility to download ready-made cat model to demo new capabilities without a custom model
- Added sundry documentation files
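A common way to spread independent detector processes across multiple GPUs, shown here as a sketch of the general technique rather than this repo's actual mechanism, is to set CUDA_VISIBLE_DEVICES in each process before TensorFlow is imported:

```python
import os

# Select the GPU before TensorFlow is imported in this process; the
# value is the physical device index as reported by nvidia-smi.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # hypothetical: use the second GPU

import tensorflow as tf

print(tf.config.list_physical_devices("GPU"))  # TF now sees exactly one GPU
```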
To build and test your YOLO object detection algorithm, follow the steps below:
- Image Annotation
- Install Microsoft's Visual Object Tagging Tool (VoTT)
- Annotate images
- Training
- Download pre-trained weights
- Train your custom YOLO model on annotated images
- Inference
- Detect objects in new images and videos
- Detect objects in parallel in multiple streams and on multiple GPUs
- 1_Image_Annotation: Scripts and instructions on annotating images
- 2_Training: Scripts and instructions on training your YOLOv3 model
- 3_Inference: Scripts and instructions on testing your trained YOLO model on new images and videos
- Data: Input data, output data, model weights and results
- Utils: Utility scripts used by main scripts
This repo has been tested with Python 3.7 and Python 3.8. To install Python 3.7, go to python.org/downloads and follow the installation instructions. The modified YOLO object should work with Python 3.6, but MultiDetect.py uses features available only in Python 3.7 and up.
This repo is focused on multiple video streams running on one or more GPUs, and hence it is CUDA-centric. As the repo is focused on multi-stream inference, no changes were made to the training part of TrainYourOwnYOLO; the modifications of the YOLO object do not address multi-GPU training. To install CUDA on your own machine, follow the instructions at tensorflow.org/install/gpu. Make sure to install the correct versions of CUDA and cuDNN. There is also a small CUDA crash course. Note: This repo has not been tested on anything other than bare-metal GPUs.
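Once CUDA and cuDNN are in place, a quick check that TensorFlow actually sees your GPU(s):

```python
import tensorflow as tf

# An empty GPU list means the CUDA/cuDNN installation is not being picked up.
print("Built with CUDA:", tf.test.is_built_with_cuda())
print("GPUs found:", tf.config.list_physical_devices("GPU"))
```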
MultiDetect.py offers you audible prompts. For that, it uses the pydub library, and pydub can't function without working audio. If no audio is found, pydub will complain with:
RuntimeWarning: Couldn't find ffplay or avplay - defaulting to ffplay, but may not work
You can safely ignore the warning, or you can install ffmpeg:
sudo apt install ffmpeg
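For reference, this is roughly how pydub is used to play a prompt; the file name is a hypothetical placeholder, not a file shipped with this repo. With ffmpeg installed, the warning goes away.

```python
from pydub import AudioSegment
from pydub.playback import play

# Load and play an audio prompt. "prompt.wav" is a hypothetical file name.
sound = AudioSegment.from_file("prompt.wav")
play(sound)
```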
MultiDetect.py uses fonts available in most modern Ubuntu installations. If a font is not found, please install it.
You have two choices.
You can graft multistreamYOLO onto an existing TrainYourOwnYOLO installation like so:
- Rename .../TrainYourOwnYOLO/2_Training/src/keras_yolo3/yolo.py to yolo.py.ori, and replace the file with the new version from this repo. This is the modified YOLO object that does all the work. It should be a drop-in, bolt-on replacement, compatible with the current TrainYourOwnYOLO version.
- Add the complete content of this repo's 3_Inference folder, including the MDResource folder, to .../TrainYourOwnYOLO/3_Inference. This brings in MultiDetect.py and a few attendant files. MultiDetect.conf is the config file of MultiDetect.py, and it's where the magic happens. See the in-depth explanation here. There are a few conf file versions for multiple scenarios for you to play with. Edit one to your use case and liking, and rename it to MultiDetect.conf.
- Replace your current requirements.txt with the new requirements.txt in this repo
- Enter your virtualenv if you use one
- Run pip install -r requirements.txt, and you are good to go.
Or you can clone this complete repo: follow the steps below, read MultiYOLO.md (the docs for the modified YOLO object) and MultiDetect.md, and you are an expert.
Clone this repo with:
git clone https://github.com/bertelschmitt/multistreamYOLO/
cd multistreamYOLO/
Note: This repo so far has been developed and tested on Ubuntu (20.04, and 18.04) only. It has not been tested on Windows or Mac at all.
Important: This requires Python 3.7 (or 3.8)! To avoid collisions with old Python versions that can lead to strange error messages, it is strongly recommended to create a virtual environment in which Python is version 3.7 or 3.8. No other Python versions should exist in this virtual environment.
Create Virtual (Linux) Environment (replace 3.7 with 3.8 as needed):
pip3.7 install virtualenv
python3.7 -m venv name-of-env
#Activate the env:
source name-of-env/bin/activate
The name of your virtualenv should appear in front of your prompt on the console, like
(name-of-env) your-user-name:~$
Now make sure that your virtualenv python is of version 3.7 (or 3.8):
python -V
If you don't see "Python 3.7.x" (or "Python 3.8.x" as needed), don't proceed; go back and fix it.
For good measure, also type:
pip -V
You should see (python3.7) at the end of the result. If not, please check your work.
Make sure that, from now on, you run all commands from within your virtual environment. To automatically activate the virtual environment, put this at the end of your ~/.bashrc file (it is in your home directory):
source name-of-env/bin/activate
Start a new terminal window. You should see the name of your virtualenv in front of your prompt. If not, please fix it.
To avoid any collisions, your virtualenv should be clean, without any installed packages. To make sure that it is, type:
pip freeze
You should see nothing as a result. Now go to the base of this repo .../multistreamYOLO/, and install all required packages (from within your virtual environment) via:
pip install -r requirements.txt
If this fails, you may have to upgrade your pip version first with pip install pip --upgrade.
To test the cat face detector on test images located in multistreamYOLO/Data/Source_Images/Test_Images, run the Minimal_Example.py script in the root folder with:

python Minimal_Example.py

The outputs are saved in multistreamYOLO/Data/Source_Images/Test_Image_Detection_Results. This includes:
- cat pictures with bounding boxes around faces and confidence scores, and
- a Detection_Results.csv file with file names and locations of bounding boxes.
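To inspect the results programmatically, the CSV can be loaded with pandas; this sketch makes no assumptions about the column names, it just prints whatever the file contains.

```python
import pandas as pd

# Load the detection results produced by Minimal_Example.py
# (run this from the repo root).
results = pd.read_csv(
    "Data/Source_Images/Test_Image_Detection_Results/Detection_Results.csv"
)
print(results.head())                     # first few detections
print(len(results), "detections in total")
```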
If you want to detect cat faces in your own pictures, replace the cat images in Data/Source_Images/Test_Images with your own images.
To train your own custom YOLO object detector please follow the instructions detailed in the three numbered subfolders of this repo:
Once your model(s) run, venture forth to multiple streams, possibly even on multiple GPUs.
To make everything run smoothly it is highly recommended to keep the original folder structure of this repo!
Each *.py script has various command line options that help tweak performance and change things such as input and output directories. All scripts are initialized with good default values that accomplish all tasks as long as the original folder structure is preserved. To learn more about the available command line options of a python script <script_name.py>, run:

python <script_name.py> -h
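For orientation, the scripts follow the usual argparse pattern of defaults that match the repo layout. The sketch below is generic; the flag names are made up, so run the -h command above for the real options.

```python
import argparse

# Generic sketch of how a script wires its defaults to the repo layout.
# The flag names below are illustrative, not this repo's actual options.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--input_path",
    default="Data/Source_Images/Test_Images",
    help="Folder with input images; the default assumes the original layout.",
)
parser.add_argument(
    "--output_path",
    default="Data/Source_Images/Test_Image_Detection_Results",
    help="Folder for annotated images and result files.",
)
args = parser.parse_args()
```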
Unless explicitly stated otherwise at the top of a file, all code is licensed under the MIT license. This repo makes use of ilmonteux/logohunter which itself is inspired by qqwweee/keras-yolo3.
- If you encounter any error, please make sure you follow the instructions exactly (word for word). Once you are familiar with the code, you're welcome to modify it as needed, but in order to minimize errors, I encourage you not to deviate from the instructions above. If you would like to file an issue, please use the provided template and make sure to fill out all fields.
- If you encounter a FileNotFoundError, Module not found, or similar error, make sure that you did not change the folder structure. Your directory structure must look exactly like this (a small verification sketch follows this list):

  multistreamYOLO
  └─── 1_Image_Annotation
  └─── 2_Training
  └─── 3_Inference
  └─── Data
  └─── Utils
- If you use a different name, such as multistreamYOLO-master, you may have to specify the correct paths as command line arguments in every function call. Don't use spaces in file or folder names, i.e. instead of my folder use my_folder.
- If you are a Linux user and are having trouble installing *.snap package files, try:

  snap install --dangerous vott-2.1.0-linux.snap
See Snap Tutorial for more information.
- MultiDetect.py may crash, hang, and possibly freeze the computer. In most cases, this is caused by wrong settings. This is an experimental project, and you need to experiment to find the settings that are right for your use case. Start with only 2 simultaneous streams and work your way up. Use one of the skeleton *.conf files as a base.
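To quickly rule out folder-structure trouble, a small check like this (not part of the repo) can be run from the repo root:

```python
from pathlib import Path

# Verify the expected top-level folders exist; run this from the repo root.
expected = ["1_Image_Annotation", "2_Training", "3_Inference", "Data", "Utils"]
missing = [d for d in expected if not Path(d).is_dir()]
print("Missing folders:", missing or "none")
```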
If you would like to file an issue, please use the provided issue template and make sure to complete all fields. This makes it easier to reproduce the issue for someone trying to help you.
Issues without a completed issue template will be closed after 7 days.
- YOLO on all cylinders: The YOLO object, tuned for multiple processes
- CUDA crash course: Some CUDA installation pointers
- How much, how little memory: The best memory settings
- Running inference: Putting it to use
- ⭐ star this repo to get notifications on future improvements and
- 🍴 fork this repo if you'd like to use it as part of your own project.
This work is licensed under a Creative Commons Attribution 4.0 International License. This means that you are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Under the following terms:
- Attribution
Cite as:
@misc{TrainYourOwnYOLO_XXL,
  title={TrainYourOwnYOLO XXL: Build a Custom Object Detector from Scratch, and run many in parallel},
  author={Anton Muehlemann and Anton Mu and Bertel Schmitt},
  year={2019, 2020},
  url={https://github.com/bertelschmitt/multistreamYOLO}
}
If your work doesn't include a citation list, simply link this github repo!