Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. Darkflow is a TensorFlow implementation of Darknet, which allows you to write your code in Python. You only look once (YOLO) is a state-of-the-art, real-time object detection system. On a Pascal Titan X it processes images at 30 FPS and has a mAP of 57.9% on COCO test-dev.
Paper: version 1, version 2. Read more about YOLO (in darknet) and download weight files here.
See demo below or see on this imgur
Python3, tensorflow 1.0, numpy, opencv 3.
Change into the directory where you downloaded Darkflow.
There are 3 methods of installing, and you only need to do one. I've found that pip install -e . worked best.
- Let pip install darkflow globally in dev mode (still globally accessible, but changes to the code immediately take effect):
    pip install -e .
- Install with pip globally:
    pip install .
- Build the Cython extensions in place. NOTE: If installing this way, you will have to use ./flow in the cloned darkflow directory instead of flow, as darkflow is not installed globally:
    python3 setup.py build_ext --inplace
A weight is the strength of the connection between nodes in a neural network: it determines how much influence a change in the input has on the output. Weights near zero mean that changing the input will barely change the output. Weights and biases are the learnable parameters of your model. Before training starts, these parameters are initialised randomly (this stops them all converging to a single value). Then, when presented with data during training, they are adjusted towards values that produce the correct output. These learned values are what the different weight files contain.
These files can grow to 100 MB+ each, so they are not included in the repository. In case a weight file cannot be found on the Darknet site, the author of Darkflow uploaded some of his here, which include yolo-full and yolo-tiny of v1.0, tiny-yolo-v1.1 of v1.1 and yolo, tiny-yolo-voc of v2.
You will need to place all of the weights in the bin/ folder. In the end, your structure should look like this:
|- darkflow-master/
|--- bin/
|------ yolo.weights
|------ tiny-yolo.weights
|------ yolo-tiny.weights
|------ tiny-yolo-v1.1.weights
|------ tiny-yolo-voc.weights
|------ yolo3.weights
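To confirm the files ended up where expected, something like the following could be run from the darkflow directory. This is only a sketch and not part of darkflow: `missing_weights` is a hypothetical helper, and the file names are simply copied from the tree above.

```python
from pathlib import Path

# File names copied from the directory listing above.
EXPECTED_WEIGHTS = [
    "yolo.weights",
    "tiny-yolo.weights",
    "yolo-tiny.weights",
    "tiny-yolo-v1.1.weights",
    "tiny-yolo-voc.weights",
    "yolo3.weights",
]

def missing_weights(bin_dir, expected=EXPECTED_WEIGHTS):
    """Return the expected weight files that are not present in bin_dir."""
    bin_path = Path(bin_dir)
    return [name for name in expected if not (bin_path / name).is_file()]

# Report anything that still needs to be downloaded into bin/.
for name in missing_weights("bin"):
    print("missing:", name)
```

You only need the weight files for the models you actually plan to run, so a non-empty list is not necessarily a problem.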
Skip this if you are not training or fine-tuning anything (you simply want to forward flow a trained net)
For example, if you want to work with only 3 classes (tvmonitor, person, pottedplant), edit labels.txt as follows:
tvmonitor
person
pottedplant
And that's it. darkflow will take care of the rest. You can also set darkflow to load from a custom labels file with the --labels flag (e.g. --labels myOtherLabelsFile.txt). This can be helpful when working with multiple models with different sets of output labels. When this flag is not set, darkflow will load from labels.txt by default (unless you are using one of the recognized .cfg files designed for the COCO or VOC dataset, in which case the labels file will be ignored and the COCO or VOC labels will be loaded).
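If you keep several label files around, a small script can generate and sanity-check them. The sketch below only mirrors the one-label-per-line layout shown above; `write_labels` and `read_labels` are hypothetical helpers, not darkflow functions, and it writes to a temporary directory so that it does not touch your real labels.txt.

```python
import os
import tempfile
from pathlib import Path

def write_labels(path, labels):
    """Write one label per line, as in the labels.txt example above."""
    Path(path).write_text("\n".join(labels) + "\n")

def read_labels(path):
    """Read labels back, ignoring blank lines and surrounding whitespace."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]

with tempfile.TemporaryDirectory() as tmp:
    labels_path = os.path.join(tmp, "myOtherLabelsFile.txt")
    write_labels(labels_path, ["tvmonitor", "person", "pottedplant"])
    loaded = read_labels(labels_path)

print(len(loaded), loaded)
```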
OPTIONAL: Skip this if you are working with one of the original configurations since they are already there. Otherwise, see the following example:
...
[convolutional]
batch_normalize = 1
size = 3
stride = 1
pad = 1
activation = leaky
[maxpool]
[connected]
output = 4096
activation = linear
...
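To get a feel for how such a file is structured, here is a minimal sketch that splits cfg text into an ordered list of layers. It is not darkflow's own parser; `parse_cfg` is a hypothetical helper, and treating lines that start with `#` as comments is an assumption.

```python
def parse_cfg(text):
    """Parse Darknet-style cfg text into an ordered list of (section, options)."""
    layers = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comment lines (assumed to start with '#'), and elisions.
        if not line or line.startswith("#") or line == "...":
            continue
        if line.startswith("[") and line.endswith("]"):
            layers.append((line[1:-1], {}))      # start a new layer
        elif "=" in line and layers:
            key, value = line.split("=", 1)      # option of the current layer
            layers[-1][1][key.strip()] = value.strip()
    return layers

example = """
[convolutional]
batch_normalize = 1
size = 3
stride = 1
pad = 1
activation = leaky

[maxpool]

[connected]
output = 4096
activation = linear
"""

for name, options in parse_cfg(example):
    print(name, options)
```

A list of pairs is used rather than a dictionary because section names such as [convolutional] repeat, and the order of layers matters.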
# Have a look at its options
flow --h
First, let's take a closer look at one very useful option, --load:
# 1. Load .weights
flow --model cfg/v1/yolo-tiny.cfg --load bin/yolo-tiny.weights --savepb --verbalise
NOTE: If you see the error AssertionError: expect 64701556 bytes, found 180357512, that means your .cfg and .weights files do not match up. Notice that we are using the v1/tiny-yolo.cfg file here, and NOT the tiny-yolo.cfg file in the /cfg folder. See Mikeknapp's answer to this issue.
If all went well, you should see something similar to:
davevoyles@dv-dlvm-ubuntu:/tmp/mozilla_davevoyles0/darkflow-master$ flow --model cfg/v1/tiny-yolo.cfg --load bin/tiny-yolo.weights --savepb --verbalise
/anaconda/envs/py35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
/tmp/mozilla_davevoyles0/darkflow-master/darkflow/dark/darknet.py:54: UserWarning: ./cfg/tiny-yolo.cfg not found, use cfg/v1/tiny-yolo.cfg instead
cfg_path, FLAGS.model))
Parsing cfg/v1/tiny-yolo.cfg
Loading bin/tiny-yolo.weights ...
Successfully identified 180357512 bytes
Finished in 0.00400090217590332s
Model has a VOC model name, loading VOC labels.
Building net ...
Source | Train? | Layer description | Output size
-------+--------+----------------------------------+---------------
| | input | (?, 448, 448, 3)
Load | Yep! | scale to (-1, 1) | (?, 448, 448, 3)
Load | Yep! | conv 3x3p1_1 leaky | (?, 448, 448, 16)
Load | Yep! | maxp 2x2p0_2 | (?, 224, 224, 16)
Load | Yep! | conv 3x3p1_1 leaky | (?, 224, 224, 32)
Load | Yep! | maxp 2x2p0_2 | (?, 112, 112, 32)
Load | Yep! | conv 3x3p1_1 leaky | (?, 112, 112, 64)
Load | Yep! | maxp 2x2p0_2 | (?, 56, 56, 64)
Load | Yep! | conv 3x3p1_1 leaky | (?, 56, 56, 128)
Load | Yep! | maxp 2x2p0_2 | (?, 28, 28, 128)
Load | Yep! | conv 3x3p1_1 leaky | (?, 28, 28, 256)
Load | Yep! | maxp 2x2p0_2 | (?, 14, 14, 256)
Load | Yep! | conv 3x3p1_1 leaky | (?, 14, 14, 512)
Load | Yep! | maxp 2x2p0_2 | (?, 7, 7, 512)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | flat | (?, 50176)
Load | Yep! | full 50176 x 256 linear | (?, 256)
Load | Yep! | full 256 x 4096 leaky | (?, 4096)
Load | Yep! | drop | (?, 4096)
Load | Yep! | full 4096 x 1470 linear | (?, 1470)
-------+--------+----------------------------------+---------------
Running entirely on CPU
2018-06-06 22:21:03.991220: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-06 22:21:09.417456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 9340:00:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-06-06 22:21:09.417774: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
Finished in 36.41262149810791s
Rebuild a constant version ...
2018-06-06 22:21:37.019380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-06-06 22:21:37.224531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-06-06 22:21:37.224847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10765 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 9340:00:00.0, compute capability: 3.7)
Done
Let's try running a new model, utilizing one of the .cfg files that came with Darkflow. You can replace tiny-yolo with any of the other config files found in the cfg folder or its subfolders.
# 2. To initialize a model, leave out the --load option
# NOTE: The name is tiny-yolo.cfg now, and NOT tiny.yolo
flow --model cfg/tiny-yolo.cfg
This should return:
Building net ...
Source | Train? | Layer description | Output size
-------+--------+----------------------------------+---------------
| | input | (?, 416, 416, 3)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 416, 416, 16)
Load | Yep! | maxp 2x2p0_2 | (?, 208, 208, 16)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 208, 208, 32)
Load | Yep! | maxp 2x2p0_2 | (?, 104, 104, 32)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 104, 104, 64)
Load | Yep! | maxp 2x2p0_2 | (?, 52, 52, 64)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 52, 52, 128)
Load | Yep! | maxp 2x2p0_2 | (?, 26, 26, 128)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 26, 26, 256)
Load | Yep! | maxp 2x2p0_2 | (?, 13, 13, 256)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 13, 13, 512)
Load | Yep! | maxp 2x2p0_1 | (?, 13, 13, 512)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 13, 13, 1024)
Init | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 13, 13, 1024)
Init | Yep! | conv 1x1p0_1 linear | (?, 13, 13, 425)
-------+--------+----------------------------------+---------------
# 3. It is useful to reuse the first identical layers of tiny for `yolo-new`
# this will print out which layers are reused, which are initialized
flow --model cfg/v1/yolo-tiny.cfg --load bin/yolo-tiny.weights
Which should return:
Building net ...
Source | Train? | Layer description | Output size
-------+--------+----------------------------------+---------------
| | input | (?, 448, 448, 3)
Load | Yep! | scale to (-1, 1) | (?, 448, 448, 3)
Load | Yep! | conv 3x3p1_1 leaky | (?, 448, 448, 16)
Load | Yep! | maxp 2x2p0_2 | (?, 224, 224, 16)
Load | Yep! | conv 3x3p1_1 leaky | (?, 224, 224, 32)
Load | Yep! | maxp 2x2p0_2 | (?, 112, 112, 32)
Load | Yep! | conv 3x3p1_1 leaky | (?, 112, 112, 64)
Load | Yep! | maxp 2x2p0_2 | (?, 56, 56, 64)
Load | Yep! | conv 3x3p1_1 leaky | (?, 56, 56, 128)
Load | Yep! | maxp 2x2p0_2 | (?, 28, 28, 128)
Load | Yep! | conv 3x3p1_1 leaky | (?, 28, 28, 256)
Load | Yep! | maxp 2x2p0_2 | (?, 14, 14, 256)
Load | Yep! | conv 3x3p1_1 leaky | (?, 14, 14, 512)
Load | Yep! | maxp 2x2p0_2 | (?, 7, 7, 512)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | flat | (?, 50176)
Load | Yep! | full 50176 x 256 linear | (?, 256)
Load | Yep! | full 256 x 4096 leaky | (?, 4096)
Load | Yep! | drop | (?, 4096)
Load | Yep! | full 4096 x 1470 linear | (?, 1470)
-------+--------+----------------------------------+---------------
All input images from the default folder sample_img/ are flowed through the net and predictions are put in sample_img/out/. We can always specify more parameters for such forward passes, such as detection threshold, batch size, images folder, etc.
We can take advantage of the GPU by adding the --gpu .50 flag as well, which will use 50% of the GPU's memory. I've found that setting it to 1.0 and using 100% of the memory can often cause issues with CUDA and throw an out-of-memory error, such as:
E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 11.17G (11996954624 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
You can also gather information about your GPU by entering this command:
nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.30 Driver Version: 390.30 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 0000BD3F:00:00.0 Off | 0 |
| N/A 40C P0 81W / 149W | 0MiB / 11441MiB | 96% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
# Forward all images in sample_img/ using tiny yolo and 50% of the GPU's memory
flow --model cfg/v1/yolo-tiny.cfg --load bin/yolo-tiny.weights --imgdir sample_img/ --gpu .50
The output will look similar to this:
Parsing cfg/v1/yolo-tiny.cfg
Loading bin/yolo-tiny.weights ...
Successfully identified 180357512 bytes
Finished in 0.004160642623901367s
Model has a VOC model name, loading VOC labels.
Building net ...
Source | Train? | Layer description | Output size
-------+--------+----------------------------------+---------------
| | input | (?, 448, 448, 3)
Load | Yep! | scale to (-1, 1) | (?, 448, 448, 3)
Load | Yep! | conv 3x3p1_1 leaky | (?, 448, 448, 16)
Load | Yep! | maxp 2x2p0_2 | (?, 224, 224, 16)
Load | Yep! | conv 3x3p1_1 leaky | (?, 224, 224, 32)
Load | Yep! | maxp 2x2p0_2 | (?, 112, 112, 32)
Load | Yep! | conv 3x3p1_1 leaky | (?, 112, 112, 64)
Load | Yep! | maxp 2x2p0_2 | (?, 56, 56, 64)
Load | Yep! | conv 3x3p1_1 leaky | (?, 56, 56, 128)
Load | Yep! | maxp 2x2p0_2 | (?, 28, 28, 128)
Load | Yep! | conv 3x3p1_1 leaky | (?, 28, 28, 256)
Load | Yep! | maxp 2x2p0_2 | (?, 14, 14, 256)
Load | Yep! | conv 3x3p1_1 leaky | (?, 14, 14, 512)
Load | Yep! | maxp 2x2p0_2 | (?, 7, 7, 512)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | conv 3x3p1_1 leaky | (?, 7, 7, 1024)
Load | Yep! | flat | (?, 50176)
Load | Yep! | full 50176 x 256 linear | (?, 256)
Load | Yep! | full 256 x 4096 leaky | (?, 4096)
Load | Yep! | drop | (?, 4096)
Load | Yep! | full 4096 x 1470 linear | (?, 1470)
-------+--------+----------------------------------+---------------
GPU mode with 0.5 usage
2018-06-08 16:03:51.244292: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-08 16:03:57.877822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: bd3f:00:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-06-08 16:03:57.878133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-06-08 16:03:58.172470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5720 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: bd3f:00:00.0, compute capability: 3.7)
Finished in 11.172759771347046s
Forwarding 8 inputs ...
Total time = 1.121574878692627s / 8 inps = 7.13282737691599 ips
Post processing 8 inputs ...
Total time = 0.1555933952331543s / 8 inps = 51.41606421025856 ips
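The ips (inputs per second) figures at the end of the log are just the number of inputs divided by the elapsed time. As a quick check of that arithmetic, using the numbers from the log above (`inputs_per_second` is a hypothetical helper name):

```python
def inputs_per_second(num_inputs, total_seconds):
    """Throughput as reported in the log: inputs divided by elapsed seconds."""
    return num_inputs / total_seconds

forward_ips = inputs_per_second(8, 1.121574878692627)   # forwarding step
post_ips = inputs_per_second(8, 0.1555933952331543)     # post-processing step

print(round(forward_ips, 2), round(post_ips, 2))  # 7.13 51.42
```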
Check the sample_img/out folder, which should now contain a series of images along with a .json file for each. The images show the bounding boxes overlaid on each picture, and the .json files store the box coordinates.
JSON output can be generated with the label, confidence, and pixel coordinates of each bounding box. Each prediction is stored in the sample_img/out folder by default. An example JSON array is shown below.
# Forward all images in sample_img/ using tiny yolo and JSON output.
flow --imgdir sample_img/ --model cfg/tiny-yolo.cfg --load bin/tiny-yolo.weights --json
JSON output:
[{"label":"person", "confidence": 0.56, "topleft": {"x": 184, "y": 101}, "bottomright": {"x": 274, "y": 382}},
{"label": "dog", "confidence": 0.32, "topleft": {"x": 71, "y": 263}, "bottomright": {"x": 193, "y": 353}},
{"label": "horse", "confidence": 0.76, "topleft": {"x": 412, "y": 109}, "bottomright": {"x": 592,"y": 337}}]
- label: self explanatory
- confidence: somewhere between 0 and 1 (how confident yolo is about that detection)
- topleft: pixel coordinate of top left corner of box.
- bottomright: pixel coordinate of bottom right corner of box.
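As an illustration of how these fields can be consumed, the sketch below parses the example array above with Python's json module and derives each box's width and height. `box_size` is a hypothetical helper, not something darkflow provides.

```python
import json

# The example JSON array from above, embedded as a string.
sample = '''[{"label":"person", "confidence": 0.56, "topleft": {"x": 184, "y": 101}, "bottomright": {"x": 274, "y": 382}},
{"label": "dog", "confidence": 0.32, "topleft": {"x": 71, "y": 263}, "bottomright": {"x": 193, "y": 353}},
{"label": "horse", "confidence": 0.76, "topleft": {"x": 412, "y": 109}, "bottomright": {"x": 592,"y": 337}}]'''

def box_size(prediction):
    """Return (width, height) in pixels of a prediction's bounding box."""
    width = prediction["bottomright"]["x"] - prediction["topleft"]["x"]
    height = prediction["bottomright"]["y"] - prediction["topleft"]["y"]
    return width, height

predictions = json.loads(sample)
for p in predictions:
    print(p["label"], p["confidence"], box_size(p))
```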
Training is simple: you only have to add the --train option. The training set and annotations will be parsed if this is the first time a new configuration is trained. To point to the training set and annotations, use the --dataset and --annotation options. A few examples:
# Initialize yolo-new from tiny-yolo, then train the net on 100% GPU:
flow --model cfg/yolo-new.cfg --load bin/tiny-yolo.weights --train --gpu 1.0
# Completely initialize yolo-new and train it with ADAM optimizer
flow --model cfg/yolo-new.cfg --train --trainer adam
During training, the script will occasionally save intermediate results into TensorFlow checkpoints, stored in ckpt/. To resume from any checkpoint before performing training/testing, use the --load [checkpoint_num] option. If checkpoint_num < 0, darkflow will load the most recent save by parsing ckpt/checkpoint.
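To see which step a negative checkpoint_num would resolve to, you can read the checkpoint index file yourself. The sketch below assumes the usual TensorFlow layout, in which the file contains a line of the form model_checkpoint_path: "name-step"; `latest_step` is a hypothetical helper, the example text is made up, and this is not necessarily how darkflow parses the file.

```python
import re

def latest_step(checkpoint_text):
    """Return the step number from the model_checkpoint_path line, or None."""
    match = re.search(r'^model_checkpoint_path:\s*"(.*)-(\d+)"\s*$',
                      checkpoint_text, flags=re.MULTILINE)
    return int(match.group(2)) if match else None

# Made-up contents in the assumed format of ckpt/checkpoint.
example = '''model_checkpoint_path: "yolo-new-1500"
all_model_checkpoint_paths: "yolo-new-1250"
all_model_checkpoint_paths: "yolo-new-1500"
'''

print(latest_step(example))  # 1500
```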
# Resume the most recent checkpoint for training
flow --train --model cfg/yolo-new.cfg --load -1
# Test with checkpoint at step 1500
flow --model cfg/yolo-new.cfg --load 1500
# Fine tuning tiny-yolo from the original one
flow --train --model cfg/tiny-yolo.cfg --load bin/tiny-yolo.weights
Example of training on Pascal VOC 2007:
# Download the Pascal VOC dataset:
curl -O https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
# An example of the Pascal VOC annotation format:
vim VOCdevkit/VOC2007/Annotations/000001.xml
# Train the net on the Pascal dataset:
flow --model cfg/yolo-new.cfg --train --dataset "~/VOCdevkit/VOC2007/JPEGImages" --annotation "~/VOCdevkit/VOC2007/Annotations"
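If you would rather inspect annotations programmatically than open them in an editor, the standard library's ElementTree is enough. The XML below is a hand-written fragment in the usual Pascal VOC layout with made-up values, not a copy of 000001.xml, and `read_objects` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the usual Pascal VOC layout (values are made up).
example_xml = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""

def read_objects(xml_text):
    """Return a list of (label, (xmin, ymin, xmax, ymax)) for each object."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), coords))
    return objects

for label, coords in read_objects(example_xml):
    print(label, coords)
```

Each object's name must match an entry in your labels file for it to be useful in training.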
The steps below assume we want to use tiny YOLO and our dataset has 3 classes
- Create a copy of the configuration file tiny-yolo-voc.cfg and rename it according to your preference, for example tiny-yolo-voc-3c.cfg. (It is crucial that you leave the original tiny-yolo-voc.cfg file unchanged; see below for an explanation.)
- In tiny-yolo-voc-3c.cfg, change classes in the [region] layer (the last layer) to the number of classes you are going to train for. In our case, classes is set to 3.
    ...
    [region]
    anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
    bias_match=1
    classes=3
    coords=4
    num=5
    softmax=1
    ...
- In tiny-yolo-voc-3c.cfg, change filters in the [convolutional] layer (the second to last layer) to num * (classes + 5). In our case, num is 5 and classes is 3, so 5 * (3 + 5) = 40 and filters is set to 40.
    ...
    [convolutional]
    size=1
    stride=1
    pad=1
    filters=40
    activation=linear

    [region]
    anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
    ...
- Change labels.txt to include the label(s) you want to train on (the number of labels should be the same as the number of classes you set in the tiny-yolo-voc-3c.cfg file). In our case, labels.txt will contain 3 labels.
    label1
    label2
    label3
- Reference the tiny-yolo-voc-3c.cfg model when you train.
    flow --model cfg/tiny-yolo-voc-3c.cfg --load bin/tiny-yolo-voc.weights --train --annotation train/Annotations --dataset train/Images
- Why should I leave the original tiny-yolo-voc.cfg file unchanged?
  When darkflow sees you are loading tiny-yolo-voc.weights, it will look for tiny-yolo-voc.cfg in your cfg/ folder and compare that configuration file to the new one you have set with --model cfg/tiny-yolo-voc-3c.cfg. In this case, every layer will have the same exact number of weights except for the last two, so it will load the weights into all layers up to the last two because they now contain a different number of weights.
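The filters rule from the steps above is easy to get wrong by hand, so here it is as a one-line function. This is just the formula num * (classes + 5) written out, with the 5 split into coords + 1 on the assumption that it stands for the four box coordinates plus one confidence score; `region_filters` is a hypothetical name.

```python
def region_filters(num, classes, coords=4):
    """Filters for the conv layer before [region]: num * (classes + coords + 1)."""
    return num * (classes + coords + 1)

print(region_filters(num=5, classes=3))   # 40, as in the example above
print(region_filters(num=5, classes=80))  # 425
```

The second call, with 80 classes, gives 425, which matches the depth of the final layer in the tiny-yolo.cfg table printed earlier.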
For a demo that entirely runs on the CPU:
flow --model cfg/yolo-new.cfg --load bin/yolo-new.weights --demo videofile.avi
For a demo that runs 100% on the GPU:
flow --model cfg/yolo-new.cfg --load bin/yolo-new.weights --demo videofile.avi --gpu 1.0
To use your webcam/camera, simply replace videofile.avi with the keyword camera.
To save a video with predicted bounding boxes, add the --saveVideo option.
Please note that return_predict(img) must take a numpy.ndarray. Your image must be loaded beforehand and passed to return_predict(img). Passing the file path won't work.
The result from return_predict(img) will be a list of dictionaries representing each detected object's values in the same format as the JSON output listed above.
from darkflow.net.build import TFNet
import cv2
options = {"model": "cfg/yolo.cfg", "load": "bin/yolo.weights", "threshold": 0.1}
tfnet = TFNet(options)
imgcv = cv2.imread("./sample_img/sample_dog.jpg")
result = tfnet.return_predict(imgcv)
print(result)
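Because the result is a plain list of dictionaries, ordinary Python is enough to post-process it. The sketch below keeps only the more confident detections and sorts them; `confident` is a hypothetical helper, and the `result` list here is a stand-in copied from the JSON example above so that the snippet runs without darkflow installed.

```python
def confident(predictions, min_confidence=0.5):
    """Keep predictions at or above min_confidence, highest confidence first."""
    kept = [p for p in predictions if p["confidence"] >= min_confidence]
    return sorted(kept, key=lambda p: p["confidence"], reverse=True)

# Stand-in for the list returned by tfnet.return_predict(imgcv).
result = [
    {"label": "person", "confidence": 0.56,
     "topleft": {"x": 184, "y": 101}, "bottomright": {"x": 274, "y": 382}},
    {"label": "dog", "confidence": 0.32,
     "topleft": {"x": 71, "y": 263}, "bottomright": {"x": 193, "y": 353}},
    {"label": "horse", "confidence": 0.76,
     "topleft": {"x": 412, "y": 109}, "bottomright": {"x": 592, "y": 337}},
]

for p in confident(result):
    print(p["label"], p["confidence"])
```

This filtering happens after prediction and is separate from the "threshold" option passed to TFNet.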
## Saving the latest checkpoint to protobuf file
flow --model cfg/yolo-new.cfg --load -1 --savepb
## Saving graph and weights to protobuf file
flow --model cfg/yolo.cfg --load bin/yolo.weights --savepb
When saving the .pb file, a .meta file will also be generated alongside it. This .meta file is a JSON dump of everything in the meta dictionary that contains information necessary for post-processing, such as anchors and labels. This way, everything you need to make predictions from the graph and do post-processing is contained in those two files; there is no need to have the .cfg or any labels file tagging along.
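Since the .meta file is described as a JSON dump, it can be inspected with the json module. The sketch below assumes the top-level keys are named "labels" and "anchors", following the description above; `load_meta` is a hypothetical helper, and a stand-in file with made-up contents is used so the example runs without darkflow.

```python
import json
import os
import tempfile

def load_meta(path):
    """Return (labels, anchors) from a .meta file, assuming it is plain JSON."""
    with open(path) as f:
        meta = json.load(f)
    return meta.get("labels"), meta.get("anchors")

# Stand-in .meta file with made-up contents.
fake_meta = {"labels": ["tvmonitor", "person", "pottedplant"],
             "anchors": [1.08, 1.19, 3.42, 4.41]}

with tempfile.TemporaryDirectory() as tmp:
    meta_path = os.path.join(tmp, "example.meta")
    with open(meta_path, "w") as f:
        json.dump(fake_meta, f)
    labels, anchors = load_meta(meta_path)

print(labels, anchors)
```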
The created .pb file can be used to migrate the graph to mobile devices (Java / C++ / Objective-C++). The names of the input and output tensors are 'input' and 'output', respectively. For further usage of this protobuf file, please refer to the official TensorFlow documentation on the C++ API here. To run it in, say, an iOS application, simply add the file to Bundle Resources and update the path to this file inside the source code.
Also, darkflow supports loading from a .pb and .meta file for generating predictions (instead of loading from a .cfg and checkpoint or .weights).
## Forward images in sample_img for predictions based on protobuf file
flow --pbLoad built_graph/yolo.pb --metaLoad built_graph/yolo.meta --imgdir sample_img/
If you'd like to load a .pb and .meta file when using return_predict(), you can set the "pbLoad" and "metaLoad" options in place of the "model" and "load" options you would normally set.
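As a rough sketch, the options dictionary for this case might be built as follows. `pb_options` is a hypothetical helper; the "pbLoad" and "metaLoad" keys come from the text above, and the paths are the ones used in the flow command earlier.

```python
def pb_options(pb_path, meta_path, threshold=0.1):
    """Build a TFNet options dict that loads from .pb/.meta files."""
    return {"pbLoad": pb_path, "metaLoad": meta_path, "threshold": threshold}

options = pb_options("built_graph/yolo.pb", "built_graph/yolo.meta")
print(options)

# With darkflow installed, this dict would be passed to TFNet(options)
# in place of the "model"/"load" dict from the earlier example.
```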
That's all.