This is a PyTorch implementation of YOLOv3, with support for training, inference, and evaluation. Some parts of the code have been taken from this repository.
The model is trained using transfer learning: a feature-extractor network pretrained on the 1000-class ImageNet dataset serves as the backbone. Its weights, provided by the authors of YOLOv3, can be downloaded from the following link:
https://pjreddie.com/media/files/darknet53.conv.74
Run the commands below to create a custom model definition, replacing num_classes with the number of classes in your dataset.
$ cd config
$ bash create_custom_model.sh num_classes
Add class names to data/classes.names. This file should have one class name per row.
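For example, a three-class classes.names file might look like this (the class names here are hypothetical placeholders):

```
person
car
bicycle
```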
Move the images of your dataset to the data/images/ folder.
Move the annotations of your dataset to the data/labels/ folder. The dataloader in this repository expects the annotation file corresponding to the image data/images/train.jpg to have the path data/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax label_id x_center y_center width height. The coordinates should be scaled to [0, 1], and label_id should be zero-indexed, corresponding to the row number of the class name in data/classes.names.
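As a concrete illustration of this format, a box given in pixel corner coordinates can be converted to an annotation row as follows (a sketch; the helper name and the corner-coordinate input convention are assumptions, not repository code):

```python
def to_yolo(label_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-coordinate box (corners) to a YOLO annotation row:
    label_id x_center y_center width height, all coordinates scaled to [0, 1]."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{label_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 100x200 box with top-left corner (50, 40) in a 640x480 image, class 0:
print(to_yolo(0, 50, 40, 150, 240, 640, 480))
# → 0 0.156250 0.291667 0.156250 0.416667
```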
In data/train.txt and data/valid.txt, add the paths to the images that will be used as training and validation data respectively.
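One way to produce these lists is to split the image paths programmatically; a minimal sketch (the function name and 90/10 split ratio are illustrative assumptions):

```python
import random

def split_dataset(image_paths, train_frac=0.9, seed=0):
    """Shuffle image paths deterministically and split into train/validation lists."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    k = int(train_frac * len(paths))
    return paths[:k], paths[k:]

# Hypothetical example with ten images:
imgs = [f"data/images/{i:03d}.jpg" for i in range(10)]
train, valid = split_dataset(imgs)
print(len(train), len(valid))  # 9 1
```

The two lists can then be written, one path per line, to data/train.txt and data/valid.txt.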
usage: train.py [-h] [--epochs EPOCHS] [--batch_size BATCH_SIZE]
[--network_config NETWORK_CONFIG]
[--use_pretrained_weights USE_PRETRAINED_WEIGHTS]
[--pretrained_weights PRETRAINED_WEIGHTS]
[--use_pretrained_backbone USE_PRETRAINED_BACKBONE]
[--pretrained_backbone PRETRAINED_BACKBONE] [--n_cpu N_CPU]
[--inp_img_size INP_IMG_SIZE]
[--multiscale_training MULTISCALE_TRAINING]
For a detailed description of the arguments, refer to the train.py file.
To track training progress in TensorBoard:
- Initiate training and switch to the directory containing the code.
- Run the command: tensorboard --logdir logs
- Open http://localhost:6006/ in a browser.
usage: detect.py [-h] [--batch_size BATCH_SIZE] [--image_folder IMAGE_FOLDER]
[--text_file_path TEXT_FILE_PATH]
[--network_config NETWORK_CONFIG]
[--weights_path WEIGHTS_PATH] [--class_path CLASS_PATH]
[--conf_thresh CONF_THRESH] [--nms_thresh NMS_THRESH]
[--n_cpu N_CPU] [--inp_img_size INP_IMG_SIZE]
[--display DISPLAY]
For a detailed description of the arguments, refer to the detect.py file.
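The conf_thresh and nms_thresh arguments control detection filtering: boxes scoring below conf_thresh are discarded, and overlapping boxes are pruned by non-maximum suppression at nms_thresh. The standard single-class logic can be sketched as follows (an illustration of the technique, not the repository's actual implementation):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, conf_thresh=0.5, nms_thresh=0.4):
    """Keep confident boxes, greedily suppressing overlaps above nms_thresh."""
    # Drop low-confidence boxes, then sort the rest by score, best first.
    dets = sorted(
        (d for d in zip(boxes, scores) if d[1] >= conf_thresh),
        key=lambda d: d[1], reverse=True)
    keep = []
    for box, score in dets:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(box, k) < nms_thresh for k, _ in keep):
            keep.append((box, score))
    return keep
```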
usage: test.py [-h] [--batch_size BATCH_SIZE]
[--network_config NETWORK_CONFIG] [--weights_path WEIGHTS_PATH]
[--iou_thresh IOU_THRESH] [--conf_thresh CONF_THRESH]
[--nms_thresh NMS_THRESH] [--n_cpu N_CPU]
[--inp_img_size INP_IMG_SIZE]
For a detailed description of the arguments, refer to the test.py file.
The files in the data_preparation/ folder are used for dataset preparation (in particular, for MOT17Det and MOT20Det).
The file make_video.py is used to make a video out of the obtained detections.
Joseph Redmon, Ali Farhadi
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at
https://pjreddie.com/darknet/yolo/
[Paper]
[Project Webpage]
[Author's Implementation]
@article{yolov3,
  title   = {YOLOv3: An Incremental Improvement},
  author  = {Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year    = {2018}
}