generalized_yolov5

An extension of YOLOv5 to non-natural images together with 5-Fold Cross-Validation

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Generalized-YOLOv5 is a modified version of YOLOv5 with two main contributions: an extension to train on non-natural images and a cross-validation extension.

Non-natural image extension

The non-natural image extension enables YOLOv5 to handle images with arbitrary intensity scales, so it can be trained on 2D non-natural images. The extension also includes preprocessing scripts that convert non-natural image datasets to a natural intensity scale. Training on natural images is faster because OpenCV, which is highly optimized for natural images, can then be used during training for preprocessing and augmentation. However, this conversion can decrease performance in some scenarios, as the intensity scale is often reduced to a fraction of its original range. The slow-down when training directly on non-natural images can be somewhat mitigated by increasing the number of worker threads (until they block each other), but one should still expect roughly a 1.5x increase in training time.
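
To illustrate the conversion to a natural intensity scale, below is a minimal sketch that min-max normalizes a 16-bit image to the uint8 range OpenCV expects. The function name, the example file path, and the normalization strategy are assumptions for this illustration, not the repository's actual preprocessing script.

import numpy as np
import cv2

# Hypothetical helper: min-max normalize an image of arbitrary intensity scale to uint8 [0, 255]
def to_natural_scale(image: np.ndarray) -> np.ndarray:
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi > lo:
        image = (image - lo) / (hi - lo)  # rescale to [0, 1]
    else:
        image = np.zeros_like(image)      # constant image -> all zeros
    return (image * 255).round().astype(np.uint8)

img16 = cv2.imread('scan.tif', cv2.IMREAD_UNCHANGED)  # e.g. a uint16 input (example path)
img8 = to_natural_scale(img16)                        # now usable by the OpenCV-based pipeline
cv2.imwrite('scan_8bit.png', img8)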

N-fold cross-validation

The cross-validation extension is a dataset preprocessing script that splits a dataset into N folds and generates the corresponding dataset configuration files understood by YOLOv5. A separate YOLOv5 model can then be trained on each fold, and YOLOv5's built-in ensembling method can be used to run inference with the models of all folds. The built-in ensembling method itself has no cross-validation functionality and was only designed for ad-hoc ensembling of YOLOv5 models of different scales, which is why the preprocessing script is required for cross-validation.
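
As a minimal sketch of what such fold preprocessing could look like (assuming scikit-learn and PyYAML are available; the directory layout, class names, and the script itself are illustrative assumptions, not the repository's actual script):

from pathlib import Path
import yaml
from sklearn.model_selection import KFold

images = sorted(Path('datasets/my_dataset/images').glob('*.png'))  # example dataset layout
out_dir = Path('datasets/my_dataset/folds')
out_dir.mkdir(parents=True, exist_ok=True)

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kfold.split(images)):
    # YOLOv5 accepts *.txt files that list image paths as train/val sources
    train_txt = out_dir / f'fold{fold}_train.txt'
    val_txt = out_dir / f'fold{fold}_val.txt'
    train_txt.write_text('\n'.join(str(images[i]) for i in train_idx))
    val_txt.write_text('\n'.join(str(images[i]) for i in val_idx))

    # Dataset configuration file understood by train.py (nc/names are placeholders)
    cfg = {'train': str(train_txt), 'val': str(val_txt), 'nc': 1, 'names': ['class0']}
    (out_dir / f'fold{fold}.yaml').write_text(yaml.dump(cfg))

One model can then be trained per fold, and the resulting weights passed together to detect.py for ensembled inference (the weight paths below are assumed):

python train.py --data datasets/my_dataset/folds/fold0.yaml --weights yolov5s.pt  # repeat for each fold
python detect.py --weights runs/train/fold0/weights/best.pt runs/train/fold1/weights/best.pt --source path/  # ensemble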

Documentation

See the Generalized-YOLOv5 README for documentation and usage of Generalized-YOLOv5. The official YOLOv5 documentation can be found in the YOLOv5 README.

Quick Start Examples

Install

Clone repo and install requirements.txt in a Python>=3.7.0 environment, including PyTorch>=1.7.

git clone https://github.com/MIC-DKFZ/generalized_yolov5  # clone
cd generalized_yolov5
pip install -r requirements.txt  # install
Inference

YOLOv5 PyTorch Hub inference. Models download automatically from the latest YOLOv5 release.

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # or yolov5n - yolov5x6, custom

# Images
img = 'https://ultralytics.com/images/zidane.jpg'  # or file, Path, PIL, OpenCV, numpy, list

# Inference
results = model(img)

# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.
Inference with detect.py

detect.py runs inference on a variety of sources, downloading models automatically from the latest YOLOv5 release and saving results to runs/detect.

python detect.py --source 0  # webcam
                          img.jpg  # image
                          vid.mp4  # video
                          path/  # directory
                          'path/*.jpg'  # glob
                          'https://youtu.be/Zgi9g1ksQHc'  # YouTube
                          'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream
Training

The commands below reproduce YOLOv5 COCO results. Models and datasets download automatically from the latest YOLOv5 release. Training times for YOLOv5n/s/m/l/x are 1/2/4/6/8 days on a V100 GPU (Multi-GPU times faster). Use the largest --batch-size possible, or pass --batch-size -1 for YOLOv5 AutoBatch. Batch sizes shown for V100-16GB.

python train.py --data coco.yaml --cfg yolov5n.yaml --weights '' --batch-size 128
                                       yolov5s                                64
                                       yolov5m                                40
                                       yolov5l                                24
                                       yolov5x                                16

Contact

For Generalized-YOLOv5 bugs and feature requests, please visit GitHub Issues. For business inquiries or professional support requests, please visit https://helmholtz-imaging.de/contact.