/yolov8-triton

Provides an ensemble model to deploy a YoloV8 ONNX model to Triton

Primary language: Python. License: Apache-2.0.

Overview

This repository provides a Triton ensemble model that combines a YoloV8 model exported from the Ultralytics repository with non-maximum suppression (NMS) post-processing. The post-processing code in models/postprocess/1/model.py is adapted from the Ultralytics ONNX example.

For more information about Triton's ensemble models, see Architecture.md in the Triton documentation and the preprocessing examples that accompany it.
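To make the NMS step concrete, here is a minimal pure-Python sketch of score filtering followed by greedy non-maximum suppression. It is not the code in models/postprocess/1/model.py. The function names `iou` and `nms`, the corner-format boxes, and the default thresholds are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Illustrative box format (an assumption): corner coordinates (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_threshold: float = 0.25,  # hypothetical default, not taken from the repo
    iou_threshold: float = 0.45,    # hypothetical default, not taken from the repo
) -> List[int]:
    """Return indices of the boxes that survive, highest score first."""
    # Drop low-confidence candidates, then visit the rest in descending score order.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.1]
print(nms(boxes, scores))  # [0, 2]
```

In the example, box 1 overlaps box 0 with an IoU of about 0.68 and is suppressed, while box 3 falls below the score threshold. Raising the IoU threshold keeps more overlapping detections. Raising the score threshold discards more low-confidence ones.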

Directory Structure

models/
    yolov8_onnx/
        1/
            model.onnx
        config.pbtxt
        
    postprocess/
        1/
            model.py
        config.pbtxt
        
    yolov8_ensemble/
        1/
            <Empty Directory>
        config.pbtxt
README.md
main.py
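Triton expects each model directory to follow the layout above. The sketch below checks a local models/ directory against that tree before the server is started. The helper name `missing_entries` and the two constant lists are hypothetical and are not part of this repository.

```python
from pathlib import Path
from typing import List

# Paths copied from the directory tree above, relative to models/.
EXPECTED_FILES = [
    "yolov8_onnx/1/model.onnx",
    "yolov8_onnx/config.pbtxt",
    "postprocess/1/model.py",
    "postprocess/config.pbtxt",
    "yolov8_ensemble/config.pbtxt",
]
# The ensemble version directory must exist even though it is empty.
EXPECTED_DIRS = ["yolov8_ensemble/1"]


def missing_entries(models_root: str = "models") -> List[str]:
    """Return the expected files and directories that are absent."""
    root = Path(models_root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (root / p).is_dir()]
    return missing
```

Calling `missing_entries()` from the repository root returns an empty list when the layout is complete. Git does not track empty directories, so models/yolov8_ensemble/1 may need to be created by hand after cloning.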

Quick Start

  1. Install Ultralytics and the Triton client:
pip install ultralytics==8.0.51 "tritonclient[all]==2.31.0"
  2. Export a model to ONNX format:
yolo export model=yolov8n.pt format=onnx dynamic=True opset=16
  3. Rename the exported file to model.onnx and place it in the models/yolov8_onnx/1 directory (see the directory structure above).

  4. (Optional) Update the score and NMS thresholds in models/postprocess/1/model.py.

  5. (Optional) Update models/yolov8_ensemble/config.pbtxt if your input resolution has changed.

  6. Build the Docker image for Triton Inference Server:

DOCKER_NAME="yolov8-triton"
docker build -t "$DOCKER_NAME" .
  7. Run Triton Inference Server:
DOCKER_NAME="yolov8-triton"
docker run --gpus all \
    -it --rm \
    --net=host \
    -v "$(pwd)/models:/models" \
    "$DOCKER_NAME"
  8. Run the client script with python main.py. The overlay image is written to output.jpg.
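Before running main.py, it can help to confirm that the server has finished loading the ensemble. The sketch below polls the model readiness endpoint of the HTTP inference protocol that Triton implements. It assumes the container exposes Triton's default HTTP port 8000 on localhost and that the model is registered under its directory name, yolov8_ensemble. The helper name `triton_ready` is hypothetical.

```python
import urllib.error
import urllib.request


def triton_ready(
    base_url: str = "http://localhost:8000",  # assumed default Triton HTTP port
    model: str = "yolov8_ensemble",           # assumed to match the directory name
    timeout: float = 2.0,
) -> bool:
    """Return True if the server reports the model as ready, else False."""
    url = f"{base_url}/v2/models/{model}/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response all mean "not ready".
        return False
```

A short loop that calls `triton_ready()` once per second until it returns True avoids sending requests while the models are still loading.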