omarabid59/yolov8-triton

Provides an ensemble model to deploy a YoloV8 ONNX model to Triton

Language: Python. License: Apache-2.0.

Overview

This repository provides a Triton ensemble model that chains a YOLOv8 model, exported to ONNX with Ultralytics, with non-maximum suppression (NMS) post-processing. The NMS code in models/postprocess/1/model.py is adapted from the Ultralytics ONNX Example.
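To make the post-processing step concrete, here is a minimal pure-Python sketch of greedy NMS. It is not the code in models/postprocess/1/model.py; the function names, the corner box format (x1, y1, x2, y2), and the default thresholds of 0.5 are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) corners, an assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        score_threshold: float = 0.5, iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the kept boxes, highest score first."""
    # Drop low-confidence candidates, then visit the rest by descending score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Raising score_threshold discards more weak detections before suppression; lowering iou_threshold merges overlapping boxes more aggressively. These play the same role as the score and NMS thresholds mentioned in the Quick Start.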

For more information about Triton ensemble models, see Architecture.md in the Triton documentation and the Triton preprocessing examples.

Directory Structure

models/
    yolov8_onnx/
        1/
            model.onnx
        config.pbtxt
        
    postprocess/
        1/
            model.py
        config.pbtxt
        
    yolov8_ensemble/
        1/
            <Empty Directory>
        config.pbtxt
README.md
main.py
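Because Git does not track empty directories, the yolov8_ensemble/1 folder can go missing after a clone. The following sketch checks a checkout against the tree above; the helper name missing_entries is hypothetical, and the expected paths are copied from the listing rather than from Triton itself.

```python
from pathlib import Path
from typing import List

# Expected files relative to the repository root, taken from the tree above.
EXPECTED_FILES = [
    "models/yolov8_onnx/1/model.onnx",
    "models/yolov8_onnx/config.pbtxt",
    "models/postprocess/1/model.py",
    "models/postprocess/config.pbtxt",
    "models/yolov8_ensemble/config.pbtxt",
]
# The ensemble version directory is shown as empty in the tree, so only check that it exists.
EXPECTED_DIRS = ["models/yolov8_ensemble/1"]


def missing_entries(root: str = ".") -> List[str]:
    """Return the expected paths that are absent under root."""
    base = Path(root)
    missing = [p for p in EXPECTED_FILES if not (base / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (base / p).is_dir()]
    return missing
```

Calling missing_entries() from the repository root returns an empty list when the layout matches; any returned path can be created or copied into place before starting the server.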

Quick Start

Install Ultralytics and the Triton client:

pip install ultralytics==8.0.51 "tritonclient[all]==2.31.0"

Export a model to ONNX format:

yolo export model=yolov8n.pt format=onnx dynamic=True opset=16

Rename the exported file (yolov8n.onnx in this example) to model.onnx and place it in the models/yolov8_onnx/1 directory (see the directory structure above).
(Optional) Update the score and NMS thresholds in models/postprocess/1/model.py.
(Optional) Update models/yolov8_ensemble/config.pbtxt if you changed the input resolution.
Build the Docker image for Triton Inference Server:

DOCKER_NAME="yolov8-triton"
docker build -t $DOCKER_NAME .
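If you changed the input resolution in the optional step above, the size of the raw detector output changes with it. As a rough guide, a YOLOv8 detection model emits one candidate per grid cell at strides 8, 16 and 32, with 4 box values plus one score per class for each candidate. The sketch below computes that shape; the function name is hypothetical, 80 classes assumes the COCO-trained yolov8n.pt, and whether config.pbtxt hard-codes these dimensions is an assumption to check against the file.

```python
from typing import Tuple

STRIDES = (8, 16, 32)  # downsampling factors of the three YOLOv8 detection heads


def raw_output_shape(height: int, width: int, num_classes: int = 80,
                     batch: int = 1) -> Tuple[int, int, int]:
    """Return (batch, 4 + num_classes, num_candidates) for a given input size."""
    if height % max(STRIDES) or width % max(STRIDES):
        raise ValueError("height and width should be multiples of 32")
    # One candidate box per grid cell on each of the three feature maps.
    candidates = sum((height // s) * (width // s) for s in STRIDES)
    return (batch, 4 + num_classes, candidates)


print(raw_output_shape(640, 640))  # (1, 84, 8400)
print(raw_output_shape(480, 640))  # (1, 84, 6300)
```

Comparing these numbers with the dims entries in the config files is a quick way to spot a mismatch before the server rejects a request.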

Run Triton Inference Server:

DOCKER_NAME="yolov8-triton"
docker run --gpus all \
    -it --rm \
    --net=host \
    -v "$(pwd)/models:/models" \
    $DOCKER_NAME

Run the script with python main.py. The overlay image will be written to output.jpg.
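If main.py cannot reach the server, a quick readiness probe can help narrow things down. This is a sketch using only the standard library, assuming Triton's HTTP endpoint is on its default port 8000 (reachable on localhost because of --net=host) and that the ensemble is registered under its directory name, yolov8_ensemble. The helper name is hypothetical.

```python
import urllib.error
import urllib.request
from typing import Optional


def is_server_ready(base_url: str = "http://localhost:8000",
                    model: Optional[str] = None, timeout: float = 2.0) -> bool:
    """Return True if the server (or one model, if given) reports ready."""
    # Health endpoints from the KServe v2 HTTP protocol that Triton implements.
    path = f"/v2/models/{model}/ready" if model else "/v2/health/ready"
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False
```

For example, is_server_ready(model="yolov8_ensemble") returning False while is_server_ready() returns True suggests the server is up but the ensemble failed to load, which usually points at the model repository or a config.pbtxt file.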