This repository provides a Triton ensemble model that combines a YOLOv8 model exported from the Ultralytics repository with NMS post-processing. The NMS post-processing code in models/postprocess/1/model.py is adapted from the Ultralytics ONNX Example.
For more information about Triton's Ensemble Models, see the ensemble section of the Triton documentation in Architecture.md and the Triton preprocessing examples.
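To make the post-processing step concrete, here is a minimal sketch of greedy non-maximum suppression in plain Python. It is not the code in models/postprocess/1/model.py, which is adapted from the Ultralytics ONNX Example; the function names, the (x1, y1, x2, y2) box format, and the 0.5 thresholds are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold=0.5, iou_threshold=0.5):
    """Greedy NMS: return the indices of the kept boxes, highest score first.

    The default thresholds are placeholders, not the repository's values.
    """
    # Drop low-confidence candidates, then visit the rest by descending score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes, one separate box, and one low-score box.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260), (0, 0, 5, 5)]
scores = [0.90, 0.80, 0.75, 0.10]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box is suppressed because it overlaps the first almost entirely, and the fourth is removed by the score threshold before any overlap is computed.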
models/
    yolov8_onnx/
        1/
            model.onnx
        config.pbtxt
    postprocess/
        1/
            model.py
        config.pbtxt
    yolov8_ensemble/
        1/
            <Empty Directory>
        config.pbtxt
README.md
main.py
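The layout above can be verified before starting the server. The following is a small sketch using only the standard library; the helper name check_model_repository and the EXPECTED_FILES table are hypothetical and simply mirror the directory structure shown above.

```python
from pathlib import Path

# Expected layout, taken from the directory structure above: each model
# directory holds a config.pbtxt and a version directory named "1".
EXPECTED_FILES = {
    "yolov8_onnx": "model.onnx",
    "postprocess": "model.py",
    "yolov8_ensemble": None,  # the version directory is intentionally empty
}


def check_model_repository(root):
    """Return a list of problems; an empty list means the layout matches."""
    root = Path(root)
    problems = []
    for model, artifact in EXPECTED_FILES.items():
        model_dir = root / model
        if not (model_dir / "config.pbtxt").is_file():
            problems.append(f"missing {model}/config.pbtxt")
        version_dir = model_dir / "1"
        if not version_dir.is_dir():
            problems.append(f"missing {model}/1/")
        elif artifact and not (version_dir / artifact).is_file():
            problems.append(f"missing {model}/1/{artifact}")
    return problems


if __name__ == "__main__":
    for problem in check_model_repository("models"):
        print(problem)
```

Git does not track empty directories, so yolov8_ensemble/1/ may need to be created by hand after cloning.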
- Install Ultralytics and TritonClient
pip install ultralytics==8.0.51 tritonclient[all]==2.31.0
- Export a model to ONNX format:
yolo export model=yolov8n.pt format=onnx dynamic=True opset=16
- Rename the model file to model.onnx and place it under the /models/yolov8_onnx/1 directory (see directory structure above).
- (Optional): Update the Score and NMS threshold in models/postprocess/1/model.py
- (Optional): Update the models/yolov8_ensemble/config.pbtxt file if your input resolution has changed.
- Build the Docker Container for Triton Inference:
DOCKER_NAME="yolov8-triton"
docker build -t $DOCKER_NAME .
- Run Triton Inference Server:
DOCKER_NAME="yolov8-triton"
docker run --gpus all \
-it --rm \
--net=host \
-v ./models:/models \
$DOCKER_NAME
- Run the script with python main.py. The overlay image will be written to output.jpg.
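For readers who want to query the server from their own code rather than through main.py, the request can be sketched with the tritonclient HTTP API. This is an illustration under stated assumptions, not a copy of main.py: the input name, the output names, and the FP32 datatype are hypothetical and should be checked against models/yolov8_ensemble/config.pbtxt. The image is expected to be a NumPy array that has already been preprocessed to the shape declared in that config.

```python
def query_ensemble(
    image,
    url="localhost:8000",          # Triton's default HTTP port (exposed by --net=host)
    model_name="yolov8_ensemble",  # Triton names a model after its directory
    input_name="images",           # ASSUMPTION: check config.pbtxt
    output_names=(                 # ASSUMPTION: check config.pbtxt
        "num_detections",
        "detection_boxes",
        "detection_scores",
        "detection_classes",
    ),
):
    """Send one preprocessed image to the ensemble; return outputs as a dict."""
    # Imported lazily so this function can be defined without the client installed.
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url=url)

    # Describe the input tensor and attach the image data.
    # ASSUMPTION: the model expects a float32 (FP32) tensor.
    infer_input = httpclient.InferInput(input_name, list(image.shape), "FP32")
    infer_input.set_data_from_numpy(image)

    # Request each output by name and run inference on the ensemble.
    outputs = [httpclient.InferRequestedOutput(name) for name in output_names]
    result = client.infer(model_name=model_name, inputs=[infer_input], outputs=outputs)
    return {name: result.as_numpy(name) for name in output_names}
```

Calling the ensemble rather than yolov8_onnx directly means the NMS step runs on the server, so the client receives final detections instead of raw model output.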