This C++ application allows you to perform machine learning tasks, such as computer vision tasks like object detection or classification, using Nvidia Triton Server to manage multiple framework backends. It currently supports object detection models such as YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, and YOLO-NAS, as well as inference for classification models from the Torchvision API.
To build the client libraries, please refer to the Triton Inference Server client libraries located here.
Ensure that you have the following dependencies installed:
- Nvidia Triton Inference Server container pulled from NGC (`docker pull nvcr.io/nvidia/tritonserver:23.08-py3`).
- Triton client libraries (Tested Release 23.08).
- Protobuf and gRPC++ (versions compatible with Triton Server).
- RapidJSON (`apt install rapidjson-dev`).
- libcurl (`apt install libcurl4-openssl-dev`).
- OpenCV 4 (Tested version: 4.7.0).
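On Debian/Ubuntu, a minimal setup sketch for part of these dependencies might look like the following (OpenCV, Protobuf, and gRPC++ are omitted because they are usually built from source to match your Triton release):

```bash
# Pull the tested Triton Inference Server image from NGC
docker pull nvcr.io/nvidia/tritonserver:23.08-py3

# Development packages used by the client
sudo apt update
sudo apt install -y rapidjson-dev libcurl4-openssl-dev
```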
Follow these steps to build and compile the application:
- Set the environment variable `TritonClientBuild_DIR` (path/to/client/build/install), or link against the folder where you installed the Triton client libraries (or the libraries directly) and edit `CMakeLists.txt` accordingly.
- Create a build directory: `mkdir build`.
- Navigate to the build directory: `cd build`.
- Run CMake to configure the build: `cmake -DCMAKE_BUILD_TYPE=Release ..`
  Optional flags:
  - `-DSHOW_FRAME`: Enable to show processed frames after inference.
  - `-DWRITE_FRAME`: Enable to write processed frames after inference.
- Build the application: `cmake --build .`
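Putting the steps above together, a typical build might look like this; the client install path is illustrative, and the optional flags are assumed to be standard CMake options that accept `ON`/`OFF`:

```bash
# Location of the Triton client libraries install (illustrative path)
export TritonClientBuild_DIR=/path/to/client/build/install

# Out-of-source build
mkdir build && cd build

# Configure; drop the optional flags if you do not need frame display/writing
cmake -DCMAKE_BUILD_TYPE=Release -DSHOW_FRAME=ON -DWRITE_FRAME=ON ..

# Compile
cmake --build .
```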
Supported computer vision tasks:
- Object Detection
- Classification
- TODO: Instance Segmentation, Pose Estimation, ...
- Ensure that the versions of libraries used for exporting models match the versions supported in the Triton release you are using. Check Triton Server releases here.
To deploy the models to Triton, set up a model repository folder following the Triton Model Repository schema. The `config.pbtxt` file is usually optional unless you use the OpenVINO backend.
Example repository structure:
```
<model_repository>
  -> <model_name>
     -> [config.pbtxt]
        <model_version>
          -> <model_binary>
```
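As a concrete, illustrative example, a YOLOv8 model exported to ONNX could be stored as `model_repository/yolov8/1/model.onnx`, with an optional `model_repository/yolov8/config.pbtxt` (the ONNX Runtime backend expects the model file to be named `model.onnx`). If you do provide a `config.pbtxt`, a minimal sketch could look like the one below; the tensor names and shapes are assumptions and must match your exported model:

```
name: "yolov8"
backend: "onnxruntime"
max_batch_size: 0
input [
  {
    name: "images"            # input tensor name of the exported model (assumption)
    data_type: TYPE_FP32
    dims: [ 1, 3, 640, 640 ]
  }
]
output [
  {
    name: "output0"           # output tensor name of the exported model (assumption)
    data_type: TYPE_FP32
    dims: [ 1, 84, 8400 ]
  }
]
```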
Then, run the Triton server:
```bash
docker run --gpus=1 --rm \
  -p8000:8000 -p8001:8001 -p8002:8002 \
  -v /full/path/to/docs/examples/model_repository:/models \
  nvcr.io/nvidia/tritonserver:<xx.yy>-py3 tritonserver \
  --model-repository=/models
```
If you plan to run on CPU, omit the `--gpus` parameter.
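Once the container is running, you can check that the server is ready before starting the client (standard Triton HTTP health endpoint on port 8000):

```bash
# Returns HTTP 200 when the server and loaded models are ready
curl -v localhost:8000/v2/health/ready
```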
For more information and examples, refer to the Triton Inference Server tutorials.
By using the `--source` parameter with the path to either a video or an image, you can perform computer vision tasks on your chosen input type. Follow these instructions:
```bash
./computer-vision-triton-cpp-client \
    --source=/path/to/source.format \
    --task_type=<task_type> \
    --model_type=<model_type> \
    --model=<model_name_folder_on_triton> \
    --labelsFile=/path/to/labels/coco.names \
    --protocol=<http or grpc> \
    --serverAddress=<triton-ip> \
    --port=<8000 for http, 8001 for grpc>
```
If your model has dynamic axes, also pass the input sizes: `--input_sizes="c w h"` (channels, width, height).
Replace the following placeholders:
- `/path/to/source.format`: The path to your video or image file.
- `<task_type>`: Choose the computer vision task type (e.g., `detection` or `classification`).
- `<model_type>`: Specify the model type (e.g., one of the detectors: `yolov5`, `yolov6`, `yolov7`, `yolov8`, `yolov9`, `yolov10`, `yolonas`; or classification models: `torchvision-classifier`).
- `<model_name_folder_on_triton>`: The name of the model folder on the Triton server where your chosen model is deployed.
- `/path/to/labels/coco.names`: The path to the file containing label names (e.g., COCO labels).
- `<http or grpc>`: Choose either `http` or `grpc` as the protocol, based on your Triton server setup.
- `<triton-ip>`: The IP address of your Triton server.
- `<8000 for http, 8001 for grpc>`: The port number, usually `8000` for HTTP or `8001` for gRPC, depending on your Triton server configuration.
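For example, a filled-in invocation for YOLOv8 detection over HTTP might look like this (paths, model name, and server address are illustrative; `--input_sizes` is only needed for models exported with dynamic axes):

```bash
./computer-vision-triton-cpp-client \
    --source=./data/video.mp4 \
    --task_type=detection \
    --model_type=yolov8 \
    --model=yolov8 \
    --labelsFile=./labels/coco.names \
    --protocol=http \
    --serverAddress=localhost \
    --port=8000 \
    --input_sizes="3 640 640"
```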
Use `./computer-vision-triton-cpp-client --help` to view all available parameters.
To build the Docker image, run:

```bash
docker build --rm -t computer-vision-triton-cpp-client .
```
This command will create a Docker image based on the provided Dockerfile.
Replace the placeholders with your desired options and paths:
```bash
docker run --rm --network host \
    -v /path/to/host/data:/app/data \
    computer-vision-triton-cpp-client \
    --source=/app/data/source.format \
    --task_type=<task_type> \
    --model_type=<model_type> \
    --model=<model_name_folder_on_triton> \
    --labelsFile=/app/coco.names \
    --protocol=<http or grpc> \
    --serverAddress=<triton-ip> \
    --port=<8000 for http, 8001 for grpc>
```

Note that `--network host` is a Docker option and must appear before the image name.
- `-v /path/to/host/data:/app/data`: Map a host directory to `/app/data` inside the container, allowing you to access input and output data.
- Adjust the rest of the parameters to match your specific setup.
The program will process the specified video or image based on your options. You can find the processed output in the `/path/to/host/data` directory on your host machine.
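For instance, an equivalent containerized run against a Triton server listening on the host could look like this (host path and model name are illustrative):

```bash
docker run --rm --network host \
    -v /home/user/data:/app/data \
    computer-vision-triton-cpp-client \
    --source=/app/data/video.mp4 \
    --task_type=detection \
    --model_type=yolov8 \
    --model=yolov8 \
    --labelsFile=/app/coco.names \
    --protocol=http \
    --serverAddress=localhost \
    --port=8000
```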
Watch a test of YOLOv7-tiny exported to ONNX here.
- Triton Inference Server Client Example
- Triton user guide
- ONNX models
- Torchvision models
- Tensorflow model garden
- Any feedback is greatly appreciated; if you have any suggestions, bug reports, or questions, don't hesitate to open an issue.