spacewalk01/tensorrt-yolov9

How to run inference on multiple GPUs?

fungtion opened this issue · 5 comments

Hi, can the engine model perform inference on multiple GPUs?

Hi,

From the FAQ: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#faq

Q: How do I use TensorRT on multiple GPUs?

A: Each ICudaEngine object is bound to a specific GPU when it is instantiated, either by the builder or on deserialization. To select the GPU, use cudaSetDevice() before calling the builder or deserializing the engine. Each IExecutionContext is bound to the same GPU as the engine from which it was created. When calling execute() or enqueue(), ensure that the thread is associated with the correct device by calling cudaSetDevice() if necessary.

from: NVIDIA/TensorRT#322

Create an inference engine instance for each GPU and call cudaSetDevice(gpu_id) for each device.
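
In outline, that looks like the sketch below (untested, just a sketch; it assumes you already have an nvinfer1::IRuntime*, e.g. from nvinfer1::createInferRuntime, and loadFile is a hypothetical helper that reads the serialized engine into memory):

#include <NvInfer.h>
#include <cuda_runtime.h>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read a serialized engine file into memory.
std::vector<char> loadFile(const std::string& path)
{
    std::ifstream f(path, std::ios::binary | std::ios::ate);
    std::vector<char> blob(f.tellg());
    f.seekg(0);
    f.read(blob.data(), blob.size());
    return blob;
}

// Make the target GPU current, then deserialize; the resulting engine
// (and any IExecutionContext created from it) stays bound to that GPU.
nvinfer1::ICudaEngine* loadEngineOnDevice(nvinfer1::IRuntime* runtime,
                                          const std::string& path,
                                          int device_id)
{
    cudaSetDevice(device_id);
    std::vector<char> blob = loadFile(path);
    return runtime->deserializeCudaEngine(blob.data(), blob.size());
}

When you later call execute()/enqueue() on a context created from that engine, make sure the calling thread has the same device current (call cudaSetDevice again if needed), as the FAQ above says.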

Can you give an example? I tried to set cudaSetDevice(1), but it always runs on gpu:0, not gpu:1.

Make sure to call it at the beginning of each function that uses CUDA/the GPU, such as:

Yolov9::Yolov9(string engine_path)
{
    cudaSetDevice(1);
    // Read the engine file
    ifstream engineStream(engine_path, ios::binary);
    ...

void Yolov9::predict(Mat& image, vector<Detection>& output)
{
    cudaSetDevice(1);
    // Preprocessing data on the GPU
    cuda_preprocess(image.ptr(), image.cols, image.rows, gpu_buffers[0], model_input_w, model_input_h, cuda_stream);
    ...
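
If you would rather not hardcode the device, one option (just a sketch, not something in this repo) is to pass a device id into the constructor, keep it as a member such as device_id_, and call cudaSetDevice(device_id_) at the top of every method that touches the GPU. Then one Yolov9 instance per GPU can run side by side (the two-argument constructor here is an assumption, not the repo's API):

#include <opencv2/opencv.hpp>
#include <thread>
#include <vector>

int main()
{
    // Hypothetical: a Yolov9 constructor that also takes a device id and
    // calls cudaSetDevice(device_id_) inside every GPU-touching method.
    Yolov9 model_gpu0("yolov9.engine", /*device_id=*/0);
    Yolov9 model_gpu1("yolov9.engine", /*device_id=*/1);

    cv::Mat frame0 = cv::imread("image0.jpg");
    cv::Mat frame1 = cv::imread("image1.jpg");
    std::vector<Detection> det0, det1;  // Detection is the repo's output struct

    // Each thread sets its own current device inside predict(), so the two
    // models can run on separate GPUs concurrently.
    std::thread t0([&]{ model_gpu0.predict(frame0, det0); });
    std::thread t1([&]{ model_gpu1.predict(frame1, det1); });
    t0.join();
    t1.join();
    return 0;
}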

Thanks, I will try it.