Failed to build onnxruntime with TensorRT on Windows 10

Question

Closed this issue 4 years ago · 11 comments

cocoyen1995 commented 4 years ago

Describe the bug
I'm trying to build onnxruntime with TensorRT, and the build fails with the errors shown in the log below.
(Similar to this issue)

Urgency
none
(I've been working on this for about a week and still can't figure it out.)

System information

OS Platform and Distribution: Windows10
ONNX Runtime installed from (source or binary): source
ONNX Runtime version: 1.4.0
Python version: 3.7.0
Visual Studio version (if applicable): VS2019 Community 16.6.5
GCC/Compiler version (if compiling from source): -
CUDA/cuDNN version: 10.2/ 7.6.5
TensorRT version: 7.0.0.11 with CUDA10.2
GPU model and memory: Nvidia RTX 2080 Ti 11G

To Reproduce
Go to the onnxruntime directory, open a terminal, and run the command below:
build.bat --config Release --parallel --build_shared_lib --use_cuda --cuda_version 10.2 --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --use_tensorrt --tensorrt_home "D:\Coco\Libs\TensorRT-7.0.0.11_cuda10.2" --cmake_generator "Visual Studio 16 2019"

Expected behavior
Finish the build successfully.

Additional context
I've built successfully with CUDA 10.0, without TensorRT, using the command below in the built-in terminal:
build.bat --config Release --build_shared_lib --parallel --use_cuda --cuda_version 10.0 --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0" --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0" --cmake_generator "Visual Studio 15 2017"
(That build also fails when TensorRT is enabled.)

Because I want to use TensorRT, I also tried CUDA 10.2 with VS2019. With a command similar to the one above, I can build onnxruntime with CUDA 10.2,
but the build fails once TensorRT is enabled. I also tried cloning the repo with --recursive, but the error seems to be the same.
Here's the log file from the build:
log_with_trt_vs2019.txt

Thanks in advance for any help!

Answer 1 · 2020-08-10T18:44:20.000Z

@stevenlix - could you please take a look ? Thanks.

Answer 2 · 2020-08-18T01:32:39.000Z

Hi guys,

I saw the repository had been updated, so I cloned it again and retried the build with TRT, but it still failed.
I also ran the whole process on another computer with a new driver and CUDA 10.2, but that didn't work either.

The whole process I ran on the other computer was:

Update the driver to a version that supports CUDA 10.2
Install VS2019 Community
Install CUDA 10.2 + cuDNN 7.6.5
Download CMake 3.16.4
Download TensorRT 7.0.0.11 for CUDA 10.2
Run:
git clone --recursive https://github.com/Microsoft/onnxruntime
cd onnxruntime
build.bat --config Release --parallel --build_shared_lib --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --use_tensorrt --tensorrt_home "D:\Coco\Libs\TensorRT-7.0.0.11_cuda10.2" --cmake_generator "Visual Studio 16 2019"

The error message seems to be the same as before,
but the last file built changed from onnxruntime_providers_cuda.lib to onnxruntime_perf_test.exe
(compared with the build in my first post).
The result on the original computer is the same as on the other computer.
Here's the log file from the build:
log_with_trt_vs2019_v2.txt

P.S. The details of the computer I reran the whole process on are:

OS Platform and Distribution: Windows10
ONNX Runtime installed from (source or binary): source
ONNX Runtime version: 1.4.0
Python version: 3.6.8
Visual Studio version (if applicable): VS2019 Community 16.7.1
GCC/Compiler version (if compiling from source): -
CUDA/cuDNN version: 10.2/ 7.6.5
TensorRT version: 7.0.0.11 with CUDA10.2
GPU model and memory: Nvidia RTX 2060

I also tried building with VS2017 and got the same error.
Without TensorRT, the build succeeds.

Thanks again in advance for any help!

Answer 3 · 2020-08-18T03:05:08.000Z

+@stevenlix We need to update our Windows build instructions: after the recent update to TensorRT 7.1.x, they still mention 7.0.
@cocoyen1995 Can you try building the latest onnxruntime master with TensorRT 7.1 and CUDA 11.0?

Answer 4 · 2020-08-18T03:14:57.000Z

@jywu-msft Thanks for your quick reply!
Sure, I'll try it out and let you know the result.
One more question: is there a specific version of cuDNN I should use with CUDA 11.0 in the build?

Answer 5 · 2020-08-18T03:27:44.000Z

"One more question: is there a specific version of cuDNN I should use with CUDA 11.0 in the build?"

It should match the version TensorRT is built with.
In this case, it should be cuDNN 8.0 because that is the version TensorRT 7.1 for Windows is built with.
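
The matching rule can be made explicit with a small lookup. This is only a sketch built from the version combinations the reporter lists in this thread, not an official support matrix; the names `COMBINATIONS_IN_THREAD` and `cudnn_for_tensorrt` are made up for illustration.

```python
# Version combinations that appear in this thread (not an exhaustive matrix).
COMBINATIONS_IN_THREAD = {
    # TensorRT version: (CUDA version, cuDNN version)
    "7.0.0.11": ("10.2", "7.6.5"),
    "7.1.3.4": ("11.0", "8.0.2"),
}


def cudnn_for_tensorrt(tensorrt_version):
    """Return the cuDNN version paired with this TensorRT version, or None if unknown."""
    combo = COMBINATIONS_IN_THREAD.get(tensorrt_version)
    return combo[1] if combo else None


print(cudnn_for_tensorrt("7.1.3.4"))
```

For any TensorRT release not listed here, the release notes for that TensorRT package are the place to check which CUDA and cuDNN versions it was built with.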

Answer 6 · 2020-08-18T04:16:06.000Z

Hi @jywu-msft ,

I've done all the updates, but the result is still the same error I posted above.

The command I entered is:
build.bat --config Release --parallel --build_shared_lib --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0" --use_tensorrt --tensorrt_home "D:\Coco\Libs\TensorRT-7.1.3.4" --cmake_generator "Visual Studio 16 2019"

Here's the log from the build:
log with_trt_vs2019_cuda11.txt

Other details are listed below:

OS Platform and Distribution: Windows10
ONNX Runtime installed from (source or binary): source
ONNX Runtime version: 1.4.0
Python version: 3.7.0
Visual Studio version (if applicable): VS2019 Community 16.6.5
GCC/Compiler version (if compiling from source): -
CUDA/cuDNN version: 11.0/8.0.2
TensorRT version: 7.1.3.4
GPU model and memory: Nvidia RTX 2080 Ti 11G

I'm wondering whether I missed something when installing TensorRT.
I only downloaded the zip file and unzipped it to that path.
The CUDA installation itself is probably not the problem,
since nvcc -V shows the version I expected.

Answer 7 · 2020-08-18T04:58:13.000Z

I'm not sure what the issue is. We have had partners recently build master and test it successfully on an Nvidia RTX 2080.
Can you clean your build directory (removing any CMake cache as well)
and build without the --parallel and --build_shared_lib options?
I.e., build with:
build.bat --config Release --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0" --use_tensorrt --tensorrt_home "D:\Coco\Libs\TensorRT-7.1.3.4" --cmake_generator "Visual Studio 16 2019"
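
The clean-up step could be scripted along these lines. This is a minimal sketch that assumes the build output, including CMakeCache.txt, lives in `build\Windows\Release` under the repository root, as the log paths elsewhere in this thread suggest; `clean_build_dir` is a hypothetical helper, not part of onnxruntime.

```python
import shutil
from pathlib import Path


def clean_build_dir(build_dir):
    """Delete build_dir and everything in it. Return True if something was removed."""
    path = Path(build_dir)
    if path.is_dir():
        shutil.rmtree(path)  # also removes CMakeCache.txt and generated projects
        return True
    return False


if __name__ == "__main__":
    # Assumed location of the Release build tree; run from the repository root.
    removed = clean_build_dir(r"build\Windows\Release")
    print("removed" if removed else "nothing to remove")
```

Deleting the whole directory is blunt but guarantees that no stale CMake cache entries survive into the next build.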

Answer 8 · 2020-08-18T06:34:21.000Z

Hi,
I cloned the whole repo again and built with the command you provided, but it still failed.
Here's the error message and log file:

log with_trt_vs2019_cuda11_nosharelib.txt
The error message is different from the ones I've seen before.
Could the CMake version or the GPU driver be causing the failure? :(
My CMake version is 3.17.4,
and my driver version is 451.82.
(Or could it be the Python version?)

Answer 9 · 2020-08-18T07:54:26.000Z

No, it's not related to your GPU driver or CMake version.

D:\Coco\Libs\TensorRT-7.1.3.4\include\NvInferRuntime.h(1,1): warning C4819: The file contains a character that cannot be represented in the current code page (950). Save the file in Unicode format to prevent data loss (compiling source file D:\Coco\Libs\onnxruntime_new2\onnxruntime\cmake\external\onnx-tensorrt\builtin_op_importers.cpp) [D:\Coco\Libs\onnxruntime_new2\onnxruntime\build\Windows\Release\external\onnx-tensorrt\nvonnxparser_static.vcxproj]

There may be a Unicode-related issue with some of the TensorRT source files.

Can you change your system locale to English, as described here: https://stackoverflow.com/questions/27544958/warning-c4819-in-visual-studio-c-2013-express-utf8-files-without-bom/37871883#37871883
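
One way to see which files could trigger C4819 is to try decoding them with the code page named in the warning. The following is a rough sketch, not a reproduction of the compiler's check: it assumes Python's `cp950` codec is a close enough stand-in for Windows code page 950, and `undecodable_files` is a hypothetical helper name.

```python
from pathlib import Path


def undecodable_files(root, codec="cp950", patterns=("*.h", "*.hpp", "*.cpp")):
    """Return files under root whose bytes cannot be decoded with codec."""
    bad = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            try:
                path.read_bytes().decode(codec)
            except UnicodeDecodeError:
                bad.append(path)
    return bad


if __name__ == "__main__":
    # TensorRT include directory taken from the warning above; adjust as needed.
    for path in undecodable_files(r"D:\Coco\Libs\TensorRT-7.1.3.4\include"):
        print(path)
```

A file can pass this check and still be misread by the compiler, because some byte sequences decode to the wrong characters without raising an error, so an empty result does not prove the headers are safe under that code page.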

Answer 10 · 2020-08-18T11:02:18.000Z

Hi,

After changing my system locale from Chinese (Traditional) to English (U.S.), the build FINALLY PASSES!!
One more thing I did was follow the installation steps on NVIDIA's page
and move the DLLs from TensorRT/lib to CUDAx.x/bin
(although that doesn't seem to help with the original problem).
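
The DLL step could look roughly like this. It is a sketch under assumptions: it copies rather than moves, so the TensorRT folder stays intact, and `copy_dlls` is a hypothetical helper. Writing into a CUDA folder under Program Files normally needs an elevated prompt.

```python
import shutil
from pathlib import Path


def copy_dlls(tensorrt_lib, cuda_bin):
    """Copy every .dll from tensorrt_lib into cuda_bin; return the copied file names."""
    copied = []
    for dll in sorted(Path(tensorrt_lib).glob("*.dll")):
        shutil.copy2(dll, Path(cuda_bin) / dll.name)
        copied.append(dll.name)
    return copied


if __name__ == "__main__":
    # Paths follow the ones used in this thread; adjust to the local installation.
    names = copy_dlls(
        r"D:\Coco\Libs\TensorRT-7.1.3.4\lib",
        r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin",
    )
    print(names)
```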

I built with CUDA 11.0 + cuDNN 8.0.2 + TensorRT 7.1.3.4 using the command:
build.bat --config Release --parallel --build_shared_lib --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0" --use_tensorrt --tensorrt_home "D:\Coco\Libs\TensorRT-7.1.3.4" --cmake_generator "Visual Studio 16 2019"
This builds successfully.

I also tried building with CUDA 10.2 + cuDNN 7.6.5 + TensorRT 7.0.0.11 again, using the command:
build.bat --config Release --use_cuda --cuda_version 10.2 --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --use_tensorrt --tensorrt_home "D:\Coco\Libs\TensorRT-7.0.0.11_cuda10.2" --cmake_generator "Visual Studio 16 2019"
If I don't specify the CUDA version on the command line, the build picks up CUDA 11
(even though nvcc -V shows CUDA 10.2 after I set the system variables).
This build still failed at the same place.
Here's the log from the build:
log_with_trt_vs2019_locale_eng_noshare.txt
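
To avoid the wrong toolkit being picked up, the version can be derived from the --cuda_home path and passed explicitly. This is a small sketch that assumes the toolkit folder is named `vN.M`, as in the paths used in this thread; `cuda_version_from_home` is a hypothetical helper.

```python
import re
from pathlib import PureWindowsPath


def cuda_version_from_home(cuda_home):
    """Return '10.2' for a path ending in 'v10.2', or None if the name does not match."""
    match = re.fullmatch(r"v(\d+\.\d+)", PureWindowsPath(cuda_home).name)
    return match.group(1) if match else None


cuda_home = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2"
version = cuda_version_from_home(cuda_home)
# These flags are the ones used in the build.bat commands in this thread.
args = ["--use_cuda", "--cuda_version", version, "--cuda_home", cuda_home]
print(args)
```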

Anyway, big thanks for your patience and help!
Have a nice day!^^

Answer 11 · 2020-08-20T06:33:00.000Z

Glad it is working now!
Building from the latest master now requires TensorRT 7.1; TensorRT 7.0 will no longer build.