NVIDIA/jitify

Debugging NVRTC_ERROR_INVALID_OPTION Windows

mattpaletta opened this issue · 5 comments

Hi there,

I am trying to integrate this into a project that I'm building with CMake on Windows. As a first step, I took one of your example kernels to see if I could compile it, but an NVRTC_ERROR_INVALID_OPTION exception keeps being thrown. I have tried to #define JITIFY_PRINT_ALL 1, but it doesn't seem to print anything except the compiler options before it crashes. I am building on Windows 10 with CUDA 11.0 and Visual Studio 2019 Community Edition. I have the CUDA compiler set to C++14 and my CXX compiler set to C++17. I have also tried stepping through it with the Visual Studio debugger; however, it skips over the code that does the compilation, jumps straight to where it prints the source code (but there's no console output), and then aborts the load_program function.

I made sure to link my code with the cuda, cudart, and nvrtc libraries through CMake.

Any advice on how to go about debugging this issue?
Thanks

Compiler options: -std=c++14 -arch=compute_30
due to unexpected exception with message:
  NVRTC error: NVRTC_ERROR_INVALID_OPTION
thread_local static jitify::JitCache kernel_cache;
const char* program_source =
    "template<int N, typename T>\n"
    "__global__\n"
    "void my_kernel(T* data) {\n"
    "    T data0 = data[0];\n"
    "    for( int i=0; i<N-1; ++i ) {\n"
    "        data[0] *= data0;\n"
    "    }\n"
    "}\n";
jitify::Program program = kernel_cache.program(program_source, 0, {"-std=c++14"});

I think it's supposed to be --std=c++14

(Or at least try the long form; IIRC we may have had an issue with the short form of some options.)

That's how we use it here, in a cross platform CMake project that makes use of Jitify.

Refer to these docs; RTC compiler flags are a subset of those for standard nvcc.
https://docs.nvidia.com/cuda/nvrtc/index.html#group__options

Thanks for the response. I tried it with --std=c++14 and without that parameter; both result in the same error. I will have a look at those docs.

I've just tested this with a small Jitify test rig I had kicking around, passing only the options -std=c++14 and -arch=compute_30. It works under both Visual Studio 2015 and 2019, with CUDA 10.1 and the latest jitify header from master. I'm unable to reproduce your problem; it runs for me and gives the expected output (on a GTX 1080, if that matters).

Looking at the code you've posted, you use a slightly different function (JitCache::program()), but checking its implementation, it calls the same code.

So it's possibly worth looking elsewhere for the cause. I've not seen this particular error before, and most results on Google suggest that particular device/driver/NVRTC versions are to blame. Hopefully a jitify dev knows more.

The issue is that compute_30 is no longer supported as of CUDA 11.0 [1]. Unfortunately the NVRTC docs have not yet been updated to reflect this (this will be fixed in the next release).

The -arch flag is added automatically by Jitify based on the architecture of the GPU you are using. You could try adding -arch=compute_35, but it likely still won't run.

[1] https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#deprecated-features
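
To illustrate the point above, here is a minimal sketch of a pre-flight check that flags -arch=compute_NN options older than what a given toolkit accepts. The helper names (parse_compute_arch, unsupported_arch_options) and the two constants are hypothetical and not part of Jitify or NVRTC. The minimum of 30 for CUDA 10.x is inferred from the working CUDA 10.1 test above, and 35 for CUDA 11.x from the suggestion to try compute_35; other toolkit versions are not covered.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed minimum compute capabilities (major*10 + minor) per toolkit.
// These constants are illustrative, not values exported by any CUDA header.
constexpr int kMinArchCuda10 = 30;  // compute_30 still accepted by CUDA 10.x
constexpr int kMinArchCuda11 = 35;  // compute_30 removed in CUDA 11.0

// Hypothetical helper: extracts NN from "-arch=compute_NN" or
// "--gpu-architecture=compute_NN". Returns -1 for any other option.
int parse_compute_arch(const std::string& opt) {
    static const std::string prefixes[] = {"-arch=compute_",
                                           "--gpu-architecture=compute_"};
    for (const auto& p : prefixes) {
        if (opt.compare(0, p.size(), p) == 0) {
            const std::string digits = opt.substr(p.size());
            // Accept only a short, purely numeric suffix so std::stoi cannot throw.
            if (digits.empty() || digits.size() > 4 ||
                digits.find_first_not_of("0123456789") != std::string::npos) {
                return -1;
            }
            return std::stoi(digits);
        }
    }
    return -1;
}

// Hypothetical helper: returns the arch options that fall below min_arch,
// i.e. the ones that could plausibly trigger NVRTC_ERROR_INVALID_OPTION.
std::vector<std::string> unsupported_arch_options(
        const std::vector<std::string>& opts, int min_arch) {
    assert(min_arch > 0);
    std::vector<std::string> bad;
    for (const auto& o : opts) {
        const int arch = parse_compute_arch(o);
        if (arch >= 0 && arch < min_arch) {
            bad.push_back(o);
        }
    }
    return bad;
}
```

Called with the options printed in the original report ({"-std=c++14", "-arch=compute_30"}) and kMinArchCuda11, this returns the single entry "-arch=compute_30"; with kMinArchCuda10 it returns an empty list. It only inspects option strings and does not query the installed toolkit or GPU.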

@Robadob Thanks so much for investigating and sending over the known code sample.
@benbarsdell Thanks, I reverted to CUDA 10.2 and that resolved my issue.

Both of your input is much appreciated. I will close this now.