NVIDIA/jitify

Cannot use `<limits>` and `<cuda/std/limits>` in the same source file

shwina opened this issue · 4 comments

Invoking jitify with the following source file:

#include <limits>
#include <cuda/std/limits>

as follows:

jitify2_preprocess -std=c++11 -D__CUDACC_RTC__ test.hpp

results in:

Error processing source file test.hpp
Compilation failed: NVRTC_ERROR_COMPILATION
Compiler options: "-std=c++11 -D__CUDACC_RTC__ -include=jitify_preinclude.h -default-device"
detail/libcxx/include/limits(211): error: identifier "__CHAR_BIT__" is undefined

detail/libcxx/include/limits(312): error: identifier "__FLT_MANT_DIG__" is undefined

detail/libcxx/include/limits(313): error: identifier "__FLT_DIG__" is undefined

detail/libcxx/include/limits(321): error: identifier "__FLT_RADIX__" is undefined

detail/libcxx/include/limits(325): error: identifier "__FLT_MIN_EXP__" is undefined

<many more similar errors>

As a workaround I can do:

#include <limits>
#include <cuda/std/climits>
#include <cuda/std/limits>

@benbarsdell this is the same issue I reported a while ago. Did you have a chance to think about how to fix this?

I'll see if I can take another look at this later this week.

bdice commented

@benbarsdell Hi, any updates on this? I'm reviewing rapidsai/cudf#11287 and would like to understand the issue / what solutions might be possible.

I believe the root cause is that the header for #include <climits> is loaded from jitify's builtins and cached. Then, when #include "climits" is encountered within libcu++, jitify uses the cached builtin version instead of loading libcu++'s own header.
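To make the suspected collision concrete, here is a minimal sketch of a cache keyed only by the include name. The `NameOnlyCache` type, its `load` method, and the placeholder source strings are hypothetical and are not jitify's actual data structures; they only model the behaviour described above.

```cpp
#include <map>
#include <string>

// Hypothetical model of a header cache keyed only by the include name.
// This is NOT jitify's real implementation; it only illustrates how a
// name-only key cannot tell <climits> and "climits" apart.
struct NameOnlyCache {
  std::map<std::string, std::string> sources;  // include name -> source text

  // Returns the cached source if the name was seen before; otherwise
  // stores the freshly loaded source and returns it.
  const std::string& load(const std::string& name, const std::string& fresh) {
    // emplace does not overwrite an existing entry, so the first load wins.
    return sources.emplace(name, fresh).first->second;
  }
};

// Replays the sequence from the issue and returns what the second lookup
// (the one made from inside libcu++) receives.
inline std::string second_lookup_result() {
  NameOnlyCache cache;
  cache.load("climits", "builtin climits");         // #include <climits>
  return cache.load("climits", "libcu++ climits");  // #include "climits"
}
```

In this model the second lookup gets the builtin text back, because both spellings map to the same key.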

The solution will be to distinguish between #include <foo> and #include "foo" in the header cache. However, it is further complicated by the fact that NVRTC does not support such a distinction. I think the only way around that will be to automatically patch #include "foo" to #include </path/to/foo> (if and only if /path/to/foo exists).

Unfortunately this is easier said than done, which is why I haven't got to it yet.

In terms of workarounds, removing #include <limits> and just using the libcu++ version should work, if that's doable in your code. There may be other workarounds too.