IBM/aihwkit

CUDA 11 build compatibility

diego-plan9 opened this issue · 5 comments

Description and motivation

It seems that CUB has been bundled with the CUDA Toolkit since version 11, which can cause build failures (thanks @chaeunl for the valuable feedback and troubleshooting!):

$ python setup.py install -DUSE_CUDA=ON -DRPU_CUDA_ARCHITECTURES="80"
[ 14%] Built target cub
[ 50%] Built target RPU_CPU
[ 51%] Building CUDA object CMakeFiles/RPU_GPU.dir/src/rpucuda/cuda/bit_line_maker.cu.o
In file included from /usr/local/cuda-11.1/targets/x86_64-linux/include/thrust/system/cuda/detail/execution_policy.h:33:0,
                 from /usr/local/cuda-11.1/targets/x86_64-linux/include/thrust/iterator/detail/device_system_tag.h:23,
                 from /usr/local/cuda-11.1/targets/x86_64-linux/include/thrust/iterator/detail/iterator_facade_category.h:22,
                 from /usr/local/cuda-11.1/targets/x86_64-linux/include/thrust/iterator/iterator_facade.h:37,
                 from /.../aihwkit/_skbuild/linux-x86_64-3.8/cmake-build/cub-prefix/src/cub/cub/iterator/arg_index_input_iterator.cuh:48,
                 from /.../aihwkit/_skbuild/linux-x86_64-3.8/cmake-build/cub-prefix/src/cub/cub/device/device_reduce.cuh:41,
                 from /.../aihwkit/_skbuild/linux-x86_64-3.8/cmake-build/cub-prefix/src/cub/cub/cub.cuh:53,
                 from /.../aihwkit/src/rpucuda/cuda/bit_line_maker.cu:24:
/usr/local/cuda-11.1/targets/x86_64-linux/include/thrust/system/cuda/config.h:78:2: error: #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.
#error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.

Proposed solution

We should revise how CUB is used in the build system. Currently, we attempt to find it and, if that fails, automatically download and include the package. The download might not be needed at all for CUDA 11 (as CUB might already be on the default CUDA header paths). Alternatively, defining THRUST_IGNORE_CUB_VERSION_CHECK might bypass the check and allow the downloaded version to be used (which might not be ideal, though).
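As a rough illustration of the decision described above (written in shell, not the project's actual CMake logic), the check could read the toolkit's major version and only fall back to a downloaded CUB below version 11. All function names here are hypothetical. The sketch assumes the usual "release X.Y" line in the output of nvcc --version and that the toolkit keeps CUB at include/cub/cub.cuh under the CUDA root.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of aihwkit's build system.

# Print the CUDA major version parsed from `nvcc --version` output on stdin.
# Assumes a line such as: "Cuda compilation tools, release 11.1, V11.1.105".
cuda_major_version() {
    sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p'
}

# Succeed if the toolkit rooted at $1 already ships the CUB headers.
toolkit_has_cub() {
    [ -f "$1/include/cub/cub.cuh" ]
}

# Print where CUB should come from.
# $1 = CUDA major version, $2 = CUDA root directory.
choose_cub_source() {
    case "$1" in
        ''|*[!0-9]*)
            # Version could not be determined: report it instead of guessing.
            echo unknown
            return 0
            ;;
    esac
    if [ "$1" -ge 11 ] && toolkit_has_cub "$2"; then
        echo bundled
    else
        echo download
    fi
}
```

A call might look like choose_cub_source "$(nvcc --version | cuda_major_version)" /usr/local/cuda-11.1. Reporting "unknown" separately is a design choice of this sketch, so that an undetected CUDA version does not silently trigger a download.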

Alternatives and other information

It should be possible to compile under both Linux and Windows with CUDA 11 after #68 - please comment or reopen for any follow-ups.

How does one "fix" this?

This is the error I get:

/usr/lib/cuda-11.2/include/thrust/system/cuda/config.h:78:2: error: #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.

And I have no idea how to "Define THRUST_IGNORE_CUB_VERSION_CHECK" or where I should "Define" it.

Can someone please explain it to me as if I was 5 years old? TIA

Hi:
In CUDA 11, CUB is indeed included, and the compilation should pick up that bundled CUB automatically. If AIHWKIT cannot find CUDA 11 (e.g. it finds CUDA 10), it downloads the CUB library. You may have first tried to compile AIHWKIT when it could not find CUDA 11 (or could not determine the CUDA version), which caused it to download CUB. A later compile against CUDA 11 would then find the downloaded CUB on the include path, where it conflicts with the bundled one.

In that case, the fix is to delete the folder, clone the AIHWKIT repository again, and compile from a clean state. If CUDA 11 is installed correctly, the build should automatically find the CUB library shipped with CUDA 11.
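If re-cloning is inconvenient, a lighter variant might be to remove only the build tree. The sketch below assumes the stale CUB lives under _skbuild/.../cub-prefix, as the include paths in the error log in the issue description suggest. The function names are hypothetical, and this is not an official AIHWKIT script.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not shipped with aihwkit.

# List CUB checkouts that an earlier build downloaded into the
# scikit-build tree of the repository at $1 (the cub-prefix directory
# that appears in the error log).
find_stale_cub() {
    [ -d "$1/_skbuild" ] || return 0
    find "$1/_skbuild" -type d -name cub-prefix
}

# Remove the whole _skbuild tree of the repository at $1 if it contains
# a downloaded CUB, so that the next build starts from a clean state.
clean_stale_build() {
    if [ -n "$(find_stale_cub "$1")" ]; then
        rm -rf "$1/_skbuild"
        echo "removed $1/_skbuild"
    else
        echo "no downloaded CUB found in $1"
    fi
}
```

After running clean_stale_build on the repository directory, the build command from the issue description can be rerun. If the error remains, a full re-clone as described above is the safer route.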

If the problem persists, it would help if you could describe how you tried to compile AIHWKIT, which operating system you are using, and so on.

I was able to fix it (for NVBio) using this guide here: https://githubmemory.com/repo/NVlabs/nvbio/issues/41

Basically, I told it to ignore the check by adding the line below to the top-level CMakeLists.txt file:

add_compile_definitions(THRUST_IGNORE_CUB_VERSION_CHECK)

In config.h (the Thrust header named in the error message), find the line containing the following statement:
"#ifndef THRUST_IGNORE_CUB_VERSION_CHECK"

Add this line directly above it:
"#define THRUST_IGNORE_CUB_VERSION_CHECK true"

Then uncomment the line:
"#else THRUST_IGNORE_CUB_VERSION_CHECK"

With these changes, it works perfectly!