llvm/llvm-project

Various cuda compiling problems on Windows

cloudhan opened this issue · 8 comments

NOTE: empty.cu is simply an empty file.


Incompatible STL provided by MSVC toolchain

clang++.exe -std=c++14 -x cuda -c .\empty.cu --cuda-path=$env:CUDA_PATH_V10_1 --cuda-gpu-arch=sm_61

This results in the following error:

In file included from <built-in>:1:
In file included from C:\LLVM\LLVM-main-win64\lib\clang\15.0.0\include\__clang_cuda_runtime_wrapper.h:41:
In file included from C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include\cmath:9:
In file included from C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include\yvals.h:9:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.29.30133\include\yvals_core.h:545:2: error: STL1002: Unexpected compiler version, expected CUDA 10.1 Update 2 or newer.
#error STL1002: Unexpected compiler version, expected CUDA 10.1 Update 2 or newer.
 ^
1 error generated when compiling for sm_61.

The root cause is this section in microsoft/STL yvals_core.h

Adding -D_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH to the compile command works around it:

clang++.exe -std=c++14 -x cuda -c .\empty.cu --cuda-path=$env:CUDA_PATH_V10_1 --cuda-gpu-arch=sm_61 -D_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH


unknown type name 'uint32_t'

This one cannot be fixed from the command line.

NOTE: this section uses $env:CUDA_PATH_V11_2, whereas the previous section used $env:CUDA_PATH_V10_1.

clang++.exe -std=c++14 -x cuda -c .\empty.cu --cuda-path=$env:CUDA_PATH_V11_2 --cuda-gpu-arch=sm_61 -D_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH

results in:

In file included from <built-in>:1:
In file included from C:\LLVM\LLVM-main-win64\lib\clang\15.0.0\include\__clang_cuda_runtime_wrapper.h:473:
C:\LLVM\LLVM-main-win64\lib\clang\15.0.0\include\__clang_cuda_intrinsics.h:512:19: error: unknown type name 'uint32_t'; did you mean 'cuuint32_t'?
__device__ inline uint32_t __nvvm_get_smem_pointer(void *__ptr) {
                  ^
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2/include\cuda.h:55:26: note: 'cuuint32_t' declared here
typedef unsigned __int32 cuuint32_t;
                         ^
1 error generated when compiling for sm_61.

Adding -### gives the following output:

clang version 15.0.0 (https://github.com/llvm/llvm-project.git 0e1d2007aa3c63597c6965dd265055da78bf7c51)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\LLVM\LLVM-main-win64\bin
 "C:\\LLVM\\LLVM-main-win64\\bin\\clang++.exe" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-pc-windows-msvc" "-S" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "empty.cu" "-mrelocation-model" "static" "-mframe-pointer=all" "-fno-rounding-math" "-fno-verbose-asm" "-no-integrated-as" "-aux-target-cpu" "x86-64" "-fcuda-is-device" "-mllvm" "-enable-memcpyopt-without-libcalls" "-mlink-builtin-bitcode" "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.2/nvvm/libdevice/libdevice.10.bc" "-target-feature" "+ptx72" "-target-sdk-version=11.2" "-target-cpu" "sm_61" "-target-feature" "+ptx72" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-debugger-tuning=gdb" "-fno-dwarf-directory-asm" "-resource-dir" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0" "-internal-isystem" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0\\include\\cuda_wrappers" "-include" "__clang_cuda_runtime_wrapper.h" "-D" "_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH" "-D" "_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH" "-internal-isystem" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\ATLMFC\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.8\\include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\cppwinrt" "-internal-isystem" 
"C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.2/include" "-internal-isystem" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\ATLMFC\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.8\\include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\cppwinrt" "-std=c++14" "-fdeprecated-macro" "-fno-autolink" "-fdebug-compilation-dir=D:\\rules_cuda\\examples\\basic" "-ferror-limit" "19" "-fmessage-length=318" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.29.30141" "-fdelayed-template-parsing" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-cuid=25dcc15944fa3f58" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "C:\\Users\\GUANGY~1\\AppData\\Local\\Temp\\empty-90ac4b\\empty-sm_61.s" "-x" "cuda" ".\\empty.cu"
 "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.2/bin\\ptxas" "-m64" "-O0" "--gpu-name" "sm_61" "--output-file" "C:\\Users\\GUANGY~1\\AppData\\Local\\Temp\\empty-78d68a\\empty-sm_61.o" "C:\\Users\\GUANGY~1\\AppData\\Local\\Temp\\empty-90ac4b\\empty-sm_61.s"
 "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.2/bin\\fatbinary" "-64" "--create" "C:\\Users\\GUANGY~1\\AppData\\Local\\Temp\\empty-17b90b.fatbin" "--image=profile=sm_61,file=C:\\Users\\GUANGY~1\\AppData\\Local\\Temp\\empty-78d68a\\empty-sm_61.o" "--image=profile=compute_61,file=C:\\Users\\GUANGY~1\\AppData\\Local\\Temp\\empty-90ac4b\\empty-sm_61.s"
 "C:\\LLVM\\LLVM-main-win64\\bin\\clang++.exe" "-cc1" "-triple" "x86_64-pc-windows-msvc19.29.30141" "-target-sdk-version=11.2" "-aux-triple" "nvptx64-nvidia-cuda" "-emit-obj" "-mrelax-all" "-mincremental-linker-compatible" "--mrelax-relocations" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "empty.cu" "-mrelocation-model" "pic" "-pic-level" "2" "-mframe-pointer=none" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "x86-64" "-tune-cpu" "generic" "-mllvm" "-treat-scalable-fixed-error-as-warning" "-fcoverage-compilation-dir=D:\\rules_cuda\\examples\\basic" "-resource-dir" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0" "-internal-isystem" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0\\include\\cuda_wrappers" "-include" "__clang_cuda_runtime_wrapper.h" "-D" "_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH" "-internal-isystem" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\ATLMFC\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.8\\include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\cppwinrt" "-internal-isystem" "C:\\LLVM\\LLVM-main-win64\\lib\\clang\\15.0.0\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual 
Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\ATLMFC\\include" "-internal-isystem" "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Enterprise\\VC\\Tools\\MSVC\\14.29.30133\\include" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\NETFXSDK\\4.8\\include\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\ucrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\shared" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\um" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\winrt" "-internal-isystem" "C:\\Program Files (x86)\\Windows Kits\\10\\include\\10.0.19041.0\\cppwinrt" "-internal-isystem" "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.2/include" "-std=c++14" "-fdeprecated-macro" "-fdebug-compilation-dir=D:\\rules_cuda\\examples\\basic" "-ferror-limit" "19" "-fmessage-length=318" "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" "-fms-compatibility-version=19.29.30141" "-fdelayed-template-parsing" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-fcuda-include-gpubinary" "C:\\Users\\GUANGY~1\\AppData\\Local\\Temp\\empty-17b90b.fatbin" "-cuid=25dcc15944fa3f58" "-faddrsig" "-o" "empty.o" "-x" "cuda" ".\\empty.cu"

uint32_t is referenced because of "-include" "__clang_cuda_runtime_wrapper.h". Manually adding -includestdint.h does not work around the problem.

#error STL1002: Unexpected compiler version, expected CUDA 10.1 Update 2 or newer.
 ^
1 error generated when compiling for sm_61.

The root cause is this section in microsoft/STL yvals_core.h

This particular check should not have been triggered for clang, as it does not define __CUDACC_VER_MAJOR__:
https://github.com/microsoft/STL/blob/27877181dc50fc5f0dc9d679703437eb105e2b9f/stl/inc/yvals_core.h#L591
This issue was fixed a few months back: microsoft/STL#2075
It's possible that the headers you are actually using don't have the fix yet.
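
To illustrate why a version check can fire when the version macro is missing, here is a minimal sketch. All DEMO_* names are hypothetical stand-ins (DEMO_CUDACC for __CUDACC__, DEMO_CUDACC_VER_MAJOR for __CUDACC_VER_MAJOR__, DEMO_ALLOW_MISMATCH for _ALLOW_COMPILER_AND_STL_VERSION_MISMATCH), and the guarded form is only one way a header could avoid the problem, not necessarily what microsoft/STL#2075 does. The sketch relies on the standard rule that an undefined identifier inside #if evaluates to 0.

```cpp
// Hypothetical stand-ins for the macros discussed in this thread.
// DEMO_CUDACC is defined, DEMO_CUDACC_VER_MAJOR is deliberately left
// undefined, mirroring what is reported above for clang.
#define DEMO_CUDACC 1

// Unguarded check: the undefined DEMO_CUDACC_VER_MAJOR evaluates to 0 in #if,
// so "0 < 10" is true and the "compiler too old" branch is taken.
#if defined(DEMO_CUDACC) && (DEMO_CUDACC_VER_MAJOR < 10)
constexpr bool unguarded_check_fires = true;
#else
constexpr bool unguarded_check_fires = false;
#endif

// Guarded check: compare the version only when the macro actually exists.
#if defined(DEMO_CUDACC) && defined(DEMO_CUDACC_VER_MAJOR) && (DEMO_CUDACC_VER_MAJOR < 10)
constexpr bool guarded_check_fires = true;
#else
constexpr bool guarded_check_fires = false;
#endif

// Escape hatch: an opt-out macro (passed as -D on the command line in the
// real workaround) disables the check entirely.
#define DEMO_ALLOW_MISMATCH
#if !defined(DEMO_ALLOW_MISMATCH) && defined(DEMO_CUDACC) && (DEMO_CUDACC_VER_MAJOR < 10)
constexpr bool check_fires_with_escape_hatch = true;
#else
constexpr bool check_fires_with_escape_hatch = false;
#endif

static_assert(unguarded_check_fires, "missing version macro is treated as 0");
static_assert(!guarded_check_fires, "defined() guard skips the comparison");
static_assert(!check_fires_with_escape_hatch, "opt-out macro disables the check");
```

In the real headers the taken branch is an #error rather than a constant, which is why the build stops with STL1002.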

@Artem-B FYI, adding #include "stdint.h" after

fixes the second problem.
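
As a rough host-only illustration of the second problem, the sketch below shows that uint32_t is only a known type name once stdint.h (or an equivalent header) has been included ahead of its first use. The function demo_low_32_bits is a hypothetical stand-in and is not the real __nvvm_get_smem_pointer; it is assumed here only to have a similar signature.

```cpp
#include <stdint.h>  // declares uint32_t and uintptr_t; must precede their first use

// Hypothetical stand-in with a signature similar to the one in the error
// message. Removing the include above makes this line fail with
// "unknown type name 'uint32_t'" on a compiler that has not seen stdint.h.
inline uint32_t demo_low_32_bits(const void* ptr) {
    // Convert the pointer to an integer, then keep only the low 32 bits.
    return static_cast<uint32_t>(reinterpret_cast<uintptr_t>(ptr));
}

static_assert(sizeof(uint32_t) == 4, "uint32_t is exactly 32 bits wide");
```

The point is ordering: a header that uses uint32_t has to include its declaration itself rather than rely on something else having pulled it in earlier.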

@tstellar Since this blocks using clang CUDA on Win, can we backport c231471?

@tstellar backporting it should be safe.

/cherry-pick c231471

/branch llvm/llvm-project-release-prs/issue54609

Merged: 79147e4