ROCm/ROCm-Device-Libs

6.0.2 [Issue]: build fails

Closed this issue · 9 comments

Problem Description

cmake settings ("cmake -L" output)

-- Cache values
CMAKE_BUILD_TYPE:STRING=RelWithDebInfo
CMAKE_INSTALL_PREFIX:PATH=/usr
CPACK_GENERATOR:STRING=DEB;RPM
Clang_DIR:PATH=/usr/lib64/cmake/clang
LLVM_DIR:PATH=/usr/lib64/cmake/llvm
ROCM_CCACHE_BUILD:BOOL=OFF
ROCM_DEVICE_LIBS_BITCODE_INSTALL_LOC_NEW:STRING=
ROCM_DEVICE_LIBS_BITCODE_INSTALL_LOC_OLD:STRING=
ROCM_DIR:PATH=ROCM_DIR-NOTFOUND

Operating System

Linux x86/64

CPU

Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz

GPU

N/A

ROCm Version

ROCm 6.0.0
6.0.2

ROCm Component

ROCm-Device-Libs

Steps to Reproduce

  • configure source tree using cmake
  • make

Build fails with:

[ 77%] Generating cg.bc
cd /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/x86_64-redhat-linux-gnu/ockl && /usr/bin/clang-18 -I/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/ockl/../irif/inc -I/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/ockl/../oclc/inc -I/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/ockl/inc -fcolor-diagnostics -Werror -Wno-error=atomic-alignment -x cl -Xclang -cl-std=CL2.0 -target amdgcn-amd-amdhsa -fvisibility=protected -fomit-frame-pointer -Xclang -finclude-default-header -Xclang -fexperimental-strict-floating-point -nogpulib -cl-no-stdinc -Xclang -mcode-object-version=none -emit-llvm -Xclang -mlink-builtin-bitcode -Xclang /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/x86_64-redhat-linux-gnu/irif/irif.bc -c /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/ockl/src/cg.cl -o /home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/x86_64-redhat-linux-gnu/ockl/cg.bc
/home/tkloczko/rpmbuild/BUILD/ROCm-Device-Libs-rocm-6.0.2/ockl/src/cg.cl:91:5: error: '__builtin_amdgcn_ds_gws_init' needs target feature gws
   91 |     __builtin_amdgcn_ds_gws_init(nwm1, rid);
      |     ^
1 error generated.

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

N/A

Additional Information

N/A

Hello @kloczek. This is happening because your device library sources are not in sync with your compiler. (Here, I think your device libs sources are too old.) We moved the device library sources into https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs to ensure that a single hash of compiler, device libs, comgr, and hipcc are consistent.

> Hello @kloczek. This is happening because your device library sources are not in sync with your compiler.

What do you mean by "not in sync with your compiler"?
Do you mean that the latest release is not ready to be used with LLVM 18.1.x? 🤔
If so, why is there no check of those versions in cmake? 🤔

All I mean to say is that the device libs and compiler can't get too far apart without hitting problems like you've hit.

I'm not sure what you mean by "latest release" but, e.g. the device libs sources tagged for ROCm 6.1 may indeed be too old for LLVM 18.1.x. On the other hand, the latest staging sources in https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs may be too new for LLVM 18.1.x.

Regarding versioning, I'm not really sure it's possible. There can be, and have been, instances of a given hash of the device library not building with any LLVM release compiler.

> All I mean to say is that the device libs and compiler can't get too far apart without hitting problems like you've hit.
>
> I'm not sure what you mean by "latest release" but, e.g. the device libs sources tagged for ROCm 6.1 may indeed be too old for LLVM 18.1.x. On the other hand, the latest staging sources in https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs may be too new for LLVM 18.1.x.

OK, so do you have any plans to release a ROCm version that can be used with LLVM 18.1.4? 🤔
It has already been almost 2 months since LLVM 18.1.0.

If an exact ROCm version needs to be used with an exact version of the LLVM stack, it would be really good to add a check of the LLVM component versions instead of being surprised by errors at compile time.
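
A check like the one requested here could be approximated outside CMake with a small pre-configure guard. This is only a sketch: the function names `llvm_major` and `check_llvm_major` are hypothetical and not part of ROCm-Device-Libs, and the expected major version is something the packager would have to choose. A matching major version also does not guarantee a successful build, since some device-libs hashes reportedly do not build with any LLVM release compiler.

```shell
#!/bin/sh
# Hypothetical pre-configure guard; not part of ROCm-Device-Libs.

# Print the major component of a dotted version string, e.g. 18.1.4 -> 18.
llvm_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed only if the version's major component equals the expected one.
# $1: version string (for example, the output of `llvm-config --version`)
# $2: expected major version (an assumption chosen by the packager)
check_llvm_major() {
    if [ "$(llvm_major "$1")" = "$2" ]; then
        return 0
    fi
    echo "LLVM $1 does not match expected major version $2" >&2
    return 1
}
```

It might be used before running cmake as `check_llvm_major "$(llvm-config --version)" 17 || exit 1`, where `17` is an assumed value for illustration.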

Each version of ROCm ships with an LLVM compiler matched to and tested with the device library and all other ROCm components. And the device libraries come prebuilt. Can you use those?
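
One way to see whether prebuilt device libraries are already present is to list the bitcode files in the install directory. A minimal sketch, assuming the bitcode lives under `/opt/rocm/amdgcn/bitcode`; that default path and the function name `list_device_libs` are assumptions and may need adjusting for a given installation.

```shell
#!/bin/sh
# Hypothetical helper: list *.bc files in a directory.
# $1: directory to search (defaults to an assumed ROCm bitcode location)
# Returns 0 if at least one bitcode file was found, 1 otherwise.
list_device_libs() {
    dir="${1:-/opt/rocm/amdgcn/bitcode}"
    found=1
    for f in "$dir"/*.bc; do
        # The glob stays unexpanded when nothing matches; skip that case.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
        found=0
    done
    return "$found"
}
```

Calling `list_device_libs || echo "no prebuilt bitcode found"` would then report whether anything is installed at the assumed location.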

> Each version of ROCm ships with an LLVM compiler matched to and tested with the device library and all other ROCm components

AFAIK ROCm-Device-Libs does not depend on anything other than the LLVM/C compiler.
Do you imply that what comes with LLVM is not up-to-date? 🤔

If you are asking whether the compiler shipping with each ROCm release is up-to-date with the latest LLVM release or LLVM trunk, the answer is no. After the release is branched, ROCm releases go through periods of stabilization and testing.

So why not push the necessary changes to LLVM? 🤔

Can you try building with the updated LLVM and device-library sources at https://github.com/ROCm/llvm-project and https://github.com/ROCm/llvm-project/tree/amd-staging/amd/device-libs?

If you're still having problems with the updated sources, can you open a new issue at https://github.com/ROCm/llvm-project with the 'device-libs' tag? Thanks!