ROCm/hipBLAS

hipBLAS compiled for CUDA cannot be found in CMake

Closed this issue · 4 comments

What is the expected behavior

  • In CMake, find_package(hipblas) should work.

What actually happens

  • After updating hip and hipblas to the 4.5 release, hipblas can no longer be found by CMake on an NVIDIA platform:
CMake Error at /opt/cmake-3.21.3-linux-x86_64/share/cmake-3.21/Modules/CMakeFindDependencyMacro.cmake:47 (find_package):
  By not providing "Findhip.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "hip", but
  CMake did not find one.

  Could not find a package configuration file provided by "hip" with any of
  the following names:

    hipConfig.cmake
    hip-config.cmake

  Add the installation prefix of "hip" to CMAKE_PREFIX_PATH or set "hip_DIR"
  to a directory containing one of the above files.  If "hip" provides a
  separate development package or SDK, be sure it has been installed.
Call Stack (most recent call first):
  /opt/rocm/hipblas/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
  CMakeLists.txt:239 (find_package)
  • This used to work fine on all previous releases.
  • A current workaround is to change this line from DEPENDS PACKAGE hip to DEPENDS PACKAGE HIP. Then after recompiling and installing hipBLAS, it works as expected.
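
For illustration, the workaround amounts to a one-word change in the export call of the hipBLAS CMake build. The sketch below assumes the line sits inside a `rocm_export_targets` call (from rocm-cmake). The `TARGETS` and `NAMESPACE` arguments shown are assumptions and may differ in the actual file. Only the `DEPENDS PACKAGE` value comes from this report.

```cmake
# Sketch of the workaround; the surrounding arguments are assumed,
# not copied from the hipBLAS sources.
rocm_export_targets(
    TARGETS roc::hipblas
    # Before: DEPENDS PACKAGE hip   (hipblas-config.cmake calls find_dependency(hip))
    DEPENDS PACKAGE HIP           # hipblas-config.cmake calls find_dependency(HIP)
    NAMESPACE roc::
)
```

A likely reason the capitalized name works: in module mode CMake looks for `Find<PackageName>.cmake` using the name exactly as written, so `HIP` can match a `FindHIP.cmake` on `CMAKE_MODULE_PATH`, while `hip` looks for `Findhip.cmake` and then falls back to the `hip-config.cmake` that the error above reports as missing.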

How to reproduce

  • On an NVIDIA machine, install hip 4.5.
  • Download/clone hipBLAS at the 4.5 release. Compile using ./install.sh -i --cuda
  • In a CMake project, using the following will output the above error.
find_package(HIP REQUIRED)
find_package(hipblas REQUIRED)
  • Looking at /opt/rocm/hipblas/lib/cmake/hipblas/hipblas-config.cmake:90 shows find_dependency(hip) whereas find_dependency(HIP) works fine (see workaround above).
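
The reproduction steps above can be condensed into a minimal `CMakeLists.txt`. This is a sketch under assumptions: the project name is hypothetical, and the two paths are guesses based on the install locations mentioned in this report (in particular, that `FindHIP.cmake` lives under `/opt/rocm/hip/cmake`); adjust them to the local install.

```cmake
cmake_minimum_required(VERSION 3.16)
project(hipblas_find_repro LANGUAGES CXX)  # hypothetical project name

# Assumed location of FindHIP.cmake; needed for find_package(HIP) in module mode.
list(APPEND CMAKE_MODULE_PATH "/opt/rocm/hip/cmake")

# Assumed prefixes; hipblas-config.cmake is under /opt/rocm/hipblas/lib/cmake/hipblas.
list(APPEND CMAKE_PREFIX_PATH "/opt/rocm" "/opt/rocm/hipblas")

find_package(HIP REQUIRED)      # succeeds in the reported setup
find_package(hipblas REQUIRED)  # reported to fail at find_dependency(hip) in hipblas-config.cmake
```

Configuring this project with `cmake -S . -B build` should be enough to trigger the error, since the failure occurs at configure time before any target is built.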

Environment

Ubuntu 20.04 with CUDA 11.5, CMake 3.21.3.
HIP 4.5 installed using debian packages. hipBLAS 4.5 compiled with --cuda.

@ckendrick Thank you for reporting this error and a special thanks for providing a workaround. We will reply to this issue when the workaround is implemented to fix the hipBLAS build on an NVIDIA machine with hip 4.5.

Hi @ckendrick,

I'm trying to reproduce the issue that you're seeing, and I was wondering if you could help. What exact steps did you take to install HIP 4.5 on your machine? Is ROCm installed as well?

Hi @daineAMD, thanks for your response! Here are the steps I took to install HIP 4.5 on my machine:

  1. Install the CUDA Toolkit by following the steps under the ".deb (local)" option for Ubuntu 20.04.
  2. Follow the Installation Guide to set up the ROCm repo:
wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/debian/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
sudo apt-get update
  3. HIP was installed with sudo apt install hip-dev hip-runtime-nvidia
  4. Some tools seem to rely on /opt/rocm/, so I created a symbolic link at /opt/rocm pointing to /opt/rocm-4.5.0/: sudo ln -s /opt/rocm-4.5.0/ /opt/rocm
  5. /opt/rocm/bin/ was added to my path in ~/.bashrc: export PATH="/opt/rocm/bin/:$PATH"

The ROCm stack is not fully installed (amdgpu drivers, dkms modules, etc.), but the above steps do result in the rocm-core package being installed. Based on my reading of the HIP installation guide for NVIDIA, I think this is all that should be required.

Let me know if you need any more information!

This should be resolved in #434, which will make it into a future release. Thank you @ckendrick for letting us know about this issue and having a fix ready.