Can't find `nvToolsExt` during build
kvablack commented
Hi, I'm trying to install TransformerEngine for JAX. I prefer to install cuda-toolkit via conda, and it seems like most CUDA libraries (e.g., cuDNN) are being found, but the install fails because it can't find `nvToolsExt`:
-- Found Threads: TRUE
-- cudnn found at /home/black/miniforge3/envs/monopi/lib/libcudnn.so.
CMake Warning (dev) at /home/black/miniforge3/envs/monopi/share/cmake-3.29/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to `find_package_handle_standard_args` (LIBRARY)
does not match the name of the calling package (CUDNN). This can lead to
problems in calling code that expects `find_package` result variables
(e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
cmake/FindCUDNN.cmake:44 (find_package_handle_standard_args)
CMakeLists.txt:24 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Found LIBRARY: /home/black/miniforge3/envs/monopi/include
-- cuDNN: /home/black/miniforge3/envs/monopi/lib/libcudnn.so
-- cuDNN: /home/black/miniforge3/envs/monopi/include
-- cudnn_adv_infer found at /home/black/miniforge3/envs/monopi/lib/libcudnn_adv_infer.so.
-- cudnn_adv_train found at /home/black/miniforge3/envs/monopi/lib/libcudnn_adv_train.so.
-- cudnn_cnn_infer found at /home/black/miniforge3/envs/monopi/lib/libcudnn_cnn_infer.so.
-- cudnn_cnn_train found at /home/black/miniforge3/envs/monopi/lib/libcudnn_cnn_train.so.
-- cudnn_ops_infer found at /home/black/miniforge3/envs/monopi/lib/libcudnn_ops_infer.so.
-- cudnn_ops_train found at /home/black/miniforge3/envs/monopi/lib/libcudnn_ops_train.so.
-- Found Python: /home/black/miniforge3/envs/monopi/bin/python3.10 (found version "3.10.14") found components: Interpreter Development.Module
-- JAX support: ON
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Found pybind11: /tmp/pip-req-build-d2rbhz82/.eggs/pybind11-2.12.0-py3.10.egg/pybind11/include (found version "2.12.0")
-- Configuring done (1.9s)
CMake Error at common/CMakeLists.txt:54 (target_link_libraries):
Target "transformer_engine" links to:
CUDA::nvToolsExt
but the target was not found. Possible reasons include:
* There is a typo in the target name.
* A find_package call is missing for an IMPORTED target.
* An ALIAS target is missing.
Even though it exists in `lib/`:
❯ find ~/miniforge3 -name "*nvToolsExt*"
/home/black/miniforge3/envs/monopi/targets/x86_64-linux/lib/libnvToolsExt.so.1
/home/black/miniforge3/envs/monopi/targets/x86_64-linux/lib/libnvToolsExt.so.1.0.0
/home/black/miniforge3/envs/monopi/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtOpenCL.h
/home/black/miniforge3/envs/monopi/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtSync.h
/home/black/miniforge3/envs/monopi/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExt.h
/home/black/miniforge3/envs/monopi/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtCudaRt.h
/home/black/miniforge3/envs/monopi/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtCuda.h
/home/black/miniforge3/envs/monopi/lib/libnvToolsExt.so.1
/home/black/miniforge3/envs/monopi/lib/libnvToolsExt.so.1.0.0
/home/black/miniforge3/pkgs/nsight-compute-2024.1.1.4-0/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtOpenCL.h
/home/black/miniforge3/pkgs/nsight-compute-2024.1.1.4-0/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtSync.h
/home/black/miniforge3/pkgs/nsight-compute-2024.1.1.4-0/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExt.h
/home/black/miniforge3/pkgs/nsight-compute-2024.1.1.4-0/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtCudaRt.h
/home/black/miniforge3/pkgs/nsight-compute-2024.1.1.4-0/nsight-compute/2024.1.1/host/target-linux-x64/nvtx/include/nvtx3/nvToolsExtCuda.h
/home/black/miniforge3/pkgs/cuda-nvtx-12.1.105-h59595ed_0/targets/x86_64-linux/lib/libnvToolsExt.so.1
/home/black/miniforge3/pkgs/cuda-nvtx-12.1.105-h59595ed_0/targets/x86_64-linux/lib/libnvToolsExt.so.1.0.0
/home/black/miniforge3/pkgs/cuda-nvtx-12.1.105-h59595ed_0/lib/libnvToolsExt.so.1
/home/black/miniforge3/pkgs/cuda-nvtx-12.1.105-h59595ed_0/lib/libnvToolsExt.so.1.0.0
/home/black/miniforge3/pkgs/cuda-nvtx-12.5.39-he02047a_0/targets/x86_64-linux/lib/libnvToolsExt.so.1
/home/black/miniforge3/pkgs/cuda-nvtx-12.5.39-he02047a_0/targets/x86_64-linux/lib/libnvToolsExt.so.1.0.0
/home/black/miniforge3/pkgs/cuda-nvtx-12.5.39-he02047a_0/lib/libnvToolsExt.so.1
/home/black/miniforge3/pkgs/cuda-nvtx-12.5.39-he02047a_0/lib/libnvToolsExt.so.1.0.0
timmoon10 commented
I see `CUDA::nvToolsExt` is deprecated as of CMake 3.25, but I don't see any indication that it's been removed. I see you're building with CMake 3.29, but I also build frequently with CMake 3.29.5 without problems. I wonder if there's some other difference in our build environments.
If the deprecation of `CUDA::nvToolsExt` is actually the root cause, the fix should just be switching to the `CUDA::nvtx3` target. Can you try building with #943?
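
To compare build environments, here is a minimal sketch (not part of TransformerEngine or of #943) that checks which `libnvToolsExt` files a prefix actually contains. It is motivated by an unconfirmed guess: the `find` output above lists only versioned files (`libnvToolsExt.so.1`, `libnvToolsExt.so.1.0.0`), while CMake's `find_library` looks for the unversioned `libnvToolsExt.so` by default on Linux. The helper name `classify_library` and the use of `CONDA_PREFIX` with the two `lib` directories from the listing are assumptions made for illustration.

```python
import os
from pathlib import Path


def classify_library(lib_dirs, name="nvToolsExt"):
    """Split matching shared-library files into (unversioned, versioned).

    Hypothetical helper: "unversioned" means exactly lib<name>.so,
    "versioned" means lib<name>.so.<something>.
    """
    unversioned, versioned = [], []
    for directory in map(Path, lib_dirs):
        if not directory.is_dir():
            continue  # skip directories that do not exist in this prefix
        for path in sorted(directory.glob(f"lib{name}.so*")):
            if path.name == f"lib{name}.so":
                unversioned.append(path)
            else:
                versioned.append(path)
    return unversioned, versioned


if __name__ == "__main__":
    # Assumption: the environment is active, so CONDA_PREFIX points at it.
    prefix = Path(os.environ.get("CONDA_PREFIX", "."))
    dirs = [prefix / "lib", prefix / "targets" / "x86_64-linux" / "lib"]
    unversioned, versioned = classify_library(dirs)
    print("unversioned:", [str(p) for p in unversioned])
    print("versioned:  ", [str(p) for p in versioned])
```

If the first list comes back empty in one environment but not in the other, that would be a concrete difference worth reporting here.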
zlsh80826 commented