AMReX-Codes/amrex

Cannot build with CUDA and profiling on CUDA versions >= 12.5

rho-novatron opened this issue · 9 comments

I'm trying to build on a fresh Ubuntu 24.04 install with the requirements installed via conda. With CUDA 12.5 or later installed, the build does not even get past CMake configuration, seemingly because nvtx3 is header-based rather than library-based like the old nvToolsExt.

There is some info here on how to use nvtx with cmake these days.

Here's the error message I get:

> cmake -S . -B build -DAMReX_GPU_BACKEND=CUDA

...

CMake Error at Tools/CMake/AMReXParallelBackends.cmake:71 (target_link_libraries):
  Target "amrex_3d" links to:

    CUDA::nvToolsExt

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  Src/CMakeLists.txt:40 (include)

-- Generating done (0.0s)
CMake Generate step failed.  Build files cannot be regenerated correctly.

Disabling profilers makes the build work:

> cmake -S . -B build -DAMReX_GPU_BACKEND=CUDA -DAMReX_BASE_PROFILE=OFF -DAMReX_TINY_PROFILE=OFF

...

-- Configuring done (0.6s)
-- Generating done (0.0s)
-- Build files have been written to: /home/rho/git/amrex/build

> cmake --build build -j 16

...

[ 99%] Building CUDA object Src/CMakeFiles/amrex_3d.dir/Particle/AMReX_ParticleContainerBase.cpp.o
[100%] Linking CUDA static library libamrex_3d.a
[100%] Built target amrex_3d

We have the same issue with gmake: CUDA 12.6 changed some headers in a way that makes it incompatible with the profiling.

I thought the issue had been resolved in #4064, which should be in 24.09. Which version of amrex are you using?

@zingale Do you still have the issue with the development branch?

This was on the current development branch as of today, commit 97fcea3.

#4064 seems to have no changes to any CMakeLists, so it might have fixed it for gmake but not for cmake. If I'm looking at the correct documentation (this), it seems that the headers need to be downloaded outside of cmake (or checked in, I guess) or fetched using the CMake Package Manager.

Oh, I hadn't realized that PR was merged. Indeed, it now links with GNU make.

I cannot reproduce the cmake issue. It somehow works for me. Maybe my cuda installation includes old files. @rho-novatron Does it work if you make the following change?

--- a/Tools/CMake/AMReXParallelBackends.cmake
+++ b/Tools/CMake/AMReXParallelBackends.cmake
@@ -68,7 +68,7 @@ if (  AMReX_GPU_BACKEND STREQUAL "CUDA"
 
         # nvToolsExt: if tiny profiler or base profiler are on.
         if (AMReX_TINY_PROFILE OR AMReX_BASE_PROFILE)
-            target_link_libraries(amrex_${D}d PUBLIC CUDA::nvToolsExt)
+            target_link_libraries(amrex_${D}d PUBLIC nvtx3-cpp)
         endif ()
     endforeach()
 

Maybe it also depends on cmake version. I am using 3.30.3.
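For reference, the target selection could also be made tolerant of both NVTX layouts. The following is a minimal sketch, not the change adopted in AMReX. It assumes CMake 3.25 or newer, where FindCUDAToolkit defines the header-only CUDA::nvtx3 target, and the helper name link_nvtx_if_available is hypothetical. Note that CUDA::nvtx3 exposes the header as nvtx3/nvToolsExt.h, so sources that include <nvToolsExt.h> directly would still need the legacy header on the include path.

```cmake
# Sketch only: assumes CMake >= 3.25 so that FindCUDAToolkit can define CUDA::nvtx3.
find_package(CUDAToolkit REQUIRED)

# Hypothetical helper, not part of AMReX: link whichever NVTX target exists.
function (link_nvtx_if_available _target)
    if (TARGET CUDA::nvtx3)
        # Header-only NVTX 3: adds include directories, no library to link.
        target_link_libraries(${_target} PUBLIC CUDA::nvtx3)
    elseif (TARGET CUDA::nvToolsExt)
        # Legacy library-based NVTX shipped with older toolkits.
        target_link_libraries(${_target} PUBLIC CUDA::nvToolsExt)
    else ()
        message(FATAL_ERROR
            "No NVTX target found; install the NVTX headers or disable profiling.")
    endif ()
endfunction ()
```

Inside the existing foreach loop, the call would then read link_nvtx_if_available(amrex_${D}d), guarded by the same AMReX_TINY_PROFILE OR AMReX_BASE_PROFILE condition as in the patch.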

Applying that patch does get me through configuration and generation, but then the build fails with:

> cmake --build build -j 16

...

[ 52%] Building CUDA object Src/CMakeFiles/amrex_3d.dir/Base/AMReX_GpuUtility.cpp.o
/home/rho/git/amrex/Src/Base/AMReX_GpuDevice.cpp:25:12: fatal error: nvToolsExt.h: No such file or directory
   25 | #  include <nvToolsExt.h>
      |            ^~~~~~~~~~~~~~
compilation terminated.

The only current conda package I have installed that provides nvToolsExt.h is nsight-compute, and that's not included by the current config. I'm trying to find a way to convince cmake to add that to the include path...
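One way to check whether CMake can see the header at all is a find_path probe. This is a rough sketch under stated assumptions: the cache variable name NVTX_LEGACY_INCLUDE_DIR is hypothetical, the search locations under CONDA_PREFIX are a guess at how a conda environment lays out CUDA headers, and amrex_3d is simply the target named in the error output above.

```cmake
find_package(CUDAToolkit REQUIRED)

# NVTX_LEGACY_INCLUDE_DIR is a hypothetical cache variable name.
# The CONDA_PREFIX hints and suffixes are assumptions about the conda layout.
find_path(NVTX_LEGACY_INCLUDE_DIR
    NAMES nvToolsExt.h
    HINTS ${CUDAToolkit_INCLUDE_DIRS} "$ENV{CONDA_PREFIX}"
    PATH_SUFFIXES include targets/x86_64-linux/include
)

if (NVTX_LEGACY_INCLUDE_DIR)
    message(STATUS "Found nvToolsExt.h in ${NVTX_LEGACY_INCLUDE_DIR}")
    # Make the legacy header visible to the library and its consumers.
    target_include_directories(amrex_3d PUBLIC ${NVTX_LEGACY_INCLUDE_DIR})
else ()
    message(WARNING
        "nvToolsExt.h not found; a package providing the NVTX headers may be missing.")
endif ()
```

If the probe reports nothing, the header is probably absent from the environment rather than merely off the include path, which points to a missing package instead of a CMake problem.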

I'm also on cmake 3.30.3.

I got it! Just installing the package nvidia::cuda-nvtx-dev makes the original, unpatched development branch work just fine. So it was my fault all along; I got confused by the error messages. I'll try to get WarpX to add the cuda-nvtx-dev package to its list of requirements, and then this should be fine as is. Closing the issue.