LLNL/Caliper

Issue linking to cupti with 2.6.0

When building 2.6.0 with CUDA 10.1.243 on Lassen, I get the following error when linking the user application:

/usr/bin/ld: warning: libcupti.so.10.1, needed by /usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/caliper-2.6.0-olcrcfanb2ezov4spvbsu732we7f7qc5/lib64/libcaliper.so.2.6.0, not found (try using -rpath or -rpath-link)
/usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/caliper-2.6.0-olcrcfanb2ezov4spvbsu732we7f7qc5/lib64/libcaliper.so.2.6.0: undefined reference to `cuptiSetEventCollectionMode@libcupti.so.10.1'
...
Many more undefined references

This is my CMake setup

find_package(caliper REQUIRED PATHS ${CALIPER_DIR})

With the following modification it works

find_package(caliper REQUIRED PATHS ${CALIPER_DIR})
blt_register_library(NAME caliper
                     INCLUDES ${caliper_INCLUDE_PATH}
                     LIBRARIES caliper /usr/tce/packages/cuda/cuda-10.1.243/extras/CUPTI/lib64/libcupti.so
                     TREAT_INCLUDES_AS_SYSTEM ON)

I think Caliper isn't exporting its dependency on cupti. It works with CUDA 11.4.1 but my guess is that's because libcupti is in the CUDA library directory which is already added to the dynamic library search path.
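For projects that do not use BLT, a similar workaround could be sketched in plain CMake by appending the CUPTI library to the imported target's link interface. This is only a sketch under assumptions: it assumes the Caliper package defines an imported target named `caliper` (as the `LIBRARIES caliper` entry above suggests), that `CUDA_TOOLKIT_ROOT_DIR` is provided by the user, and `caliper_consumer`, `my_app`, and `main.cpp` are placeholder names.

```cmake
cmake_minimum_required(VERSION 3.14)
project(caliper_consumer CXX)

find_package(caliper REQUIRED PATHS ${CALIPER_DIR})

# Assumption: CUDA_TOOLKIT_ROOT_DIR is set by the user, e.g. to
# /usr/tce/packages/cuda/cuda-10.1.243. With CUDA 10.1, libcupti lives
# under extras/CUPTI/lib64 rather than lib64, so both are searched.
find_library(CUPTI_LIBRARY
             NAMES cupti
             HINTS ${CUDA_TOOLKIT_ROOT_DIR}/extras/CUPTI/lib64
                   ${CUDA_TOOLKIT_ROOT_DIR}/lib64)

# Add libcupti to whatever the imported target already propagates,
# so every consumer of `caliper` also links against it.
if(CUPTI_LIBRARY AND TARGET caliper)
  set_property(TARGET caliper APPEND PROPERTY
               INTERFACE_LINK_LIBRARIES ${CUPTI_LIBRARY})
endif()

# Placeholder consumer target.
add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE caliper)
```

Because the library is given by full path, no hard-coded CUDA version appears in the consumer's build files beyond the toolkit root.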

Hi @corbett5 ,

Thanks for reporting. Normally Caliper adds the correct rpaths to its dependencies, including cupti. I'm not seeing this issue with a manual build. It might have something to do with spack: I think it rewrites the rpaths but does not know about the different link directory for CUPTI.

Is your workaround a viable solution for now? If it is, I might leave it alone, given that it only affects older CUDA versions. Otherwise, if spack is the issue, then the best fix might be to update the spack package to include the cupti rpath, or I can try exporting the cupti dependency in cmake.
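One possible shape for "exporting the cupti dependency in cmake" is sketched below as a fragment of a package config template expanded by `configure_file(... @ONLY)`. This is not Caliper's actual config file: the variable name `CUPTI_LIBRARY`, the helper variable `caliper_CUPTI_LIBRARY`, and the exported target name `caliper` are all assumptions made for illustration.

```cmake
# Hypothetical addition to a package config template (expanded with
# configure_file(... @ONLY) at build time). Not Caliper's actual file.

# Path to libcupti.so recorded when the package was configured; empty
# if CUPTI support was disabled. CUPTI_LIBRARY is an assumed name.
set(caliper_CUPTI_LIBRARY "@CUPTI_LIBRARY@")

# Assumes the exported targets file, which defines the `caliper`
# imported target, has been included earlier in this config file.
if(caliper_CUPTI_LIBRARY AND TARGET caliper)
  set_property(TARGET caliper APPEND PROPERTY
               INTERFACE_LINK_LIBRARIES "${caliper_CUPTI_LIBRARY}")
endif()
```

With something along these lines, a plain `find_package(caliper)` followed by linking against `caliper` would carry the CUPTI library along without any consumer-side changes.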

Spack passes CUPTI_PREFIX=/usr/tce/packages/cuda/cuda-10.1.243, but this seems OK because Caliper has some internal logic to find the real directory, and I can verify from the output that it does indeed find /usr/tce/packages/cuda/cuda-10.1.243/extras/CUPTI/lib64/libcupti.so.

With a manual build, does the dependency get propagated to the CMake package as well?

With a manual build the rpaths to cupti should be set correctly.

I think it gets set correctly with Spack as well. I can build the tests with Spack (but not run them, because they aren't installed). The issue is that when I build another executable that pulls in Caliper via find_package(caliper), it breaks. I think the solution to this lies outside of Spack and would require Caliper to properly export its cupti dependency. Here's the build output:

==> caliper: Executing phase: 'cmake'
==> [2021-10-13-12:39:04.187494] 'cmake' '-G' 'Unix Makefiles' '-DCMAKE_INSTALL_PREFIX:STRING=/usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/caliper-2.6.0-sct42xrimusssvhrp2aucrzvx2rh4mk2' '-DCMAKE_BUILD_TYPE:STRING=Release' '-DCMAKE_INTERPROCEDURAL_OPTIMIZATION:BOOL=OFF' '-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON' '-DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=OFF' '-DCMAKE_INSTALL_RPATH:STRING=/usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/caliper-2.6.0-sct42xrimusssvhrp2aucrzvx2rh4mk2/lib;/usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/caliper-2.6.0-sct42xrimusssvhrp2aucrzvx2rh4mk2/lib64;/usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/adiak-0.2.1-smljno4fetvudi7rj545gnjafal7ld7n/lib;/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-clang-11.0.1/lib;/usr/tce/packages/cuda/cuda-10.1.243/lib64' '-DCMAKE_PREFIX_PATH:STRING=/usr/WS2/corbett5/TC02350/uberenv-libs/linux-rhel7-ppc64le/clang-11.0.1/adiak-0.2.1-smljno4fetvudi7rj545gnjafal7ld7n;/usr/tce/packages/python/python-3.8.2;/usr/tce/packages/cuda/cuda-10.1.243;/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-clang-11.0.1;/usr/tce/packages/cmake/cmake-3.14.5' '-DPYTHON_EXECUTABLE=/usr/tce/packages/python/python-3.8.2/bin/python3.8' '-DBUILD_TESTING=On' '-DBUILD_DOCS=Off' '-DBUILD_SHARED_LIBS=On' '-DWITH_ADIAK=On' '-DWITH_GOTCHA=Off' '-DWITH_PAPI=Off' '-DWITH_LIBDW=Off' '-DWITH_LIBPFM=Off' '-DWITH_SOSFLOW=Off' '-DWITH_SAMPLER=Off' '-DWITH_MPI=On' '-DWITH_FORTRAN=Off' '-DWITH_LIBUNWIND=Off' '-DMPI_C_COMPILER=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-clang-11.0.1/bin/mpicc' '-DMPI_CXX_COMPILER=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-clang-11.0.1/bin/mpicxx' '-DCUDA_TOOLKIT_ROOT_DIR=/usr/tce/packages/cuda/cuda-10.1.243' '-DCUPTI_PREFIX=/usr/tce/packages/cuda/cuda-10.1.243' '-DWITH_NVTX=On' '-DWITH_CUPTI=On' 
'/usr/WS2/corbett5/TC02350/uberenv-libs/builds/spack-stage-caliper-2.6.0-sct42xrimusssvhrp2aucrzvx2rh4mk2/spack-src'
-- The C compiler identification is Clang 11.0.1
-- The CXX compiler identification is Clang 11.0.1
-- Check for working C compiler: /usr/WS2/corbett5/TC02350/uberenv-libs/spack/lib/spack/env/clang/clang
-- Check for working C compiler: /usr/WS2/corbett5/TC02350/uberenv-libs/spack/lib/spack/env/clang/clang -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/WS2/corbett5/TC02350/uberenv-libs/spack/lib/spack/env/clang/clang++
-- Check for working CXX compiler: /usr/WS2/corbett5/TC02350/uberenv-libs/spack/lib/spack/env/clang/clang++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/tce/packages/cuda/cuda-10.1.243 (found version "10.1") 
-- Found CUPTI: /usr/tce/packages/cuda/cuda-10.1.243/extras/CUPTI/lib64/libcupti.so  
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found MPI_C: /usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib/libmpiprofilesupport.so (found version "3.1") 
-- Found MPI_CXX: /usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib/libmpiprofilesupport.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- Found Python: /usr/tce/packages/python/python-3.8.2/bin/python3.8 (found version "3.8.2") found components:  Interpreter 
-- Configuring done
-- Generating done
-- Build files have been written to: /usr/WS2/corbett5/TC02350/uberenv-libs/builds/spack-stage-caliper-2.6.0-

And then further down, in the link line for libcaliper.so, it winds up being linked against /usr/tce/packages/cuda/cuda-10.1.243/extras/CUPTI/lib64/libcupti.so, which is correct.
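Since libcaliper.so itself links correctly, the failure is confined to the consumer's link step, where the linker warning suggests `-rpath` or `-rpath-link`. A minimal consumer-side sketch of the `-rpath-link` route follows. It assumes an imported target named `caliper`; `CUPTI_LIB_DIR`, `my_app`, and `main.cpp` are hypothetical names, and the `LINKER:` prefix requires CMake 3.13 or newer.

```cmake
cmake_minimum_required(VERSION 3.14)
project(caliper_consumer CXX)

find_package(caliper REQUIRED PATHS ${CALIPER_DIR})

# Hypothetical cache variable pointing at the directory that holds
# libcupti.so; the default is the path reported in the configure log.
set(CUPTI_LIB_DIR
    "/usr/tce/packages/cuda/cuda-10.1.243/extras/CUPTI/lib64"
    CACHE PATH "Directory containing libcupti.so")

# Placeholder consumer target.
add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE caliper)

# Tell the linker where to look for libraries that libcaliper.so needs.
# The LINKER: prefix forwards the flag through the compiler driver.
target_link_options(my_app PRIVATE
                    "LINKER:-rpath-link,${CUPTI_LIB_DIR}")
```

Note that `-rpath-link` only affects symbol resolution at link time. At run time, libcupti still has to be found through libcaliper.so's own rpath or the dynamic library search path, and the `CMAKE_INSTALL_RPATH` in the log above lists `cuda-10.1.243/lib64` but not the `extras/CUPTI/lib64` directory.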