-lcudart_static and -lcublas not found using meson build system
Hello,
I've been trying to compile SPRAL for a while now so that I can use it with Ipopt. First, when compiling with autotools and running make check, the ssids_test test fails with a segmentation fault. Now I'm trying to compile it with the meson build system, also without luck. I would appreciate it if you could help me with that.
These are the outputs corresponding to the commands in the README file:
meson setup builddir -Dexamples=true -Dtests=true -Dlibblas=openblas -Dliblapack=openblas -Dlibmetis=coinmetis
The Meson build system
Version: 1.4.0
Source dir: /home/ar612/Installers/SPRAL/spral
Build dir: /home/ar612/Installers/SPRAL/spral/builddir
Build type: native build
Project name: SPRAL
Project version: 2024.05.08
Fortran compiler for the host machine: gfortran (gcc 11.4.0 "GNU Fortran (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
Fortran linker for the host machine: gfortran ld.bfd 2.38
C compiler for the host machine: cc (gcc 11.4.0 "cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C linker for the host machine: cc ld.bfd 2.38
C++ compiler for the host machine: c++ (gcc 11.4.0 "c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C++ linker for the host machine: c++ ld.bfd 2.38
Host machine cpu family: x86_64
Host machine cpu: x86_64
Cuda compiler for the host machine: nvcc (nvcc 12.5.40
Build cuda_12.5.r12.5/compiler.34177558_0)
Cuda linker for the host machine: nvcc nvlink 12.5.40
Build cuda_12.5.r12.5/compiler.34177558_0
Library openblas found: YES
Library openblas found: YES
Library coinmetis found: YES
Library hwloc found: YES
Run-time dependency CUDA (modules: cudart_static, rt, pthread, dl, cublas) found: YES 12.5 (/usr/local/cuda-12.5)
Library m found: YES
Has header "cblas.h" : YES
Has header "hwloc.h" : YES
Build targets in project: 45
SPRAL 2024.05.08
User defined options
examples : true
libblas : openblas
liblapack: openblas
libmetis : coinmetis
tests : true
Found ninja-1.10.1 at /usr/bin/ninja
meson compile -C builddir
INFO: autodetecting backend as ninja
INFO: calculating backend command to run: /usr/bin/ninja -C /home/ar612/Installers/SPRAL/spral/builddir
ninja: Entering directory `/home/ar612/Installers/SPRAL/spral/builddir'
[118/195] Linking target libspral.so
FAILED: libspral.so
gfortran -o libspral.so libspral.so.p/interfaces_C_lsmr.f90.o libspral.so.p/interfaces_C_matrix_util.f90.o libspral.so.p/interfaces_C_random.f90.o libspral.so.p/interfaces_C_random_matrix.f90.o libspral.so.p/interfaces_C_rutherford_boeing.f90.o libspral.so.p/interfaces_C_scaling.f90.o libspral.so.p/interfaces_C_ssids.f90.o libspral.so.p/interfaces_C_ssmfe.f90.o libspral.so.p/interfaces_C_ssmfe_core.f90.o libspral.so.p/interfaces_C_ssmfe_expert.f90.o libspral.so.p/src_cuda_cuda.f90.o libspral.so.p/src_hw_topology_hw_topology.f90.o libspral.so.p/src_ssids_cpu_cpu_iface.f90.o libspral.so.p/src_ssids_cpu_subtree.f90.o libspral.so.p/src_ssids_gpu_alloc.f90.o libspral.so.p/src_ssids_gpu_cpu_solve.f90.o libspral.so.p/src_ssids_gpu_datatypes.f90.o libspral.so.p/src_ssids_gpu_dense_factor.f90.o libspral.so.p/src_ssids_gpu_factor.f90.o libspral.so.p/src_ssids_gpu_interfaces.f90.o libspral.so.p/src_ssids_gpu_smalloc.f90.o libspral.so.p/src_ssids_gpu_solve.f90.o libspral.so.p/src_ssids_gpu_subtree.f90.o libspral.so.p/src_ssids_akeep.f90.o libspral.so.p/src_ssids_anal.F90.o libspral.so.p/src_ssids_contrib.f90.o libspral.so.p/src_ssids_contrib_free.f90.o libspral.so.p/src_ssids_datatypes.f90.o libspral.so.p/src_ssids_fkeep.F90.o libspral.so.p/src_ssids_inform.f90.o libspral.so.p/src_ssids_profile_iface.f90.o libspral.so.p/src_ssids_ssids.f90.o libspral.so.p/src_ssids_subtree.f90.o libspral.so.p/src_ssmfe_core.f90.o libspral.so.p/src_ssmfe_expert.f90.o libspral.so.p/src_ssmfe_ssmfe.f90.o libspral.so.p/src_blas_iface.f90.o libspral.so.p/src_core_analyse.f90.o libspral.so.p/src_lapack_iface.f90.o libspral.so.p/src_lsmr.f90.o libspral.so.p/src_match_order.f90.o libspral.so.p/src_matrix_util.f90.o libspral.so.p/src_pgm.f90.o libspral.so.p/src_random.f90.o libspral.so.p/src_random_matrix.f90.o libspral.so.p/src_rutherford_boeing.f90.o libspral.so.p/src_scaling.f90.o libspral.so.p/src_timer.f90.o libspral.so.p/src_metis5_wrapper.F90.o libspral.so.p/src_hw_topology_guess_topology.cxx.o 
libspral.so.p/src_ssids_cpu_kernels_cholesky.cxx.o libspral.so.p/src_ssids_cpu_kernels_ldlt_app.cxx.o libspral.so.p/src_ssids_cpu_kernels_ldlt_nopiv.cxx.o libspral.so.p/src_ssids_cpu_kernels_ldlt_tpp.cxx.o libspral.so.p/src_ssids_cpu_kernels_wrappers.cxx.o libspral.so.p/src_ssids_cpu_NumericSubtree.cxx.o libspral.so.p/src_ssids_cpu_SymbolicSubtree.cxx.o libspral.so.p/src_ssids_cpu_ThreadStats.cxx.o libspral.so.p/src_ssids_profile.cxx.o libspral.so.p/src_compat.cxx.o libspral.so.p/src_omp.cxx.o libspral.so.p/src_cuda_api_wrappers.cu.o libspral.so.p/src_ssids_gpu_kernels_assemble.cu.o libspral.so.p/src_ssids_gpu_kernels_dense_factor.cu.o libspral.so.p/src_ssids_gpu_kernels_reorder.cu.o libspral.so.p/src_ssids_gpu_kernels_solve.cu.o libspral.so.p/src_ssids_gpu_kernels_syrk.cu.o -L/usr/lib/gcc/x86_64-linux-gnu/11 -L/usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/11/../../../../lib -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/11/../../.. -L/lib -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared -fPIC -Wl,-soname,libspral.so -fopenmp -Wl,--start-group -lstdc++ -lopenblas -lopenblas -lcoinmetis -lhwloc -lrt -lpthread -ldl -lcudart_static -lcublas -lm -lgfortran -Wl,--end-group
/usr/bin/ld: cannot find -lcudart_static: No such file or directory
/usr/bin/ld: cannot find -lcublas: No such file or directory
collect2: error: ld returned 1 exit status
[119/195] Compiling C++ object kernelst_cpp.p/tests_ssids_kernels_ldlt_app.cxx.o
ninja: build stopped: subcommand failed.
However, the CUDA library path is included in LD_LIBRARY_PATH, and both libraries are present in that directory:
echo $LD_LIBRARY_PATH
/usr/local/lib:/usr/local/cuda-12.5/lib64:/usr/local/cuda-12.5/lib64:/usr/local/cuda-12.5/lib64:
ls /usr/local/cuda-12.5/lib64
cmake libcufftw.so.11 libcusolver_lapack_static.a libnppicc.so.12.3.0.116 libnppisu.so.12.3.0.116 libnvjpeg_static.a
libaccinj64.so libcufftw.so.11.2.3.18 libcusolver_metis_static.a libnppicc_static.a libnppisu_static.a libnvperf_host.so
libaccinj64.so.12.5 libcufftw_static.a libcusolverMg.so libnppidei.so libnppitc.so libnvperf_host_static.a
libaccinj64.so.12.5.39 libcufile_rdma.so libcusolverMg.so.11 libnppidei.so.12 libnppitc.so.12 libnvperf_target.so
libcheckpoint.so libcufile_rdma.so.1 libcusolverMg.so.11.6.2.40 libnppidei.so.12.3.0.116 libnppitc.so.12.3.0.116 libnvptxcompiler_static.a
libcublasLt.so libcufile_rdma.so.1.10.0 libcusolver.so libnppidei_static.a libnppitc_static.a libnvrtc-builtins.so
libcublasLt.so.12 libcufile_rdma_static.a libcusolver.so.11 libnppif.so libnpps.so libnvrtc-builtins.so.12.5
libcublasLt.so.12.5.2.13 libcufile.so libcusolver.so.11.6.2.40 libnppif.so.12 libnpps.so.12 libnvrtc-builtins.so.12.5.40
libcublasLt_static.a libcufile.so.0 libcusolver_static.a libnppif.so.12.3.0.116 libnpps.so.12.3.0.116 libnvrtc-builtins_static.a
libcublas.so libcufile.so.1.10.0 libcusparse.so libnppif_static.a libnpps_static.a libnvrtc.so
libcublas.so.12 libcufile_static.a libcusparse.so.12 libnppig.so libnvblas.so libnvrtc.so.12
libcublas.so.12.5.2.13 libcufilt.a libcusparse.so.12.4.1.24 libnppig.so.12 libnvblas.so.12 libnvrtc.so.12.5.40
libcublas_static.a libcuinj64.so libcusparse_static.a libnppig.so.12.3.0.116 libnvblas.so.12.5.2.13 libnvrtc_static.a
libcudadevrt.a libcuinj64.so.12.5 libmetis_static.a libnppig_static.a libnvfatbin.so libnvToolsExt.so
libcudart.so libcuinj64.so.12.5.39 libnppc.so libnppim.so libnvfatbin.so.12 libnvToolsExt.so.1
libcudart.so.12 libculibos.a libnppc.so.12 libnppim.so.12 libnvfatbin.so.12.5.39 libnvToolsExt.so.1.0.0
libcudart.so.12.5.39 libcupti.so libnppc.so.12.3.0.116 libnppim.so.12.3.0.116 libnvfatbin_static.a libOpenCL.so
libcudart_static.a libcupti.so.12 libnppc_static.a libnppim_static.a libnvJitLink.so libOpenCL.so.1
libcufft.so libcupti.so.2024.2.0 libnppial.so libnppist.so libnvJitLink.so.12 libOpenCL.so.1.0
libcufft.so.11 libcupti_static.a libnppial.so.12 libnppist.so.12 libnvJitLink.so.12.5.40 libOpenCL.so.1.0.0
libcufft.so.11.2.3.18 libcurand.so libnppial.so.12.3.0.116 libnppist.so.12.3.0.116 libnvJitLink_static.a libpcsamplingutil.so
libcufft_static.a libcurand.so.10 libnppial_static.a libnppist_static.a libnvjpeg.so stubs
libcufft_static_nocallback.a libcurand.so.10.3.6.39 libnppicc.so libnppisu.so libnvjpeg.so.12
libcufftw.so libcurand_static.a libnppicc.so.12 libnppisu.so.12 libnvjpeg.so.12.3.2.38
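One thing that may be worth checking here: `LD_LIBRARY_PATH` is read by the dynamic loader at run time, whereas `ld` searches the `-L` directories on the link line (and, when driven by GCC, the directories listed in `LIBRARY_PATH`). The failing `gfortran` command above contains no `-L/usr/local/cuda-12.5/lib64`, which may be why `-lcudart_static` and `-lcublas` are not found even though the files exist. A minimal sketch for checking this, assuming bash; the helper name `find_lib_dirs` is hypothetical and not part of Meson or SPRAL:

```shell
#!/usr/bin/env bash
# find_lib_dirs NAME PATHLIST
# Print each directory in the colon-separated PATHLIST that contains
# libNAME.a or libNAME.so (hypothetical helper, for illustration only).
find_lib_dirs() {
    local name="$1" pathlist="$2" dir
    local IFS=':'
    for dir in $pathlist; do
        [ -n "$dir" ] || continue
        if [ -e "$dir/lib$name.a" ] || [ -e "$dir/lib$name.so" ]; then
            printf '%s\n' "$dir"
        fi
    done
    return 0
}

# Report which link-time search directories (LIBRARY_PATH) hold the CUDA libraries.
echo "LIBRARY_PATH hits for cudart_static: $(find_lib_dirs cudart_static "${LIBRARY_PATH:-}")"
echo "LIBRARY_PATH hits for cublas: $(find_lib_dirs cublas "${LIBRARY_PATH:-}")"

# Possible workaround (assumption: the toolkit lives in /usr/local/cuda-12.5):
# export LIBRARY_PATH=/usr/local/cuda-12.5/lib64${LIBRARY_PATH:+:$LIBRARY_PATH}
```

If both lines print no directory, the GCC driver has no reason to look in the CUDA `lib64` directory at link time, regardless of what `LD_LIBRARY_PATH` contains.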
@amontoison shouldn't meson be picking up CUBLAS here?
@jfowkes the compilation must be done with the nvfortran compiler if you build for the GPU:
FC=nvfortran meson setup builddir ...
Dear @amontoison,
I've tried what you suggested, but it still doesn't work. Here are the outputs of the commands I used.
FC=nvfortran meson setup builddir -Dexamples=true -Dtests=true -Dlibblas=openblas -Dliblapack=openblas -Dlibmetis=coinmetis
The Meson build system
Version: 1.4.0
Source dir: /home/ar612/Installers/SPRAL/spral
Build dir: /home/ar612/Installers/SPRAL/spral/builddir
Build type: native build
Project name: SPRAL
Project version: 2024.05.08
Fortran compiler for the host machine: nvfortran (nvidia_hpc 24.7-0)
Fortran linker for the host machine: nvfortran pgi 24.7-0
C compiler for the host machine: cc (gcc 11.4.0 "cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C linker for the host machine: cc ld.bfd 2.38
C++ compiler for the host machine: c++ (gcc 11.4.0 "c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C++ linker for the host machine: c++ ld.bfd 2.38
Host machine cpu family: x86_64
Host machine cpu: x86_64
Cuda compiler for the host machine: nvcc (nvcc 12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0)
Cuda linker for the host machine: nvcc nvlink 12.5.40
Build cuda_12.5.r12.5/compiler.34177558_0
Library openblas found: YES
Library openblas found: YES
Library coinmetis found: YES
Library hwloc found: YES
Run-time dependency CUDA (modules: cudart_static, rt, pthread, dl, cublas) found: YES 12.5 (/usr/local/cuda-12.5)
Library m found: YES
Has header "cblas.h" : YES
Has header "hwloc.h" : YES
Build targets in project: 45
SPRAL 2024.05.08
User defined options
examples : true
libblas : openblas
liblapack: openblas
libmetis : coinmetis
tests : true
Found ninja-1.10.1 at /usr/bin/ninja
FC=nvfortran meson compile -C builddir
INFO: autodetecting backend as ninja
INFO: calculating backend command to run: /usr/bin/ninja -C /home/ar612/Installers/SPRAL/spral/builddir
ninja: Entering directory `/home/ar612/Installers/SPRAL/spral/builddir'
[108/195] Compiling Fortran object libspral.so.p/src_ssids_gpu_factor.f90.o
FAILED: libspral.so.p/src_ssids_gpu_factor.f90.o libspral.so.p/spral_ssids_gpu_factor.mod
nvfortran -Ilibspral.so.p -I. -I.. -Iinclude -I../include -Isrc -I../src -I/usr/local/cuda-12.5/include -O3 -mp -fPIC -module libspral.so.p -o libspral.so.p/src_ssids_gpu_factor.f90.o -c ../src/ssids/gpu/factor.f90
NVFORTRAN-F-0000-Internal compiler error. mk_assign_sptr: upper bound missing 0 (../src/ssids/gpu/factor.f90: 101)
NVFORTRAN/x86-64 Linux 24.7-0: compilation aborted
[110/195] Compiling C++ object kernelst_cpp.p/tests_ssids_kernels_ldlt_app.cxx.o
ninja: build stopped: subcommand failed.
It seems to be an issue with the new version of the nvfortran compiler. It was working with version 23.x.
@jfowkes Can you fix the error in src/ssids/gpu/factor.f90, or is the issue in the compiler?
@amontoison unfortunately this is an internal compiler error:
NVFORTRAN-F-0000-Internal compiler error. mk_assign_sptr: upper bound missing
and as such it is a bug introduced by NVIDIA.
@jfowkes I quickly checked line 101, and on line 100 we have an additional space in the goto:
https://github.com/ralna/spral/blob/master/src/ssids/gpu/factor.f90#L100
Maybe it's related?
@amontoison very well spotted! I will fix that now (it should work nonetheless).
@aleramos119 could you try again on the latest version from the master branch? If that fixes it I'll do a new release.
I'm sorry for the late response @jfowkes. I tried the latest version of the master branch, but the error persists.
rm -rf spral
git clone -b master https://github.com/ralna/spral.git
cd spral
FC=nvfortran meson setup builddir -Dexamples=true -Dtests=true -Dlibblas=openblas -Dliblapack=openblas -Dlibmetis=coinmetis
FC=nvfortran meson compile -C builddir
Cloning into 'spral'...
remote: Enumerating objects: 11545, done.
remote: Counting objects: 100% (1082/1082), done.
remote: Compressing objects: 100% (392/392), done.
remote: Total 11545 (delta 785), reused 828 (delta 687), pack-reused 10463 (from 1)
Receiving objects: 100% (11545/11545), 7.97 MiB | 5.38 MiB/s, done.
Resolving deltas: 100% (8686/8686), done.
The Meson build system
Version: 1.4.0
Source dir: /home/ar612/Installers/SPRAL/spral
Build dir: /home/ar612/Installers/SPRAL/spral/builddir
Build type: native build
Project name: SPRAL
Project version: 2024.05.08
Fortran compiler for the host machine: nvfortran (nvidia_hpc 24.7-0)
Fortran linker for the host machine: nvfortran pgi 24.7-0
C compiler for the host machine: cc (gcc 11.4.0 "cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C linker for the host machine: cc ld.bfd 2.38
C++ compiler for the host machine: c++ (gcc 11.4.0 "c++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")
C++ linker for the host machine: c++ ld.bfd 2.38
Host machine cpu family: x86_64
Host machine cpu: x86_64
Cuda compiler for the host machine: nvcc (nvcc 12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0)
Cuda linker for the host machine: nvcc nvlink 12.5.40
Build cuda_12.5.r12.5/compiler.34177558_0
Library openblas found: YES
Library openblas found: YES
Library coinmetis found: YES
Library hwloc found: YES
Run-time dependency CUDA (modules: cudart_static, rt, pthread, dl, cublas) found: YES 12.5 (/usr/local/cuda-12.5)
Library m found: YES
Has header "cblas.h" : YES
Has header "hwloc.h" : YES
Build targets in project: 45
SPRAL 2024.05.08
User defined options
examples : true
libblas : openblas
liblapack: openblas
libmetis : coinmetis
tests : true
Found ninja-1.10.1 at /usr/bin/ninja
INFO: autodetecting backend as ninja
INFO: calculating backend command to run: /usr/bin/ninja -C /home/ar612/Installers/SPRAL/spral/builddir
ninja: Entering directory `/home/ar612/Installers/SPRAL/spral/builddir'
[107/195] Compiling Fortran object libspral.so.p/src_ssids_gpu_factor.f90.o
FAILED: libspral.so.p/src_ssids_gpu_factor.f90.o libspral.so.p/spral_ssids_gpu_factor.mod
nvfortran -Ilibspral.so.p -I. -I.. -Iinclude -I../include -Isrc -I../src -I/usr/local/cuda-12.5/include -O3 -mp -fPIC -module libspral.so.p -o libspral.so.p/src_ssids_gpu_factor.f90.o -c ../src/ssids/gpu/factor.f90
NVFORTRAN-F-0000-Internal compiler error. mk_assign_sptr: upper bound missing 0 (../src/ssids/gpu/factor.f90: 101)
NVFORTRAN/x86-64 Linux 24.7-0: compilation aborted
[110/195] Compiling C++ object kernelst_cpp.p/tests_ssids_kernels_ldlt_app.cxx.o
ninja: build stopped: subcommand failed.
make: *** No rule to make target 'check'. Stop.
Thank you @aleramos119, so unfortunately this is an Nvidia internal compiler bug:
NVFORTRAN-F-0000-Internal compiler error. mk_assign_sptr: upper bound missing
Nothing we can do until Nvidia fix it I'm afraid (may be worth reporting to them).
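Since the thread reports that nvfortran 23.x compiled this file while 24.7 hits the internal compiler error, a build script could flag the installed release before configuring. A minimal sketch, assuming bash and that `nvfortran --version` prints the release in the `NN.N-N` form seen in the logs above (for example `24.7-0`); the function names are hypothetical, and the classification reflects only what this thread reports:

```shell
#!/usr/bin/env bash
# Read version text on stdin and print the first release found as MAJOR.MINOR
# (e.g. "24.7" from "nvfortran (nvidia_hpc 24.7-0)"). Prints nothing if none.
nvfortran_release() {
    grep -oE '[0-9]+\.[0-9]+-[0-9]+' | head -n 1 | cut -d- -f1
}

# Classify a release according to the reports in this thread only.
classify_nvfortran() {
    local release
    release=$(nvfortran_release)
    case "$release" in
        23.*) echo "reported-ok" ;;    # thread: "was working with version 23.x"
        24.7) echo "reported-ice" ;;   # thread: internal compiler error in factor.f90
        *)    echo "unknown" ;;        # not mentioned in the thread
    esac
}

# Only query the compiler if it is actually installed.
if command -v nvfortran >/dev/null 2>&1; then
    echo "nvfortran status: $(nvfortran --version 2>&1 | classify_nvfortran)"
fi
```

A result of `reported-ice` would suggest trying a 23.x release of the NVIDIA HPC SDK until the compiler bug is fixed; releases marked `unknown` are simply not covered by the reports here.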