gschramm/parallelproj

Linking issue with libpthread

Robert-PrescientImaging opened this issue · 9 comments

This is something that apparently occurs on modern Ubuntu systems.
Understanding of this issue has tested my knowledge but I think the fix/workaround is worth documenting.

I am building on the following system (WSL Ubuntu 22.04 LTS):

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:        22.04
Codename:       jammy

Installing with CUDA installed via sudo apt-get install nvidia-cuda-toolkit

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

I get the following error.

[ 52%] Linking CUDA device code CMakeFiles/parallelproj_cuda.dir/cmake_device_link.o
nvlink warning : Skipping incompatible '/lib/x86_64-linux-gnu/librt.a' when searching for -lrt
nvlink warning : Skipping incompatible '/usr/lib/x86_64-linux-gnu/librt.a' when searching for -lrt
nvlink warning : Skipping incompatible '/lib/x86_64-linux-gnu/libpthread.a' when searching for -lpthread
nvlink warning : Skipping incompatible '/usr/lib/x86_64-linux-gnu/libpthread.a' when searching for -lpthread
nvlink warning : Skipping incompatible '/lib/x86_64-linux-gnu/libdl.a' when searching for -ldl
nvlink warning : Skipping incompatible '/usr/lib/x86_64-linux-gnu/libdl.a' when searching for -ldl
nvlink fatal   : Could not open input file '/usr/lib/x86_64-linux-gnu/libpthread.a'
make[2]: *** [cuda/CMakeFiles/parallelproj_cuda.dir/build.make:211: cuda/CMakeFiles/parallelproj_cuda.dir/cmake_device_link.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:985: cuda/CMakeFiles/parallelproj_cuda.dir/all] Error 2
make: *** [Makefile:146: all] Error 2

I do not fully understand this issue but it related to /usr/lib/x86_64-linux-gnu/libpthread.a being only 8 bytes with modern version of libc?

The best description I found was here: https://matsci.org/t/lammps-users-kokkos-linker-error-nvidia-libdl-a/41050


The fix I implemented for this was to change OpenMP_pthread_LIBRARY from /usr/lib/x86_64-linux-gnu/libpthread.a to /usr/lib/x86_64-linux-gnu/libpthread.so.0

This allows parallelproj to build and the tests pass.

Any input from @gschramm or @KrisThielemans would be appreciated.

Interesting, at a first glance I have no idea why that is happening.

I just tried on my Ubuntu 22.04 system with cuda (11.8) installed from the ubuntu repo and I cannot reproduce the issue.

$lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.2 LTS
Release:	22.04
Codename:	jammy
$nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

@Robert-PrescientImaging can you try with CUDA 11.8 as well? In the conda-forge builds using cuda 10, 11.0, 11.1, 11.2, I have also never seen this issue.

Good to know. @KrisThielemans any idea why this is not a problem for the conda-forge builds with cuda < 11.7?

of course I don't! 😄 A guess: conda comes with its own compiler and libraries on Linux I believe, so maybe they did something else than Ubuntu.

Hehe. So should we just encourage Ubuntu users to use CUDA >= 11.7 then?

Another post https://forums.developer.nvidia.com/t/nvlink-fatal-could-not-open-input-file-when-linking-with-empty-static-library/208517/7 claims that this was fixed with cuda 11.7

Upgrading to CUDA 12.1 resolved the issue.

Hehe. So should we just encourage Ubuntu users to use CUDA >= 11.7 then?

You may want to but unless anyone else has this particularly niche issue, I suggest leaving this thread as documentation to refer to.

Just had the same issue for someone with cuda 11.5. (He also had another problem for Gadgetron which has been fixed in later cuda versions).

ok. just added a note + reference to this issue in the README for ubuntu users