Azure/azhpc-images

Nonexistent hardcoded paths in HPC-X libtool archive files

When linking with libtool against the HPC-X provided in the marketplace image, the build fails because of a hardcoded path in the dependency_libs metadata of the libtool archive (.la) files included with HPC-X. This path (/hpc/local/oss/...) does not exist in the image, so libtool cannot find the .la files it needs to complete dependency resolution, and the build fails.

Here is a list of all the .la files in HPC-X referencing the /hpc/local/oss/... path:

clusterkit/lib/libcuda_wrapper.la
hcoll/lib/hcoll/hmca_gpu_cuda.la
hcoll/lib/hcoll/hmca_bcol_nccl.la
hcoll/debug/lib/hcoll/hmca_gpu_cuda.la
hcoll/debug/lib/hcoll/hmca_bcol_nccl.la
nccl_rdma_sharp_plugin/lib/libnccl-net.la
ompi/lib/libmpi_usempi_ignore_tkr.la
ompi/lib/libmpi_mpifh.la
ompi/lib/libmpi_usempif08.la
ompi/tests/ipm-2.0.6/lib/libipmf.la
sharp/lib/libsharp_coll_cuda_wrapper.la
sharp/debug/lib/libsharp_coll_cuda_wrapper.la
ucx/mt/lib/ucx/libucx_perftest_cuda.la
ucx/mt/lib/ucx/libuct_xpmem.la
ucx/mt/lib/ucx/libuct_cuda.la
ucx/mt/lib/ucx/libuct_cuda_gdrcopy.la
ucx/lib/ucx/libucx_perftest_cuda.la
ucx/lib/ucx/libuct_xpmem.la
ucx/lib/ucx/libuct_cuda.la
ucx/lib/ucx/libuct_cuda_gdrcopy.la
ucx/prof/lib/ucx/libucx_perftest_cuda.la
ucx/prof/lib/ucx/libuct_xpmem.la
ucx/prof/lib/ucx/libuct_cuda.la
ucx/prof/lib/ucx/libuct_cuda_gdrcopy.la
ucx/debug/lib/ucx/libucx_perftest_cuda.la
ucx/debug/lib/ucx/libuct_xpmem.la
ucx/debug/lib/ucx/libuct_cuda.la
ucx/debug/lib/ucx/libuct_cuda_gdrcopy.la
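
For anyone who wants to reproduce a list like this on their own image, here is a minimal sketch in Python. It assumes the stale entries sit in the single-quoted dependency_libs='...' line of each .la file (the usual libtool layout); the function name find_stale_la_files and its parameters are made up for illustration and are not part of HPC-X or libtool.

```python
import os
import re

# Matches the dependency_libs='...' assignment that libtool writes into .la files.
DEPENDENCY_LIBS_RE = re.compile(r"^dependency_libs='([^']*)'", re.MULTILINE)


def find_stale_la_files(root, bad_prefix="/hpc/local/oss"):
    """Return sorted paths (relative to root) of .la files whose
    dependency_libs entry mentions bad_prefix.

    Both names are hypothetical: root is the HPC-X install directory,
    bad_prefix is the build-host path that does not exist in the image.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".la"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read()
            match = DEPENDENCY_LIBS_RE.search(text)
            if match is None:
                continue
            # Tokens look like -L/some/dir, -lfoo, or /some/dir/libfoo.la.
            if any(bad_prefix in token for token in match.group(1).split()):
                hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

Calling find_stale_la_files() with the HPC-X install directory as root should print paths in the same relative form as the list above, provided the assumption about the file layout holds.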

Two potential solutions:

  1. Fix the paths to point to the correct directories in the provided stack
  2. Remove all .la files and let the linker resolve the dependencies at link time

The second solution is preferable because it lets customers link correctly against any library, not only the ones provided in the HPC stack.
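
As a rough illustration of the second option, the sketch below renames the .la files instead of deleting them, so the change can be undone. This is only one way to do it, not something the image build currently does; the function name set_aside_la_files, the ".disabled" suffix, and the dry_run flag are all hypothetical choices.

```python
import os


def set_aside_la_files(root, suffix=".disabled", dry_run=True):
    """Rename every *.la file under root to *.la<suffix>.

    Returns the sorted list of original .la paths that were (or, with
    dry_run=True, would be) renamed. All names here are illustrative.
    """
    affected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".la"):
                continue
            source = os.path.join(dirpath, name)
            if not dry_run:
                # Renamed files no longer end in ".la", so libtool ignores
                # them and a second run of this function skips them.
                os.rename(source, source + suffix)
            affected.append(source)
    return sorted(affected)
```

Running it first with the default dry_run=True shows what would be touched; passing dry_run=False performs the rename. With the .la files out of the way, the linker resolves dependencies from the shared libraries themselves.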

I'm hitting the same issue building parallel netCDF against the marketplace HPC-X:

  653 libtool: warning: library '/opt/hpcx-v2.8.3-gcc-MLNX_OFED_LINUX-5.2-2.2.3.0-redhat7.9-x86_64/ompi/lib/libmpi_usempif08.la' was moved.
  654 /bin/grep: /hpc/local/oss/gcc-9.2.0/lib/../lib64/libgfortran.la: No such file or directory
  655 /bin/sed: can't read /hpc/local/oss/gcc-9.2.0/lib/../lib64/libgfortran.la: No such file or directory

I tried Davide's workaround mentioned above and removed /opt/hpcx-v2.8.3-gcc-MLNX_OFED_LINUX-5.2-2.2.3.0-redhat7.9-x86_64/ompi/lib/*.la, and everything links OK now.