openucx/ucx

Package version conflict

AtticusBeachy opened this issue · 1 comments

Describe the bug

When I try to install the CUDA build of UCX from the release .deb packages, dpkg refuses because the new ucx package conflicts with the already installed libucx0, which libopenmpi3 depends on. Forcing the installation with --auto-deconfigure succeeds, but leaves the package manager in a broken state.

Steps to Reproduce

Download UCX from https://github.com/openucx/ucx/releases
(Selected version: ucx-1.17.0-ubuntu22.04-mofed5-cuda12-x86_64)
Files are ucx-1.17.0.deb, ucx-cuda-1.17.0.deb, and ucx-xpmem-1.17.0.deb

Try to install first file:

sudo dpkg -i ucx-1.17.0.deb

Output error:

dpkg: considering removing libucx0:amd64 in favour of ucx ...
dpkg: no, cannot proceed with removal of libucx0:amd64 (--auto-deconfigure will help):
 libopenmpi3:amd64 depends on libucx0 (>= 1.12.1~rc2)
  libucx0:amd64 is to be removed.

dpkg: regarding ucx-1.17.0.deb containing ucx:
 ucx conflicts with libucx0 (<< 1.17.4ef9a09)
  libucx0:amd64 (version 1.12.1~rc2-1) is present and installed.

dpkg: error processing archive ucx-1.17.0.deb (--install):
 conflicting packages - not installing ucx
Errors were encountered while processing:
 ucx-1.17.0.deb

Using the command

sudo dpkg -i --auto-deconfigure ucx-1.17.0.deb

works, but breaks the package manager. If I try to install anything else, the package manager refuses to do so until I revert to the old version of UCX.
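Before forcing anything, the relationship fields that trigger the conflict can be read straight from the archive without touching the installed system. A minimal sketch, assuming `dpkg-deb` is on the PATH; the helper name `deb_relations` is hypothetical and only wraps the standard `dpkg-deb -f` call:

```shell
# Hypothetical helper (not part of dpkg): print the package relationship
# fields of a .deb file without installing it.
deb_relations() {
    deb="$1"
    if [ ! -f "$deb" ]; then
        echo "no such file: $deb" >&2
        return 1
    fi
    # dpkg-deb -f prints the requested control fields of the archive;
    # fields that the package does not declare are simply omitted.
    dpkg-deb -f "$deb" Package Version Depends Conflicts Replaces Provides
}
```

Running `deb_relations ucx-1.17.0.deb` should show the Conflicts entry quoted in the error above, and `apt-cache rdepends --installed libucx0` lists the installed packages (here libopenmpi3) that would be left without their dependency.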

Setup and versions

  • OS: Linux Mint 21.3 Cinnamon
  • CPU architecture: x86_64
  • GPU type: NVIDIA RTX 40 series (Lovelace architecture)
  • CUDA:
    • Driver version: CUDA 12.4
    • peer-direct loaded: no
    • gdrcopy loaded: no

Underlying goal

I am trying to install Open MPI with CUDA-aware support, as described here. This requires pointing to a ucx-cuda install during installation:

./configure --with-cuda=/usr/local/cuda --with-ucx=/path/to/ucx-cuda-install
make -j8 install

That is why I am trying to install the CUDA version of UCX. If there is a better way of achieving the goal, please let me know.

(For instance, if there were a way to compile the CUDA version of UCX from source, I believe that would solve the issue. I would use --prefix to install to a particular location and point Open MPI to that location.)

I figured out that there is a way to compile the CUDA version of UCX from source using the --with-cuda flag. After downloading the .tar.gz from https://github.com/openucx/ucx/releases, the appropriate commands are:

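tar -xzf <file>
cd <extracted-folder>
mkdir build
cd build
# --prefix selects the install location; --with-cuda points at the CUDA toolkit
../contrib/configure-release --prefix=/usr/local --with-cuda=/usr/local/cuda-12.4
make -j4
# writing to /usr/local usually requires root
sudo make install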
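
To confirm that the resulting build was really configured with CUDA before pointing Open MPI at it, one option is to look at the configure line that `ucx_info -v` reports. A minimal sketch, assuming the same `--prefix` as above; the helper name `ucx_has_cuda` is hypothetical and not part of UCX, and it only checks the recorded configure flags, not whether a GPU is usable at run time:

```shell
# Hypothetical helper (not part of UCX): succeeds if the ucx_info binary
# under the given prefix reports that UCX was configured with --with-cuda.
ucx_has_cuda() {
    prefix="${1:-/usr/local}"          # assumed install prefix
    info="$prefix/bin/ucx_info"
    if [ ! -x "$info" ]; then
        echo "ucx_info not found under $prefix" >&2
        return 2
    fi
    # 'ucx_info -v' prints the library version and the configure flags.
    # -e keeps grep from treating the pattern as an option.
    "$info" -v | grep -q -e '--with-cuda'
}
```

For example, `ucx_has_cuda /usr/local && echo "CUDA build"` checks the install from the commands above, and that same prefix is what goes into `--with-ucx=` on the Open MPI configure line. Once Open MPI is built, `ompi_info --parsable --all | grep mpi_built_with_cuda_support:value` reports whether its CUDA support was compiled in.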