TL;DR: Use a special conda
channel that ships packages compiled for ppc64le
, for convenience edit the .bashrc
to load the correct conda distribution for the processor architecture. This guide uses conda, to use docker images you can have a look here
On nodes that use the ppc64le
processor architecture instead of the "standard" x86_64
, any compiled package or tool will need to be recompiled for the new architecture. For Machine Learning workloads, this mostly concerns conda
and packages with a compiled backend like pytorch
or tensorflow
.
Compiling from source is tricky and can take a very long time, but luckily there are a few tricks to make things easier using specific open-source conda
channels which ship pre-compiled versions for ppc64le
. To enable a seamless experience across processor architecures, we need to take the following steps:
- Install the correct conda distribution (not just creating a new environment) for the processor architecture
- Edit your
.bashrc
so that the correct conda distribution is loaded depending on the current architecture - Create environments and install your packages
For ppc64le
, we need to use mambaforge, which is a fork of the regular conda
command and works exactly the same, except that you use mamba
instead of conda
. As a bonus, it is also much faster than conda
. For x86_64
, we can use whatever conda
distribution we prefer, or even pip
. For simplicity, I recommend to use mambaforge
as well.
More about mamba and conda
If you really want to, you can probably get away without using mamba
, if you do it is still recommended to use miniforge (same repo as mambaforge) instead of other conda
distributions because it is optimized for ppc64le
. Specifically for pytorch
, using mamba
makes the installation much easier (and also much faster).
If you use mamba
, you can still use conda
to issue the same commands. I.e. if you suspect some bug in mamba
is causing an install to fail you can also try conda install ...
. At your own risk (as in this might or might not casue issues), you can also choose to use mamba
only for the install commands (which is where mamba
makes a lot of improvements) and keep using conda
for the rest.
- Log into a node that uses the
ppc64le
architecture🚨 Never do development or installations on the
summon
node. You can usesalloc -A <account> --mem 5G
to quickly spawn a jobshell. - Download the mambaforge installation script for
ppc64le
from the mambaforge GitHub into your home directory (/hpi/fs00/home/<username>/
) - Install conda with
bash Mambaforge-Linux-ppc64le.sh
. I recommend to add a prefix to the install location, e.g./hpi/fs00/home/<username>/ppc-mambaforge
instead of/hpi/fs00/home/<username>/mambaforge
. Important: when asked whether to automatically doconda init
, decline. We will do this manually. - If you have no environment setup for
x86_64
or want to follow this guide exactly (recommended), repeat steps 1.-3. with with the installation script forx86_64
and a different prefix, e.g.x86-
.
We need to manually intialize the correct conda
distribution depending on the processor architecture. Paste the following snippet at the bottom of your .bashrc
. You need to replace with your username and adjust the install locations if you deviated from the guide.
Snippet
arch=$(uname -i)
if [[ $arch == x86_64* ]]; then
# echo "Executing x86 (${arch}) Architecture specific part "
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/hpi/fs00/home/<username>/x86-mambaforge/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/hpi/fs00/home/<username>/x86-mambaforge/etc/profile.d/conda.sh" ]; then
. "/hpi/fs00/home/<username>/x86-mambaforge/etc/profile.d/conda.sh"
else
export PATH="/hpi/fs00/home/<username>/x86-mambaforge/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
elif [[ $arch == ppc* ]]; then
# echo "Executing POWER (${arch}) Architecture specific part "
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/hpi/fs00/home/<username>/ppc-mambaforge/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/hpi/fs00/home/<username>/ppc-mambaforge/etc/profile.d/conda.sh" ]; then
. "/hpi/fs00/home/<username>/ppc-mambaforge/etc/profile.d/conda.sh"
else
export PATH="/hpi/fs00/home/<username>/ppc-mambaforge/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
fi
You can uncomment the echo
lines to verify that the snippet is working correctly. Beware that output in the .bashrc
can cause issues for tools like rsync
or sftp
, so make sure to comment them out again after you are done.
💡 Reload the
.bashrc
withsource ~/.bashrc
for the changes to take effect (or close the terminal and connect again).
For x86_64
, you can continue as you are used to. For ppc64le
please follow this guide.
-
(logged into a
ppc64le
node) Create a new environment and activate it. Supported Python versions are3.7
,3.8
and3.9
(April 2022). Important: the names for environments onx86_64
andppc64le
must not be the same.mamba create -n <cool env name> python=3.8
-
To install packages with a compiled backend, use the https://ftp.osuosl.org/pub/open-ce/current/ channel. You can inspect available packages compiled for
ppc64le
and their versions here.mamba install tensorflow -c https://ftp.osuosl.org/pub/open-ce/current/
💡 To always try to use this channel, run
conda config --prepend channels https://ftp.osuosl.org/pub/open-ce/current/
.Important for PyTorch
PyTorch is a bit tricky. We need to add the
defaults
channel to provide some minor dependencies. The following command was tested and works (April 2022). Just to be safe, installpytorch
as the first package before any others.mamba install pytorch -c https://ftp.osuosl.org/pub/open-ce/current/ -c defaults
Troubleshooting
- There can always be version conflicts or other errrors even with the pre-compiled packages from open-ce. If you have any issues, try Google or the open-ce GitHub for a solution.
- In general, try to install the "biggest" packages (like
pytorch
) first and one-by-one. - Try adding the
-v
flag to a conda command to get verbose output. - If certain packages are not found, try adding channels such as
-c defaults
,-c anaconda
or-c conda-forge
-
To install other packages, you can install them regularly with
mamba install
. I recommend to install pytorch first to avoid version conflicts.
To automatically activate the correct conda
environment on login, paste the following after the previous snippet we pasted into .bashrc
:
Snippet
if [[ $arch == x86_64* ]]; then
conda activate <x86_64 environment name>
else
conda activate <ppc64le environment name>
fi
🍾 Once you're done, give yourself a pad on the back and enjoy your dev environment! 🍾