Install Request: LAMMPS + GPU + KOKKOS for Young

Question

Closed this issue 4 months ago · 6 comments

It appears that there is not a GPU version of LAMMPS on Young yet - is that correct? If so, I would like to renew this request for a GPU version of LAMMPS. The acceleration comes from the KOKKOS package, which is most easily turned on (if downloading the 29 Sep 2021 update 2 branch of LAMMPS from GitHub) by adding this to a build script:

# modify the kokkos cmake recipe for the CUDA architecture of Young's A100s
sed -i 's/MAXWELL50/AMPERE86/g' $LAMMPS_BUILD/cmake/presets/kokkos-cuda.cmake
sed -i 's/PASCAL60/AMPERE86/g' $LAMMPS_BUILD/cmake/presets/kokkos-cuda.cmake
sed -i 's/VOLTA70/AMPERE86/g' $LAMMPS_BUILD/cmake/presets/kokkos-cuda.cmake
sed -i 's/TURING75/AMPERE86/g' $LAMMPS_BUILD/cmake/presets/kokkos-cuda.cmake
sed -i 's/AMPERE80/AMPERE86/g' $LAMMPS_BUILD/cmake/presets/kokkos-cuda.cmake

Add to the cmake line: -C $LAMMPS_BUILD/../cmake/presets/kokkos-cuda.cmake -D CMAKE_CXX_COMPILER=$LAMMPS_BUILD/lib/kokkos/bin/nvcc_wrapper
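
A minimal sketch of the same substitution as one helper, so the architecture name is set in a single place. The function name set_kokkos_arch is hypothetical (it is not part of LAMMPS or Kokkos), and it only rewrites the architecture names already listed in the sed lines above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite every known architecture name in a Kokkos
# CUDA preset file to the single architecture passed as the second argument.
set_kokkos_arch() {
    local preset="$1" arch="$2"
    # Write to a temporary file and move it into place, which avoids
    # relying on the non-portable in-place flag of sed.
    sed -E "s/(MAXWELL50|PASCAL60|VOLTA70|TURING75|AMPERE80|AMPERE86)/${arch}/g" \
        "$preset" > "${preset}.new" && mv "${preset}.new" "$preset"
}

# Usage, with the path and architecture from the sed lines above:
# set_kokkos_arch "$LAMMPS_BUILD/cmake/presets/kokkos-cuda.cmake" AMPERE86
```

The architecture is a parameter, so trying a different value means changing one argument rather than five sed lines.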

However, KOKKOS appears to require libtorch rather than PyTorch - don't ask me why - which can be downloaded:

# this is 1.11.0 and CUDA 11.3
wget https://download.pytorch.org/libtorch/cu113/libtorch-cxx11-abi-shared-with-deps-1.11.0%2Bcu113.zip 
unzip -q libtorch-cxx11-abi-shared-with-deps-1.11.0+cu113.zip
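
As a sketch, the download URL above can be assembled from the libtorch version and CUDA tag, which makes the pairing explicit. The helper name libtorch_url is hypothetical, and it is an assumption that other version/CUDA combinations follow the same URL pattern; only the 1.11.0 + cu113 pair is taken from this post:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the libtorch download URL from a version and a
# CUDA tag. Only the 1.11.0 / cu113 combination is known to match this post.
libtorch_url() {
    local torch_version="$1" cuda_tag="$2"
    # %%2B prints a literal %2B, the URL-encoded "+" in the file name.
    printf 'https://download.pytorch.org/libtorch/%s/libtorch-cxx11-abi-shared-with-deps-%s%%2B%s.zip\n' \
        "$cuda_tag" "$torch_version" "$cuda_tag"
}

libtorch_url 1.11.0 cu113
```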

Other packages that would be very helpful: PLUMED, REAXFF, and patches for pair_allegro and pair_nequip.

Answer 1 · 2023-06-19T08:33:48.000Z

@apoletayev There are GNU and Intel GPU builds of LAMMPS - type module load beta-modules and then module avail lammps to see them.

lammps/29sep21up2/gpu/gnu-10.2.0
lammps/29sep21up2/gpu/intel-2020

(We'd be doing newer builds using Spack: https://spack.readthedocs.io/en/latest/package_list.html#lammps and it looks like they have Kokkos available as a variant. We are starting to put together our initial Spack deployment of software.)

Answer 2 · 2023-07-22T10:52:14.000Z

Thank you for this response. I thought it best not to bug you while you were working on the re-animation of Young last month. Coming back to this, it looks like the level of customization of the LAMMPS build that I need is not something you can support, so I will need to build my own LAMMPS with the machine-learning patches. Could you help me do this faster? Two questions:

1. What version of CUDA do the A100s run? I would assume it is 8.6, which would turn on with AMPERE86 in the cmake files. This is architecture-specific and key to configuring the KOKKOS installation.
2. Could you share any example shell scripts that you use to download and build software such as earlier versions of LAMMPS? The parts that deal with loading architecture-specific dependencies such as compilers are particularly helpful. I ask because I tried to load dependencies for a LAMMPS build and it was a mess of prompts to unload and load modules, and in the end I was not able to simultaneously load the modules for building LAMMPS.

Thank you, -Andrey

Answer 3 · 2023-07-24T16:32:04.000Z

For CUDA versions, see https://www.rc.ucl.ac.uk/docs/Supplementary/Young_GPU_Nodes/#cuda-versions. CUDA 11.2 matches the driver version at the moment.

In this repo, look at https://github.com/search?q=repo%3AUCL-RITS%2Frcps-buildscripts%20lammps&type=code when logged in to GitHub to see all the buildscripts for the versions we currently have installed. They include the modules - and you can also see these on the cluster by typing module show followed by a lammps module.

Start with module unload -f compilers mpi gcc-libs (or module purge if you want to get rid of everything).
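
Put together, the steps above might look like the sketch below. The module name is just one of the installed GPU builds picked as an example, and the guard is only there so the script does nothing harmful when run off the cluster:

```shell
#!/usr/bin/env bash
# Example choice: one of the installed GPU builds; substitute the module you
# want to inspect.
inspect_module="lammps/29sep21up2/gpu/gnu-10.2.0"

if command -v module >/dev/null 2>&1; then
    module unload -f compilers mpi gcc-libs   # or: module purge
    module load beta-modules                  # makes the beta builds visible
    module show "$inspect_module"             # prints what the module loads and sets
else
    echo "module command not found; run this on the cluster" >&2
fi
```

The output of module show lists the prerequisite modules, which can then be loaded in that order from the clean state.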

Answer 4 · 2023-07-24T16:38:13.000Z

If you meant Compute Capability, we usually set that as 80 for the A100s.
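
For a Kokkos build, that value maps onto the architecture names used in the sed lines of the original post. A rough sketch of the mapping, where kokkos_arch_for_cc is a hypothetical helper and the table covers only the names that appear in this thread:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an NVIDIA compute capability (written without the
# dot, e.g. 80 for 8.0) to the matching Kokkos architecture name.
kokkos_arch_for_cc() {
    case "$1" in
        50) echo MAXWELL50 ;;
        60) echo PASCAL60 ;;
        70) echo VOLTA70 ;;
        75) echo TURING75 ;;
        80) echo AMPERE80 ;;
        86) echo AMPERE86 ;;
        *)  echo "unsupported compute capability: $1" >&2; return 1 ;;
    esac
}

# Print the CMake option for the A100s (compute capability 80).
arch="$(kokkos_arch_for_cc 80)"
echo "-D Kokkos_ARCH_${arch}=ON"
```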

Answer 5 · 2023-07-24T17:10:33.000Z

Yes, I meant compute capability. AMPERE80 sounds like the first thing I'll try. I will also look at the example buildscripts, that's exactly what I sought. Thank you!

Answer 6 · 2024-07-09T12:34:24.000Z

Sorry to have never updated this. The install with the GPU module works, and I get very similar performance on Young as I do on the same GPUs on a different cluster with KOKKOS. So I have abandoned compiling KOKKOS from scratch because 😭. Happy to share a build script if that helps.