Install Request: LAMMPS 15th June 2023 release but doing 2nd August 2023 [IN06098486]

Question

Opened this issue a year ago · 47 comments

The 15th June 2023 release is the first to support outputting vector-style variables during a simulation run, which this research group needs.

It looks like the latest version in Spack is 8 Feb 2023.

https://www.lammps.org/download.html

Answer 1 · 2023-08-30T08:59:40.000Z

Ticket now IN:06149543.

Answer 2 · 2023-09-07T14:53:23.000Z

LAMMPS 2nd August 2023 is now the latest release so install this one.

Answer 3 · 2023-09-07T17:17:36.000Z

A build of LAMMPS 2nd August 2023 using GNU compilers and FFTW is now on Myriad and Young using our build scripts method:

module -f unload compilers mpi gcc-libs
module load beta-modules
./lammps-2Aug2023-basic-fftw-gnu_install 2>&1 | tee ~/Software/LAMMPS/lammps-2Aug2023-basic-fftw-gnu_install.log-1
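One thing to watch with the `2>&1 | tee` pattern: the pipeline's exit status is normally that of `tee`, so a failed build can look like a success. Below is a minimal sketch, assuming bash. `fake_build` is a hypothetical stand-in for the install script, and the log goes to a temporary file rather than the real log path.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a build script: writes to stdout and stderr, then fails.
fake_build() {
    echo "configuring"
    echo "compile error" >&2
    return 3
}

log=$(mktemp)

# 2>&1 merges stderr into stdout so that tee records both streams in the log.
fake_build 2>&1 | tee "$log"
# PIPESTATUS[0] holds the exit status of the first command in the pipeline
# (the build), whereas $? would report tee's status. Read it immediately.
build_status=${PIPESTATUS[0]}

echo "build exited with status $build_status"
```

Checking `build_status` (or using `set -o pipefail`) is one way to avoid having to read the whole log just to find out whether the build failed.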

Need to produce a module file and run some tests next.

Answer 4 · 2023-09-11T17:37:42.000Z

I now have the module file on Myriad and Young and have submitted test jobs on both clusters.

Answer 5 · 2023-09-12T15:17:22.000Z

Both test jobs on Myriad and Young worked. The modules needed for the basic GNU FFTW version are:

Myriad

module -f unload compilers mpi gcc-libs
module load beta-modules
module load gcc-libs/10.2.0
module load compilers/gnu/10.2.0
module load numactl/2.0.12
module load binutils/2.36.1/gnu-10.2.0
module load ucx/1.9.0/gnu-10.2.0
module load mpi/openmpi/4.0.5/gnu-10.2.0
module load python3/3.9-gnu-10.2.0
module load fftw/3.3.9/gnu-10.2.0
module load lammps/2aug23/basic-fftw/gnu-10.2.0

Young

module -f unload compilers mpi gcc-libs
module load beta-modules
module load gcc-libs/10.2.0
module load compilers/gnu/10.2.0
module load mpi/openmpi/4.0.5/gnu-10.2.0
module load python3/3.9-gnu-10.2.0
module load fftw/3.3.9/gnu-10.2.0
module load lammps/2aug23/basic-fftw/gnu-10.2.0
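The two lists differ only in a few extra modules on Myriad. As a quick sketch of how to spot the difference, the versioned module names from both lists can be compared with `grep`. The temporary files are just scaffolding for the example.

```bash
#!/usr/bin/env bash
# Versioned module names copied from the Myriad and Young lists above.
myriad=$(mktemp)
young=$(mktemp)

cat > "$myriad" <<'EOF'
gcc-libs/10.2.0
compilers/gnu/10.2.0
numactl/2.0.12
binutils/2.36.1/gnu-10.2.0
ucx/1.9.0/gnu-10.2.0
mpi/openmpi/4.0.5/gnu-10.2.0
python3/3.9-gnu-10.2.0
fftw/3.3.9/gnu-10.2.0
lammps/2aug23/basic-fftw/gnu-10.2.0
EOF

cat > "$young" <<'EOF'
gcc-libs/10.2.0
compilers/gnu/10.2.0
mpi/openmpi/4.0.5/gnu-10.2.0
python3/3.9-gnu-10.2.0
fftw/3.3.9/gnu-10.2.0
lammps/2aug23/basic-fftw/gnu-10.2.0
EOF

# -v: invert, -x: whole-line match, -F: fixed strings, -f: read patterns from file.
# Prints the modules listed for Myriad that are absent from the Young list.
myriad_only=$(grep -vxFf "$young" "$myriad")
echo "$myriad_only"
```

This prints the numactl, binutils and ucx modules, which are the only additions in the Myriad list.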

Answer 6 · 2023-10-27T11:45:12.000Z

Now working on the GNU + GPU build.

Build script updated and pulled to Young. Needs to be built on a GPU node so job submitted to build LAMMPS 2nd August 2023 GNU+GPU on Young. Build script:

lammps-2Aug2023-gpu-gnu_install

Answer 7 · 2023-10-27T11:54:50.000Z

Build job for LAMMPS 2nd August 2023 GNU+GPU submitted on Myriad as well.

Answer 8 · 2023-10-27T12:25:22.000Z

Both jobs are running.

Answer 9 · 2023-10-30T17:43:07.000Z

CPU build done on Kathleen and test job submitted.

Answer 10 · 2023-10-31T16:59:27.000Z

I've only had time today to check the output from the test job on Kathleen. It looks like it has worked ok.

Answer 11 · 2023-10-31T17:08:45.000Z

Look at:

/home/ccspapp/Software/LAMMPS/tmp.2P0AWpwdjR/lammps-2Aug2023/cmake/presets/most.cmake

for the list of LAMMPS packages in our default CPU builds.
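To pull just the package names out of a preset file, something like the sketch below may help. It assumes the preset declares its packages in a `set(ALL_PACKAGES ...)` block, which is how I remember the LAMMPS presets being laid out. The sample file and the `list_preset_packages` helper are made up for illustration and are not the real `most.cmake`.

```bash
#!/usr/bin/env bash
# Print one package name per line from a CMake preset file, assuming the
# packages are listed inside a set(ALL_PACKAGES ...) block (an assumption).
list_preset_packages() {
    awk '
        /set\(ALL_PACKAGES/ { inside = 1; sub(/.*set\(ALL_PACKAGES/, "") }
        inside {
            closing = index($0, ")")   # does this line end the block?
            gsub(/\)/, "")             # drop the closing parenthesis
            for (i = 1; i <= NF; i++) print $i
            if (closing) inside = 0
        }
    ' "$1"
}

# Hypothetical three-package excerpt, not the real preset contents.
preset=$(mktemp)
cat > "$preset" <<'EOF'
set(ALL_PACKAGES
  ASPHERE
  KSPACE
  MOLECULE)

foreach(PKG ${ALL_PACKAGES})
  set(PKG_${PKG} ON CACHE BOOL "" FORCE)
endforeach()
EOF

list_preset_packages "$preset"
```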

Answer 12 · 2023-11-03T16:20:07.000Z

I had to redo the GPU builds on Myriad and Young as I had missed out the FFTW module.

The Myriad build has completed and a job running the GPU unit tests has been submitted.

Young build job is still waiting.

Answer 13 · 2023-11-16T14:02:02.000Z

Test jobs for the GPU build have been submitted on Myriad and Young.

Answer 14 · 2023-11-16T17:26:41.000Z

I've also been trying a build of the basic Intel version but this is failing during compilation:

/dev/shm/ccspapp/lammps/tmp.v9Hhxi8WAq/lammps-stable_2Aug2023/build/_deps/googletest-src/googletest/include/gtest/gtest-matchers.h(434): error: namespace "std" has no member "is_trivially_copy_constructible"
             std::is_trivially_copy_constructible<M>::value &&
                  ^
          detected during:
            processing of template argument list for "testing::internal::MatcherBase<T>::ValuePolicy [with T=const std::string &]" based on template argument <MM> at line 483
            instantiation of "void testing::internal::MatcherBase<T>::Init(M &&) [with T=const std::string &, M=const testing::MatcherInterface<const std::string &> *&]" at line 312
            instantiation of "testing::internal::MatcherBase<T>::MatcherBase(const testing::MatcherInterface<U> *) [with T=const std::string &, U=const std::string &]" at line 536

/dev/shm/ccspapp/lammps/tmp.v9Hhxi8WAq/lammps-stable_2Aug2023/build/_deps/googletest-src/googletest/include/gtest/gtest-matchers.h(434): error: type name is not allowed
             std::is_trivially_copy_constructible<M>::value &&
                                                  ^
          detected during:
            processing of template argument list for "testing::internal::MatcherBase<T>::ValuePolicy [with T=const std::string &]" based on template argument <MM> at line 483
            instantiation of "void testing::internal::MatcherBase<T>::Init(M &&) [with T=const std::string &, M=const testing::MatcherInterface<const std::string &> *&]" at line 312
            instantiation of "testing::internal::MatcherBase<T>::MatcherBase(const testing::MatcherInterface<U> *) [with T=const std::string &, U=const std::string &]" at line 536

/dev/shm/ccspapp/lammps/tmp.v9Hhxi8WAq/lammps-stable_2Aug2023/build/_deps/googletest-src/googletest/include/gtest/gtest-matchers.h(434): error: the global scope has no "value"
             std::is_trivially_copy_constructible<M>::value &&
                                                      ^
          detected during:
            processing of template argument list for "testing::internal::MatcherBase<T>::ValuePolicy [with T=const std::string &]" based on template argument <MM> at line 483
            instantiation of "void testing::internal::MatcherBase<T>::Init(M &&) [with T=const std::string &, M=const testing::MatcherInterface<const std::string &> *&]" at line 312
            instantiation of "testing::internal::MatcherBase<T>::MatcherBase(const testing::MatcherInterface<U> *) [with T=const std::string &, U=const std::string &]" at line 536

compilation aborted for /dev/shm/ccspapp/lammps/tmp.v9Hhxi8WAq/lammps-stable_2Aug2023/build/_deps/googletest-src/googletest/src/gtest-all.cc (code 2)
make[2]: *** [_deps/googletest-build/googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o] Error 2
make[2]: Leaving directory `/dev/shm/ccspapp/lammps/tmp.v9Hhxi8WAq/lammps-stable_2Aug2023/build'
make[1]: *** [_deps/googletest-build/googletest/CMakeFiles/gtest.dir/all] Error 2
make[1]: Leaving directory `/dev/shm/ccspapp/lammps/tmp.v9Hhxi8WAq/lammps-stable_2Aug2023/build'
make: *** [all] Error 2

Using Intel 2020 compilers.

Answer 15 · 2023-11-17T09:45:25.000Z

I would use compilers/intel/2022.2 and not 2020 for anything (because of newer gcc underneath).
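For context, the `std::is_trivially_copy_constructible` error above is consistent with the Intel compiler picking up standard library headers from an older GCC. As far as I'm aware that trait is only provided by libstdc++ from GCC 5 onwards, so treat the threshold below as an assumption. A small sketch for checking a GCC version string follows; `gcc_major_at_least` is a hypothetical helper and `4.9.2` is just an example of an old version.

```bash
#!/usr/bin/env bash
# Succeeds if the given GCC version string has a major version of at least
# the given minimum (default 5, an assumption based on the trait above).
gcc_major_at_least() {
    local version=$1 minimum=${2:-5}
    local major=${version%%.*}   # keep everything before the first dot
    [ "$major" -ge "$minimum" ]
}

# On a login node the real value could be passed in instead:
#   gcc_major_at_least "$(gcc -dumpversion)"
if gcc_major_at_least "10.2.0"; then
    echo "10.2.0: new enough"
fi
if ! gcc_major_at_least "4.9.2"; then
    echo "4.9.2: too old"
fi
```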

Answer 16 · 2023-11-17T12:31:54.000Z

Updated Intel build to use gcc-libs/10.2.0 and Intel 2022.2:

module -f unload compilers mpi gcc-libs
module load beta-modules
BUILD_UNIT_TESTS=yes ./lammps-2Aug2023-basic_install 2>&1 | tee ~/Software/LAMMPS/lammps-2Aug2023-basic_install.log-2
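The `BUILD_UNIT_TESTS=yes` prefix sets the variable for that one command only. A minimal sketch of the mechanism is below, using a hypothetical stand-in script. How the real install script reads the variable, and its default, are assumptions here.

```bash
#!/usr/bin/env bash
unset BUILD_UNIT_TESTS

# Hypothetical stand-in for an install script that reads BUILD_UNIT_TESTS,
# defaulting to "no" when the variable is unset.
script=$(mktemp)
cat > "$script" <<'EOF'
echo "unit tests: ${BUILD_UNIT_TESTS:-no}"
EOF

# VAR=value before a command places VAR in that command's environment only.
with_tests=$(BUILD_UNIT_TESTS=yes bash "$script")
without_tests=$(bash "$script")

echo "$with_tests"
echo "$without_tests"
# The calling shell is unaffected by the prefix assignment.
echo "in calling shell: ${BUILD_UNIT_TESTS:-unset}"
```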

Answer 17 · 2023-11-17T13:09:53.000Z

The test jobs for the GNU + GPU version have run successfully on Myriad and Young.

Answer 18 · 2023-11-17T14:12:28.000Z

The basic Intel build on Myriad completed without errors using Intel 2022.2 compilers. It will need testing now.

Answer 19 · 2023-11-21T15:15:23.000Z

I've submitted a test job for the basic Intel version on Myriad.

Answer 20 · 2023-11-21T16:11:12.000Z

The LAMMPS 2nd August 2023 basic Intel version test job runs on Myriad. I'm now going to build this version on Kathleen and Young.

Answer 21 · 2023-11-21T17:10:36.000Z

The builds on Kathleen and Young have finished. I now need to check for errors and run a two-node or larger test job.

Answer 22 · 2023-11-22T09:56:35.000Z

Two node test job for the basic Intel version submitted on Kathleen.

Answer 23 · 2023-11-22T10:09:28.000Z

Two node test job for the basic Intel version submitted on Young.

Answer 24 · 2023-11-22T16:22:22.000Z

The Kathleen job has been running for 6 hours (set for about 12). The Young one is still queueing.

Answer 25 · 2023-11-23T11:50:50.000Z

Both jobs finished overnight and look ok. The Kathleen one was a bigger job and did 20,000 steps in about 8 hours, and the smaller Young one did 2000 steps in 48 minutes. I'll upload a module file for the basic Intel version.

Answer 26 · 2023-11-23T17:42:23.000Z

Module file updated and loaded onto Kathleen, Myriad and Young.

Answer 27 · 2023-11-23T17:43:35.000Z

To use the basic Intel build of LAMMPS 2nd August 2023, you need the following modules:

module -f unload compilers mpi gcc-libs
module load beta-modules
module load gcc-libs/10.2.0
module load compilers/intel/2022.2
module load mpi/intel/2019/update6/intel
module load python/3.9.10
module load lammps/2aug23/basic/intel-2022.2

Answer 28 · 2023-11-24T17:42:31.000Z

Doing the build with the INTEL package next. On Kathleen first:

module -f unload compilers mpi gcc-libs
module load beta-modules
./lammps-2Aug2023-INTEL_install 2>&1 | tee ~/Software/LAMMPS/lammps-2Aug2023-INTEL_instal.log

Answer 29 · 2023-11-24T21:05:15.000Z

The INTEL build on Kathleen has completed without errors.

Answer 30 · 2023-11-27T17:09:26.000Z

I have a test job submitted for the INTEL build on Kathleen.

Answer 31 · 2023-11-27T17:53:35.000Z

It has started to run:

----------------------------------------------------------
Using INTEL Package without Coprocessor.
Compiler: Intel Classic C++ 20.21.6 / Intel(R) C++ g++ 10.2 mode
SIMD compiler directives: Enabled
Precision: mixed

Waiting to see how it runs overnight, as this is a long test run with 20,000 steps.

Answer 32 · 2023-11-28T12:29:14.000Z

Job ran to completion and the speedup is quite good: 3 hours 15 minutes for the INTEL package version compared with about 8 hours for the basic Intel build.

Answer 33 · 2023-11-28T12:30:16.000Z

Now to build the INTEL package variant on Young.

Answer 34 · 2023-11-28T16:19:54.000Z

Build on Young finished without errors. Test job submitted.

Answer 35 · 2023-11-28T18:16:44.000Z

Test job is still queuing so I will check results tomorrow.

Answer 36 · 2023-11-29T16:15:24.000Z

The job failed because I made a mistake in my job script. I've corrected it and re-submitted the job.

Answer 37 · 2023-11-29T17:43:45.000Z

I'm getting the build script for the Intel GPU variant ready to submit as a job on Young from ccspapp.

Answer 38 · 2023-11-29T18:01:37.000Z

Build job for the Intel GPU variant submitted. Job script is:

/home/ccspapp/Software/LAMMPS/build-intel-gpu-2Aug2023.sh

Answer 39 · 2023-11-30T12:00:03.000Z

Test job of the INTEL package variant worked this time. Took about 20 minutes to run as opposed to 48 minutes for the basic Intel variant.
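To put rough numbers on the speedup from the INTEL package, here is a small sketch using the approximate wall-clock times reported in this thread (about 8 hours vs 3 hours 15 minutes on Kathleen, 48 minutes vs about 20 minutes on Young). The `speedup` helper is just for illustration, and the inputs are approximate, so the ratios are too.

```bash
#!/usr/bin/env bash
# Speedup = baseline runtime / new runtime, both given in minutes.
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", base / new }'
}

# Kathleen: about 8 hours (480 min) basic Intel vs 3 h 15 min (195 min) INTEL package.
kathleen=$(speedup 480 195)
# Young: 48 min basic Intel vs about 20 min INTEL package.
young=$(speedup 48 20)

echo "Kathleen speedup: ${kathleen}x"
echo "Young speedup: ${young}x"
```

Both come out at a little under 2.5x, so the two clusters are showing a similar gain.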

Answer 40 · 2023-11-30T14:43:39.000Z

The module file for the 2nd August 2023 version INTEL package variant has been uploaded to Kathleen and Young. To use the INTEL package variant the following module commands are needed:

module -f unload compilers mpi gcc-libs
module load beta-modules
module load gcc-libs/10.2.0
module load compilers/intel/2022.2
module load mpi/intel/2019/update6/intel
module load python/3.9.10
module load lammps/2aug23/userintel/intel-2022.2

Answer 41 · 2023-11-30T17:26:52.000Z

The Intel GPU build job ran overnight but failed with:

      Options:       -xHost;-fp-model;fast=2;-no-prec-div;-qoverride-limits;-diag-disable=10441;-diag-disable=2196
In file included from /shared/ucl/apps/cuda/11.3.1/gnu-10.2.0/include/cuda_runtime.h(83),
                 from /home/ccspapp/Scratch/lammps/2Aug2023/gpumixed/tmp.ZQuVwdmCED/lammps-stable_2Aug2023/lib/gpu/lal_zbl.cu(0):
/shared/ucl/apps/cuda/11.3.1/gnu-10.2.0/include/crt/host_config.h(110): error: #error directive: -- unsupported ICC configuration! Only ICC 15.0, ICC 16.0, ICC 17.0, ICC 18.0 and ICC 19.x on Linux x86_64 are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
  #error -- unsupported ICC configuration! Only ICC 15.0, ICC 16.0, ICC 17.0, ICC 18.0 and ICC 19.x on Linux x86_64 are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
   ^

CMake Error at cuda_compile_fatbin_1_generated_lal_zbl.cu.fatbin.RelWithDebInfo.cmake:212 (message):
  Error generating
  /home/ccspapp/Scratch/lammps/2Aug2023/gpumixed/tmp.ZQuVwdmCED/lammps-stable_2Aug2023/build/cuda_compile_fatbin_1_generated_lal_zbl.cu.fatbin


make[2]: *** [cuda_compile_fatbin_1_generated_lal_zbl.cu.fatbin] Error 1
make[1]: *** [CMakeFiles/gpu.dir/all] Error 2
make: *** [all] Error 2

Will need to investigate tomorrow.

Answer 42 · 2023-12-01T15:28:48.000Z

Switched to using CUDA 11.8.0 instead of 11.3.1. I had to install this version first as it wasn't on Young. The build has finished without errors so I'm running a test job next.

Answer 43 · 2023-12-01T16:50:01.000Z

Intel GPU variant test job submitted.

Answer 44 · 2023-12-04T17:59:56.000Z

My test job failed because I hadn't got the module loads correct. I've now re-submitted it.

Answer 45 · 2023-12-12T15:06:39.000Z

The Intel GPU variant test job has failed with MPI errors:

GERun: GErun command being run:
GERun:  mpirun --rsh=ssh -machinefile /tmpdir/job/1211349.undefined/machines.unique -np 16 -rr lmp_gpu -sf gpu -pk gpu 1 -in in.lj
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(MPL_backtrace_show+0x34) [0x2b286b6e31d4]
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(MPIR_Assert_fail+0x21) [0x2b286ae6b031]
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(+0x44c505) [0x2b286b1ac505]
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(+0x7e9b0c) [0x2b286b549b0c]
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(+0x64cd70) [0x2b286b3acd70]
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(+0x1fe5fa) [0x2b286af5e5fa]
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(+0x4664b4) [0x2b286b1c64b4]
/shared/ucl/apps/intel/2020/impi/2019.6.166/intel64/lib/release/libmpi.so.12(MPI_Init+0x11b) [0x2b286b1c1c7b]
lmp_gpu() [0x402622]
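With the same assertion printed by many ranks, it can be easier to read the job output after collapsing identical lines. A minimal sketch is below; `summarise_repeats` is a hypothetical helper and the sample file is a shortened copy of the output above.

```bash
#!/usr/bin/env bash
# Collapse identical lines into "count line" pairs, most frequent first.
summarise_repeats() {
    sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Shortened excerpt of the job output, written to a temporary file.
joblog=$(mktemp)
cat > "$joblog" <<'EOF'
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 917: group_id < group_num
lmp_gpu() [0x402622]
EOF

summary=$(summarise_repeats < "$joblog")
echo "$summary"
```

Each distinct line is shown once with a count in front, which makes it easier to see that every failing rank hit the same assertion during `MPI_Init`.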

Answer 46 · 2023-12-14T17:44:06.000Z

I'm beginning to build the non-GPU variants on Michael now:

basic Intel;
INTEL package variant;
GNU + FFTW variant.

Answer 47 · 2023-12-20T17:37:26.000Z

All that's left to do now is add the missing variants on Myriad when the cluster is restored to service.