tests fail if only 1 GPU on compute node
BenWibking opened this issue · 1 comments
BenWibking commented
When building with `cmake .. -DAMReX_ENABLE_TESTS=ON -DAMReX_GPU_BACKEND=HIP` and running `ctest` on a GPU development node with only 1 GPU, the following tests fail:
The following tests FAILED:
24 - Particles_ParallelContext_3d (Failed)
30 - Particles_Redistribute_3d (Failed)
31 - Particles_RedistributeSOA_3d (Failed)
Errors while running CTest
This appears to be because these tests run with 2 MPI ranks, and both ranks try to use the same GPU (there is only 1 on this node), which fails:
Start 24: Particles_ParallelContext_3d
1/1 Test #24: Particles_ParallelContext_3d .....***Failed 4.26 sec
Initializing AMReX (24.06-6-g0da4d8b7e657)...
MPI initialized with 2 MPI processes
MPI initialized with thread support level 0
Initializing HIP...
There are more MPI processes than the number of GPUs.!
HIP initialized with 1 device.
AMReX (24.06-6-g0da4d8b7e657) initialized
Running redistribute test
0::Assertion `do_tiling == false' failed, file "/home/bwibking/amrex/Src/Particle/AMReX_ParticleContainerI.H", line 1256 !!!
SIGABRT
1::Assertion `do_tiling == false' failed, file "/home/bwibking/amrex/Src/Particle/AMReX_ParticleContainerI.H", line 1256 !!!
SIGABRT
See Backtrace.0 file for details
See Backtrace.1 file for details
I'm not sure how best to avoid this problem. Maybe a warning could be printed if this is not expected to work in this scenario?
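As a rough illustration of the check such a warning could be based on, here is a minimal sketch in Python. It assumes ranks are mapped to devices round-robin (rank modulo device count); that policy and the names `assign_devices` and `oversubscription_warning` are hypothetical and not taken from AMReX.

```python
from collections import Counter
from typing import List, Optional


def assign_devices(num_ranks: int, num_devices: int) -> List[int]:
    """Map each MPI rank to a device index, round-robin (an assumed policy)."""
    if num_devices < 1:
        raise ValueError("need at least one device")
    return [rank % num_devices for rank in range(num_ranks)]


def oversubscription_warning(num_ranks: int, num_devices: int) -> Optional[str]:
    """Return a warning string if any device is shared by several ranks, else None."""
    ranks_per_device = Counter(assign_devices(num_ranks, num_devices))
    worst = max(ranks_per_device.values(), default=0)
    if worst <= 1:
        return None
    return (
        f"warning: {num_ranks} MPI ranks share {num_devices} GPU(s); "
        f"up to {worst} ranks per device, multi-rank tests may fail"
    )


if __name__ == "__main__":
    # The scenario from this report: 2 MPI ranks on a node with 1 GPU.
    print(assign_devices(2, 1))
    print(oversubscription_warning(2, 1))
```

With 2 ranks and 1 GPU both ranks land on device 0, so the helper returns a warning; with 2 ranks and 2 GPUs it returns `None`. A test driver could use the same condition to skip the 2-rank tests instead of only warning.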