GRTLCollaboration/GRChombo

Deadlocks when using non-blocking collectives with OpenMPI

mirenradia opened this issue · 0 comments

Several users (including myself) have intermittently encountered deadlocks when using OpenMPI. They seem to stem from the non-blocking MPI collectives in the AMRInterpolator and are resolved by the changes in this commit. However, the issue does not always occur and there may be other factors at play.

In my experience, the deadlock doesn't seem to occur straight away but rather at the next MPI collective call after the first MPI_Waitall in MPIContext::asyncEnd(), whether that is a collective used in writing an HDF5 file or the next use of the AMRInterpolator.

I have experienced this problem with OpenMPI 4.0.5.