SX-Aurora/Vftrace

Non-blocking MPI communication handling frees buffers and other temporary arrays prematurely

SpinTensor opened this issue · 1 comments

In the f2c routines, memory is allocated to hold temporary copies of arrays such as counts, displacements, and types:

int *c_recvcounts = NULL ;
...
c_recvcounts = (int*) malloc(size*sizeof(int));
for (int i=0; i<size; i++) {
   c_recvcounts[i] = (int) f_recvcounts[i];
}

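They are freed in the same routine right after the MPI call returns, which is not allowed for non-blocking collectives: the MPI standard requires that array arguments such as counts, displacements, and datatypes are not modified until the operation completes.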
The pointers need to be stored as part of the vftr_request structure and only freed upon completion of the communication.

One way would be to pass the vftr_mpi routines a pointer array holding all the temporary pointers, together with an int giving the number of pointers. This pointer array is then copied and stored in the vftr_request structure until a Wait* or a successful Test* has been performed (or the completion has been recognized internally). Routines that don't need temporary arrays simply pass 0 and NULL.
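A minimal sketch of that idea, for illustration only. The struct and function names (sketch_request_t, sketch_request_store_tmp, sketch_request_free_tmp) are hypothetical and not taken from the Vftrace sources; the real vftr_request structure has other members and may handle ownership differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the vftr_request structure. */
typedef struct {
   int n_tmp_pointers;   /* number of temporary arrays owned by the request */
   void **tmp_pointers;  /* private copy of the pointer array, or NULL */
} sketch_request_t;

/* Copy the pointer array (not the arrays it points to) into the request.
 * Routines without temporary arrays pass n=0 and ptrs=NULL.
 * Returns 0 on success and -1 if the allocation failed. */
static int sketch_request_store_tmp(sketch_request_t *req, int n, void **ptrs) {
   assert(req != NULL);
   req->n_tmp_pointers = 0;
   req->tmp_pointers = NULL;
   if (n <= 0 || ptrs == NULL) {
      return 0;
   }
   req->tmp_pointers = (void**) malloc((size_t) n * sizeof(void*));
   if (req->tmp_pointers == NULL) {
      return -1;
   }
   for (int i=0; i<n; i++) {
      req->tmp_pointers[i] = ptrs[i];
   }
   req->n_tmp_pointers = n;
   return 0;
}

/* To be called only after a Wait* or a successful Test* reported
 * completion: frees every temporary array and the pointer array itself. */
static void sketch_request_free_tmp(sketch_request_t *req) {
   assert(req != NULL);
   for (int i=0; i<req->n_tmp_pointers; i++) {
      free(req->tmp_pointers[i]);
   }
   free(req->tmp_pointers);
   req->tmp_pointers = NULL;
   req->n_tmp_pointers = 0;
}
```

With something like this, an f2c routine would collect c_recvcounts and its other temporaries in a local pointer array, hand that array to the vftr_mpi routine, and no longer call free itself.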

As far as I can see at the moment, the only routines that need this feature are:

  • iallgatherv
  • ialltoallv
  • ialltoallw
  • igatherv
  • ireduce_scatter
  • iscatterv

TODO-List:

  • The F08-wrapper routines, currently only available in the F08_MPI_Functionality branch, need the vftr_no_mpi_logging switch, as the switch will be removed from the vftr_mpi routines!
  • The internal collective sync regions need to be moved to the wrappers. If MPI logging is disabled but synchronization estimation is active, the sync region will never be reached once the no-logging switch is moved to the wrapper.
  • The C-wrappers need to make copies of the arrays so that the vftr_mpi routines receive the same kind of data regardless of whether the call comes from Fortran or C.

Except for the F08-functionality branch, everything should be done.
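For the C-wrapper item in the TODO list above, the copy could look roughly like the following sketch. The helper name sketch_copy_int_array is hypothetical and not part of Vftrace; it only shows how a C wrapper might produce a heap copy that the vftr_mpi routine can own and free on completion, just like the copies created in the f2c routines.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of the first n ints
 * of src. Returns NULL if src is NULL, n is 0, or the allocation fails.
 * The caller (or the request that takes ownership) must free the result. */
static int *sketch_copy_int_array(const int *src, int n) {
   assert(n >= 0);
   if (src == NULL || n == 0) {
      return NULL;
   }
   int *dst = (int*) malloc((size_t) n * sizeof(int));
   if (dst == NULL) {
      return NULL;
   }
   memcpy(dst, src, (size_t) n * sizeof(int));
   return dst;
}
```

Because the copy is independent of the user's array, the user may change or free the original while the vftr_mpi routine still holds its own copy.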