dalel487/su2hmc

Non-MPI performance

Closed this issue · 2 comments

Even if a particular dimension of the lattice isn't divided across MPI ranks, the code still performs halo exchanges with itself along that dimension. This increases the (admittedly small) memory footprint of the programme and wastes time on memory-bound work. It will be even more critical in a GPU environment, where we may not be parallelising across all lattice dimensions. A sketch of a per-dimension guard is below.
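
For illustration, here is a minimal sketch (not the repository's actual code) of the kind of per-dimension guard the issue is asking for: if both neighbours in a dimension are the rank itself, the MPI traffic is skipped. The function name `halo_swap_x`, the separate send/recv buffers, and the neighbour arguments are all assumptions.

```c
#include <mpi.h>
#include <string.h>

/* Exchange the halo in one dimension with the up/down neighbours.
 * When the dimension is not split across ranks, up and down both
 * equal this rank, so the Sendrecv would just shuttle the buffer
 * through MPI to itself; the early-out replaces that with a memcpy
 * (or nothing at all, if the halo can be dropped entirely). */
void halo_swap_x(double *send, double *recv, int count,
                 int up, int down, MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);
    if (up == rank && down == rank) {
        memcpy(recv, send, count * sizeof(double));
        return;
    }
    MPI_Sendrecv(send, count, MPI_DOUBLE, up,   0,
                 recv, count, MPI_DOUBLE, down, 0,
                 comm, MPI_STATUS_IGNORE);
}
```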

Most of the halo exchanges have been removed. We're not going to get much more performance by killing off the rest, so we'll come back to this later when the GPU work needs it.

All halo exchanges are now removed, and when compiling with nproc=1 in sizes.h the compiler does not emit any calls to MPI functions.
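
A hedged sketch of what that compile-out can look like, assuming nproc is a preprocessor macro from sizes.h (the fallback define and the function name `exchange_halo` are illustrative, not the repository's actual code):

```c
#ifndef nproc
#define nproc 1    /* stand-in for the definition in sizes.h */
#endif

#if nproc > 1
#include <mpi.h>
#endif

/* Swap halo buffers with the up/down neighbours in one dimension.
 * With nproc == 1 the preprocessor strips the MPI call entirely,
 * so the single-rank binary never references the MPI library. */
static void exchange_halo(double *send, double *recv, int count,
                          int up, int down)
{
#if nproc > 1
    MPI_Sendrecv(send, count, MPI_DOUBLE, up,   0,
                 recv, count, MPI_DOUBLE, down, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
#else
    /* Single rank: nothing to exchange. */
    (void)send; (void)recv; (void)count; (void)up; (void)down;
#endif
}
```

Guarding at the preprocessor level rather than at runtime means the single-rank build need not even link against MPI, which matches the behaviour described in the comment above.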