This is a library that transforms a matrix between two arbitrary grid-like data layouts. By layout, we mean the way in which a matrix is distributed over MPI ranks.
For example, imagine a matrix split into four blocks, such that each block resides on a different rank as shown below:
+---------------------+
| | |
| | |
| 0 | 1 |
| | |
| | |
+---------------------+
| | |
| 2 | 3 |
| | |
+---------------------+
Assume that we want to redistribute the matrix, to achieve the following data layout:
+---------------------+
| 0 | 2 | 1 |
| | | |
+---------------------+
| 1 | 0 | 1 |
| | | |
+---------------------+
| | | |
| 2 | 2 | 0 |
| | | |
+---------------------+
This is exaclty what this library is doing!
- initial and final layouts do not have to be block-cyclic, but any grid-like data layout.
- blocks do not have to be all the same dimensions, but can have different dimensions.
- a rank might own arbitrary number of blocks
- the type of entries in a matrix can be arbitrary
- blocks that belong to the same rank do not have to be consecutive in memory
- each block can have an arbitrary stride
- OpenMP support
When tested against ScaLAPACK (as provided by Intel MKL), this library outperforms the MKL's pdgemr2d
routine that transforms between two block-cyclic data layouts by up to 50%, as shown below:
The pipeline of the algorithm is roughly the following:
- Find the intersections of the initial and final grids (called grid cover) in one pass, which decomposes initial blocks into smaller blocks.
- Sort decomposed initial blocks based on the rank to which they should be sent. Thus, if some ranks should exchange more than one block, they will do it within a single message.
- Copy decomposed intial blocks from a local storage (with arbitrary local data layout) to a temporary MPI buffer.
- Perform MPI_Alltoallv.
- Use the computed grid cover to decompose blocks from the final grid into smaller blocks.
- Sort decomposed final blocks based on the receiver rank.
- All smaller blocks that are received are copied from a temporary MPI buffer to the local storage (with arbitrary local data layout).
Assuming that you want to use the gcc 8
compiler and OpenMP
, you can build the project as follows:
# clone the repo
git clone --recursive https://github.com/kabicm/grid2grid
cd grid2grid
mkdir build
cd build
# build
CC=gcc-8 CXX=g++-8 cmake -DCMAKE_BUILD_TYPE=Release -DWITH_OPENMP=TRUE ..
# compile
make -j 4
To start with, there is a small example that transforms the matrix between two block-cyclic layouts. This example can be run from build
directory as follows:
mpirun --oversubscribe -np 4 ./examples/scalapack2scalapack -m 10 -n 10 -ibm 2 -ibn 3 -fbm 3 -fbn 5 -pm 2 -pn 2
Where flags have the following meaning:
(m, n)
: Dimensions of matrix that we want to change the layout for.(ibm, ibn)
: Dimensions of initial blocks, determining the initial block-cyclic distribution.(fbm, fbn)
: Dimensions of final blocks, determining the final block-cyclic distribution that we want to reach.(pm, pn)
: Processor grid, determining the processor decomposition for the block-cyclic distribution.
To transform between two arbitrary grid-like data layouts, we need to construct two grid_layout
objects, one describing the initial layout and one describing the final layout. After that, we just invoke:
grid2grid::grid_layout initial_layout(...);
grid2grid::grid_layout final_layout(...);
grid2grid::transform(initial_layout, final_layout, MPI_COMM_WORLD);
In order to create a grid_layout object, we need to provide the following information:
grid2D
: small struct describing the grid, basically 2 vectors, one describing where rows are split and one describing where columns are split in a grid.owners
: a matrix that specifies the owner (i.e. the rank) of each block ingrid2D
.local_blocks
: a vector of blocks that current rank owns. Each block is defined with a pointer to the beginning of that block in the local memory of current rank, a stride (by default equal to the number of rows of a block) and dimensions.
This is done for ScaLAPACK
block-cyclic data layout, which can serve as an example of how grid_layout
object can be created.
Marko Kabic (marko.kabic@cscs.ch)