https://www.doc.ic.ac.uk/lab/cplus/cstyle.html
make DEFINES=-DDEBUG all
mpirun -n 4 ./a.out 1 1 -r 2 -c 2
paste logs/*.log | less
Arbitrary shared-memory comm (sm_comm) split where each sm_comm handles its own Send/Recv.
- Arbitrary sm_comm split.

      +-----+-----+-----+    '+' denotes boundary of single proc
      |     1     |  2  |    +--> p_col
      +-----+-----+-----+    |
      |  2  |     3     |    V
      +-----+-----+-----+    p_row
- Leader in each sm_comm manages transfers to/from other sm_comms.

  Since each sm_comm might not have the same relative domain in the cartesian grid, transfers will get very messy. In the above example, during Send/Recv to the right, sm_comm_1 has to compute the relevant subarrays of sm_comm_2 and sm_comm_3.
- Each proc of sm_comm manages its own transfer by determining whether its neighbour is internal or external to sm_comm.

  In the above example, during Send/Recv to the right, the first proc in sm_comm_1 only sees one proc in sm_comm_2, while the second proc in sm_comm_1 only sees one proc in sm_comm_3.
Are there benefits/drawbacks to splitting sm_comm_2 into 2 sub-communicators that don't cross over p_col? YES
- 2D/3D array indexing will be extremely difficult to handle
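The per-process option above (each proc decides whether its neighbour is internal or external to its sm_comm) can be sketched without any MPI calls. This is only an illustration under stated assumptions: ranks are numbered row-major on a p_row x p_col grid, each sm_comm owns sm_size consecutive ranks, and the grid is non-periodic. The names `grid_t`, `nb_kind_t`, `sm_comm_of`, `neighbour_rank` and `classify_neighbour` are hypothetical and do not come from the code base. Which grid axis corresponds to "right" is left open, so the direction is passed as an offset.

```c
#include <assert.h>

/* Hypothetical layout: ranks are numbered row-major on a p_row x p_col grid
 * and every sm_comm owns sm_size consecutive ranks. */
typedef struct {
    int p_row;   /* number of process rows */
    int p_col;   /* number of process columns (ranks per row) */
    int sm_size; /* ranks per shared-memory communicator */
} grid_t;

typedef enum { NB_NONE, NB_INTERNAL, NB_EXTERNAL } nb_kind_t;

/* 0-based index of the sm_comm that owns a rank. */
static int sm_comm_of(const grid_t *g, int rank)
{
    assert(g->sm_size > 0);
    return rank / g->sm_size;
}

/* Rank of the neighbour offset by (d_row, d_col), or -1 if the offset
 * leaves the (non-periodic) grid. */
static int neighbour_rank(const grid_t *g, int rank, int d_row, int d_col)
{
    assert(rank >= 0 && rank < g->p_row * g->p_col);
    int row = rank / g->p_col + d_row;
    int col = rank % g->p_col + d_col;
    if (row < 0 || row >= g->p_row || col < 0 || col >= g->p_col)
        return -1;
    return row * g->p_col + col;
}

/* NB_INTERNAL: neighbour shares this rank's sm_comm (shared-memory copy).
 * NB_EXTERNAL: neighbour lives in another sm_comm (needs Send/Recv).
 * NB_NONE:     there is no neighbour in that direction. */
static nb_kind_t classify_neighbour(const grid_t *g, int rank, int d_row, int d_col)
{
    int nb = neighbour_rank(g, rank, d_row, d_col);
    if (nb < 0)
        return NB_NONE;
    return sm_comm_of(g, nb) == sm_comm_of(g, rank) ? NB_INTERNAL : NB_EXTERNAL;
}
```

With the 2 x 3 example grid and sm_size = 2, ranks 0 and 1 (the first sm_comm) each see an external neighbour one step along p_row, in the second and third sm_comm respectively, which matches the case described above where each proc only sees one proc of a different sm_comm.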
- Enforce size of sm_comm to either a) divide p_col or b) be divisible by p_col.

      a)                 b)
      +-----+-----+      +-----+-----+
      |     1     |      |  1  |  2  |
      +-----+-----+      +-----+-----+
      |     2     |      |  3  |  4  |
      +-----+-----+      +-----+-----+
This can be achieved by:
- Ensuring p_col is set in accordance with nCPU per node on the particular device to satisfy the above conditions.
  - Will need to make sure entire nodes are occupied (i.e. don't share a node with other jobs)
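Conditions a) and b) above reduce to a divisibility test, which could be run at start-up before the grid is built. The sketch below assumes the sm_comm size equals nCPU per node (i.e. whole nodes are occupied). The function names are hypothetical and not part of any library.

```c
#include <assert.h>
#include <stdbool.h>

/* True when sm_size satisfies a) sm_size divides p_col,
 * or b) sm_size is divisible by p_col. */
static bool sm_size_is_aligned(int sm_size, int p_col)
{
    assert(sm_size > 0 && p_col > 0);
    return (p_col % sm_size == 0) || (sm_size % p_col == 0);
}

/* Case a): number of sm_comms sitting side by side in one grid row
 * (0 if sm_size does not divide p_col). */
static int sm_comms_per_row(int sm_size, int p_col)
{
    assert(sm_size > 0 && p_col > 0);
    return (p_col % sm_size == 0) ? p_col / sm_size : 0;
}

/* Case b): number of full grid rows covered by one sm_comm
 * (0 if sm_size is not divisible by p_col). */
static int rows_per_sm_comm(int sm_size, int p_col)
{
    assert(sm_size > 0 && p_col > 0);
    return (sm_size % p_col == 0) ? sm_size / p_col : 0;
}
```

For example, with 24 procs per node, p_col = 48 falls under case a) (two sm_comms per row) and p_col = 12 under case b) (two rows per sm_comm), while p_col = 16 satisfies neither condition.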
- Ensuring procs on a node are allocated to the same p_col. Options:
  - Block-allocate the processes, i.e. (0-23) on node 0, (24-47) on node 1, ...

    The proc ranks (in MPI_COMM_WORLD) on a node SHOULD by default ensure MPI_Cart_create(...,reorder=1,...) aligns with the shared memory communicator.
  - Deal-allocate the processes, i.e. (0,n,2n,...) on node 0, (1,n+1,2n+1,...) on node 1, ...

    Same as above? Or split communicator by shared memory, then create the adjacency (index/edges) arrays for MPI_Graph_create to ensure they are aligned in p_col.
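The difference between the two allocation options above can be made concrete with a small rank-to-node mapping. This is a sketch only: it models where the launcher places each MPI_COMM_WORLD rank and assumes the cartesian grid keeps the original rank order (row-major, no reordering). It says nothing about what a particular MPI implementation does with reorder=1. The names `alloc_t`, `node_of_rank` and `nodes_hold_contiguous_ranks` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ALLOC_BLOCK, ALLOC_DEAL } alloc_t;

/* Node hosting a given MPI_COMM_WORLD rank under each placement scheme. */
static int node_of_rank(alloc_t alloc, int rank, int n_nodes, int procs_per_node)
{
    assert(n_nodes > 0 && procs_per_node > 0);
    assert(rank >= 0 && rank < n_nodes * procs_per_node);
    return (alloc == ALLOC_BLOCK)
               ? rank / procs_per_node /* (0-23) on node 0, (24-47) on node 1 */
               : rank % n_nodes;       /* (0,n,2n,...) on node 0, (1,n+1,...) on node 1 */
}

/* True if every aligned run of procs_per_node consecutive ranks sits on a
 * single node, i.e. each node maps to one contiguous piece of a row-major
 * grid when ranks are not reordered. */
static bool nodes_hold_contiguous_ranks(alloc_t alloc, int n_nodes, int procs_per_node)
{
    int n_ranks = n_nodes * procs_per_node;
    for (int r = 0; r < n_ranks; ++r) {
        int block_start = r - r % procs_per_node;
        if (node_of_rank(alloc, r, n_nodes, procs_per_node) !=
            node_of_rank(alloc, block_start, n_nodes, procs_per_node))
            return false;
    }
    return true;
}
```

Under these assumptions block allocation keeps each node's ranks contiguous, whereas deal allocation over more than one node does not, which is why the deal-allocated case would need either reordering or an explicitly constructed topology.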
- Ensuring