joezuntz/cosmosis

Shared setup in MPI

vittoriodx7 opened this issue · 1 comments

Hello,
I am currently adopting some code to cosmosis and I love it. However, I am not sure to understand a few things correctly regarding MPI and memory usage of setup functions.

In practice, imagine a likelihood that uses a lot of memory during setup, e.g. because of a complicated N-dimensional calculation, and then much less memory during the execute phase.
In a previous version of my code, using mpi4py, I performed the memory-intensive calculation on only one process and then shared the much lighter result with the others, so that the overall memory usage of my code stayed contained. I was wondering whether the same is possible in CosmoSIS, and if so, what to set up in the pipeline to make it happen.

e.g. in my old version it was something like:

from mpi4py import MPI

# Flag must be set in both branches, otherwise it is undefined when
# running on a single process.
use_MPI = MPI.COMM_WORLD.Get_size() > 1
if use_MPI:
    mpi_comm = MPI.COMM_WORLD
    mpi_rank = mpi_comm.Get_rank()
    mpi_size = mpi_comm.Get_size()
    mpi_comm.Barrier()
    if mpi_rank == 0:
        # Only rank 0 runs the memory-intensive calculation...
        heavy_stuff = calc()
        for i in range(1, mpi_size):
            mpi_comm.send(heavy_stuff, dest=i, tag=100 + i)
    else:
        # ...and the other ranks receive the much lighter result.
        heavy_stuff = mpi_comm.recv(source=0, tag=100 + mpi_rank)
    mpi_comm.Barrier()

Hi @vittoriodx7 - that's a very interesting question.

I've tested it, and you can do exactly what you describe above in the setup function of a module. It won't interfere with the MPI use later in the sampling, because the pipeline setup happens before the MPI pool is created.

So yes, give it a go.

Incidentally, you can use:

if mpi_rank == 0:
    heavy_stuff = calc()
else:
    heavy_stuff = None
heavy_stuff = mpi_comm.bcast(heavy_stuff, root=0)

to replace your loop of sends and recvs - it should be more efficient.