GeoscienceAustralia/anuga_core

Run the code on a multi-node platform

Dongxueyang opened this issue · 7 comments

Hi @stoiver :

I want to run a large simulation. To improve computational efficiency, I want to run the program on a supercomputing platform using multiple nodes. Does the code support multi-node computing?

@Dongxueyang Yes, anuga can run on multi-node supercomputers. Parallelisation is implemented via MPI. The python 2 version has been run extensively in parallel on the NCI (raijin). I haven't yet tried it on gadi. It uses the pypar MPI python wrapper.

We are just moving over to using python 3. We seem to have a working version which uses mpi4py as the MPI python wrapper. It would be great if you could test out the python 3 version. I will push it over to the GA git repository (branch anuga_py3).

@stoiver That is great, thank you so much. I can download the anuga_py3 version and try it on a multi-node supercomputer (with mpi4py). Can I get the anuga_py3 branch now? Where can I download it to test?

@Dongxueyang You can use the anuga_py3 branch of the anuga_core repository. It might be best to clone a new copy of anuga_core and check out the branch, i.e.

git clone -b anuga_py3 https://github.com/GeoscienceAustralia/anuga_core.git

You can get a hint as to which python libraries to install by looking at the shell scripts in the tools directory of the downloaded repository.

Hi @stoiver. I want to run a simulation on a multi-node platform (two nodes and 24 cores, 12 cores per node).
I use this command:
mpirun -machinefile machinefile -np 24 python test.py
and the machinefile:

node1_id
node2_id

Is the command right?
If it is wrong, how do I run the simulation on two nodes (24 cores)?

Thanks a lot. I hope to hear from you.
Dong
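
For readers following the question above: Open MPI's hostfile format allows each line to carry a `slots=N` count, which states how many processes may be placed on that host. The sketch below assumes Open MPI and reuses the placeholder host names, the file name `machinefile` and the script name `test.py` from the post. It only builds and prints the hostfile text and the matching command, it is not taken from the anuga documentation, and a given cluster or scheduler may need different options.

```python
def hostfile_text(nodes, slots_per_node):
    """Return Open MPI hostfile contents: one 'host slots=N' line per node."""
    return "".join(f"{node} slots={slots_per_node}\n" for node in nodes)


def mpirun_command(hostfile, nodes, slots_per_node, script):
    """Build the mpirun argument list for len(nodes) * slots_per_node processes."""
    total = len(nodes) * slots_per_node
    return ["mpirun", "--hostfile", hostfile, "-np", str(total), "python", script]


# Placeholder host names from the post; replace with the real node names.
nodes = ["node1_id", "node2_id"]

text = hostfile_text(nodes, 12)
cmd = mpirun_command("machinefile", nodes, 12, "test.py")

print(text, end="")
print(" ".join(cmd))
```

Saving the printed host lines as `machinefile` and running the printed command requests 12 processes on each of the two nodes, 24 in total.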

@stoiver
Did you see the question above, and could you give me some advice? I tried to run example/simple_examples/channel3_parallel.py on two nodes (48 cores), but I cannot get the simulation to run.

Dong

@Dongxueyang You need to set up MPI to run on your 24 cores. This would depend on whether you are using openmpi or mpich. Do you have a system admin for your system? You should be able to set up your mpirun command to run by default on your two nodes. I recall from working on a cluster a few years ago that you need to ensure you can log into the two nodes automatically using ssh keys. But as suggested, get help from your system admin.
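
Before involving anuga at all, it can help to confirm that mpirun really spreads processes over both nodes. The following is a minimal sketch using mpi4py (the wrapper mentioned earlier for the python 3 branch): each rank reports its host name and rank 0 prints a count per host. The file name `check_hosts.py` used in the note below is hypothetical.

```python
import socket
from collections import Counter


def ranks_per_host(hostnames):
    """Count how many MPI ranks reported each host name."""
    return dict(Counter(hostnames))


def main():
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed in this Python environment")
        return

    comm = MPI.COMM_WORLD
    # Every rank sends its host name to rank 0; other ranks receive None.
    hosts = comm.gather(socket.gethostname(), root=0)
    if comm.Get_rank() == 0:
        for host, count in sorted(ranks_per_host(hosts).items()):
            print(f"{host}: {count} ranks")


if __name__ == "__main__":
    main()
```

Saved as `check_hosts.py`, this could be launched with the same options as the real job, for example `mpirun -machinefile machinefile -np 24 python check_hosts.py`. If only one host name is listed, the processes are not being distributed across the nodes.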

@stoiver OK, thanks. I use openmpi on the cluster. I will ask the system admin first. Thanks a lot.
Also, do I need to install the same openmpi on every node?
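
The thread does not answer this question. As a general rule, an MPI job expects a matching MPI installation to be available on every node, which on clusters is often achieved with a shared filesystem. The sketch below is one hedged way to check both that and the ssh key login mentioned above: it runs `mpirun --version` on each node over ssh with `BatchMode=yes` (so ssh fails instead of prompting for a password) and compares the first line of output. The function names are hypothetical, and it assumes `mpirun` is on the PATH of a non-interactive ssh session.

```python
import subprocess


def mpirun_version(node):
    """Return the first line of `mpirun --version` on a node, or None on failure."""
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", node, "mpirun", "--version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0]


def versions_agree(versions):
    """True when every node reported a version and all of them are identical."""
    return None not in versions and len(set(versions)) == 1
```

For example, `versions_agree([mpirun_version(n) for n in ["node1_id", "node2_id"]])` returns False if either login fails or the reported versions differ. The system admin remains the authority on how MPI is installed on a particular cluster.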