running athena in parallel
you can speed up your simulations by running athena in parallel, basically splitting up your simulation domain into N equal-sized chunks and then running each one on a separate processor. on each of these chunks, the boundary conditions come from the simulation running on the neighboring processor. athena uses something called MPI (message passing interface) to communicate information among processors. i would recommend not looking into the details of MPI unless you have to.
configuring and compiling
to make use of multiple processors, you first have to re-compile athena with MPI enabled
$ ./configure --with-gas=hydro --with-problem=... --enable-mpi $ make all
(you may need to install MPI before doing this… if which mpirun
doesn’t give you anything, try $ brew install open-mpi
)
setting up the simulation
next, you need to tell athena how many processors to use. this
happens via the NGrid_x* entries in the athinput
file. for
example:
<domain1> level = 0 Nx1 = 8192 x1min = -2.5e3 x1max = 2.5e3 bc_ix1 = 4 bc_ox1 = 4 Nx2 = 8192 x2min = -2.5e3 x2max = 2.5e3 bc_ix2 = 4 bc_ox2 = 4 Nx3 = 1 x3min = -0.5 x3max = 0.5 bc_ix3 = 4 bc_ox3 = 4 NGrid_x1 = 16 NGrid_x2 = 32 NGrid_x3 = 1
this runs a (1892)^2 simulation on 16*32 = 512 processors. i’ve
used this for some recent simulations on a supercomputer. you
won’t have 512 processors on your laptops, so maybe try
NGrid_x1 = NGrid_x2 = 2
to begin with.
when you chose your own parameters, remember that the “surface” of cells on your domain need to be communicated as boundary conditions at the interfaces among processors. communication has to happen at every time-step, and can easily become a bottleneck if you’re not careful. so you don’t want to make the sub-domains too small, and you want to make them as “square” as possible to minimize the surface area to volume ratio.
at the same time, super-computers are organized into hierarchical
“nodes,” each of which contains some number of processors. on the
stampede
computer, each node contains 16 processors, so we need
the total number we ask for to be an integer multiple of 16.
(if you know the distinction between cores and processors, i’m really talking about cores here.)
running your simulation
to run your simulation, you need to do it through MPI:
$ mpirun -np 4 ./athena -i athinput
again, it’s best not to go into the details of what happens
here… you shouldn’t need it. this example runs on 4 processors,
which needs to be the same as the total number you ask for in the
athinput
file from the last section.
you’ll notice that athena makes four new directories, id0
, id1
,
id2
, and id3
. each one contains vtk
files with data from its
corresponding processor. in order to read these files into
mathematica, you first need to join these files.
joining VTK files
athena provides a small c program for joining vtk files. i’ve copied it in this directory. to use it, first compile it:
$ gcc -o join_vtk.x join_vtk.c -lm
this makes an executable called join_vtk.x
. you can run it
directly, but i’ve made a wrapper program called join-vtk.rb
which should make it easier for you. in the simulation directory,
run
$ ruby join-vtk.rb
and it will make a directory called merged
with all of the merged
vtk files. you can go ahead an analyze these files in the same way
you’ve been working with the files from your single-processor
(“serial”) simulations.
restart files
supercomputers usually use a kind of time-share system… you don’t get to run your job directly on the computer. instead, you submit instructions for running it and request the resources to do so. it goes into what’s called a “queue,” and some software schedules all the jobs to run when the resources become available.
in order to keep you from hogging the machine, supercomputers usually only allow you to run your simulations for some fixed length of time. after that, it kills your job and you have to resubmit it and return to the back of the line.
so you’ll need to be able to use “restart” files, which periodically save the simulation grid to the hard drive and enable you to pick up the simulation where you left off.
setting up the simulation
first, you need to ask athena to give you restart files. this is very similar to the VTK files you know about:
<output3> out_fmt = rst dt = 1.0e3
fixing the code
remember the problem()
function that you work on to define your
simulation? that has a couple of jobs:
- read in parameters from the athinput file (eg,
par_get_d()
) - set up the initial condition by writing values to the grid
- set up special boundary conditions, turn on gravity, history outputs, etc.
the function problem()
gets called before the simulation
starts. there’s an analogous function problem_read_restart()
which gets called before a restart simulation begins. you need to
do all of the same things here as in the problem()
function
except #2. since the simulation is already in progress, you
don’t want to overwrite the data with your initial condition!
everything else needs to be exactly the same in both problem()
and read_restart()
. (eg, if you change the order of the history
functions, the order of the columns with change halfway through the
file and it becomes really hard to read.) i will often make a
function set_vars()
and call it from both functions.
DANGER: one last thing to think about. if you have some global
variable that you use in userwork_in_loop
or anything else, this
also needs to be communicated to the restarted simulation. to do
this, you just stick it to the end of the restart file using
write_restart()
. for example, if you use variables x_shift
and
vflow
to keep track of the shock interface and try to follow it
in your simulations, you need to save these as well. you just
simply write them to the end of the file:
void problem_write_restart(MeshS *pM, FILE *fp)
{
#ifdef FOLLOW_SHOCK
fwrite(&x_shift, sizeof(Real), 1, fp);
fwrite(&vflow, sizeof(Real), 1, fp);
#endif
return;
}
you then read these back in the read_restart()
function:
void problem_read_restart(MeshS *pM, FILE *fp)
{
int nl,nd;
/* set boundary conditions. since this function takes a Mesh, you
need to "find" the Domain */
for (nl=0; nl<(pM->NLevels); nl++) {
for (nd=0; nd<(pM->DomainsPerLevel[nl]); nd++) {
if (pM->Domain[nl][nd].Disp[0] == 0)
bvals_mhd_fun(&(pM->Domain[nl][nd]), left_x1, bc_ox1);
}
}
/* par_getd statements copied from problem() */
/* dump_history_enroll statements copied from problem() */
/* gravity, etc. statements copied from problem() */
/* DANGER: make sure the order here matches the order in write_restart() */
#ifdef FOLLOW_SHOCK
fread(&x_shift, sizeof(Real), 1, fp);
fread(&vflow, sizeof(Real), 1, fp);
#endif
return;
}
running and restarting your simulations
once you have this all set-up, you can run athena with a time constraint:
$ ./athena -t 00:02:00
the format is hh:mm:ss
, so the above should quit after two
minutes. in each of the idn/
directories, you should find a
restart file:
$ ls id0/*.rst id0/sim.0000.rst id0/sim.0001.rst id0/sim.0002.rst
to restart the simulation, you want to take all of the last rst files and move them to the simulation directory:
$ find id* -name '*.0002.rst' | xargs -I % mv % ./
now, you can restart the simulation:
$ ./athena -t 00:02:00 -r sim.0002.rst
before running anything on the supercomputer, you should always
test that the restarts work on your laptops. run one
low-resolution simulation all the way through, then run another
restarting it once or twice. at the end, the outputs should be
identical – if there’s any difference, it indicates some bug in
the read_restart()
function that needs to be fixed.
you should also do this for a couple of different processor configurations (2x1, 1x2, 2x2, and serial) to make sure nothing funny happens at the internal MPI boundaries. it’s much easier to catch these errors running small tests on your laptops than to try and deal with them on the supercomputer.
batch system on supercomputers
once you know your code works with MPI and with restarts, you’re ready to run on the supercomputer!
as i mentioned earlier, you don’t get to use the computer directly. instead, you log into a separate computer called the “login node.” on this computer you prepare a little script with instructions for running your simulation and then submit it to the queue.
for example, to run a simulation of 512 processors for 12 hours, you might use:
#!/bin/bash #SBATCH -J SOME_NAME_FOR_THE_JOB #SBATCH -o myMPI.o%j #SBATCH -n 512 # total number of processors requested #SBATCH -p normal # queue -- normal or development #SBATCH -t 12:00:00 # run time (hh:mm:ss) #SBATCH -A OUR_GRANT_NUMBER #SBATCH --mail-user=YOUR_EMAIL_ADDRESS #SBATCH --mail-type=begin # email me when the job starts #SBATCH --mail-type=end # email me when the job finishes # run athena here ibrun ./athena -t 11:55:00
and save it in a file called stampede.pbs
ibrun
is stampede’s version of mpirun
, and it automatically
keeps track of how many processors to use.
note that i asked for 12 hours, but told athena to quit after 11:55 hours… that gives it 5 minutes to dump its last restart file and exit cleanly before being killed by the supercomputer. that makes sure we’re not left with any half-written files.
you then submit the job using sbatch stampede.pbs
. you’ll get an
e-mail when it starts, and again when it finishes.
you really shouldn’t run anything on the login node, unless it will only take a minute. so to join the files, you’ll want to again use the queue:
#!/bin/bash #SBATCH -J join-vtk #SBATCH -o join.o%j #SBATCH -n 1 #SBATCH -p development #SBATCH -t 01:00:00 #SBATCH -A OUR_GRANT_NUMBER #SBATCH --mail-user=YOUR_EMAIL_ADDRESS #SBATCH --mail-type=begin #SBATCH --mail-type=end ruby ./join-vtk.rb
there’s more information you need – how to log into the computer, how to compile your code, where to run the simulation, and how to copy data back to your laptop. but you just have to memorize those, and it’s easier to go through those in person… they’re particular to whatever computer you use, and how it’s managed. this document just goes over the general principles you’d use on any computer.