running athena in parallel

you can speed up your simulations by running athena in parallel: the code splits the simulation domain into N equal-sized chunks and runs each one on a separate processor. on each chunk, the boundary conditions come from the simulations running on the neighboring processors. athena uses MPI (the message passing interface) to communicate this information among processors. i’d recommend not looking into the details of MPI unless you have to.

configuring and compiling

to make use of multiple processors, you first have to re-compile athena with MPI enabled

$ ./configure --with-gas=hydro --with-problem=... --enable-mpi
$ make all

(you may need to install MPI before doing this… if which mpirun doesn’t give you anything, try $ brew install open-mpi)
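once it’s installed, a quick sanity check (assuming open-mpi; other MPI distributions ship similar commands):

$ which mpicc mpirun
$ mpirun --version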

setting up the simulation

next, you need to tell athena how many processors to use. this happens via the NGrid_x* entries in the athinput file. for example:

<domain1>
level     =  0
Nx1       =  8192
x1min     = -2.5e3
x1max     =  2.5e3
bc_ix1    =  4
bc_ox1    =  4

Nx2       =  8192
x2min     = -2.5e3
x2max     =  2.5e3
bc_ix2    =  4
bc_ox2    =  4

Nx3       =  1
x3min     = -0.5
x3max     =  0.5
bc_ix3    =  4
bc_ox3    =  4

NGrid_x1  =  16
NGrid_x2  =  32
NGrid_x3  =  1

this runs an (8192)^2 simulation on 16*32 = 512 processors. i’ve used this for some recent simulations on a supercomputer. you won’t have 512 processors on your laptops, so maybe try NGrid_x1 = NGrid_x2 = 2 to begin with.
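to start small, that might look like (just a sketch… keep the Nx* values whatever your problem needs, as long as each is divisible by the corresponding NGrid_x*):

NGrid_x1  =  2
NGrid_x2  =  2
NGrid_x3  =  1

which asks for 2*2*1 = 4 processors.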

when you choose your own parameters, remember that the “surface” cells of each sub-domain need to be communicated as boundary conditions at the interfaces between processors. this communication happens at every time-step and can easily become a bottleneck if you’re not careful. so you don’t want to make the sub-domains too small, and you want to make them as “square” as possible to minimize the surface-area-to-volume ratio.
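to make this concrete with some rough counting: the (8192)^2 run above splits into 16*32 sub-domains of 512*256 cells each. the work per time-step scales with the area of a sub-domain (512*256 = 131,072 cells), while the communication scales with its perimeter (2*(512+256) = 1,536 cells per variable), about 1% of the work. shrink the sub-domains to 64*64 and the perimeter-to-area ratio climbs to 2*(64+64)/(64*64), about 6%, so a much bigger fraction of each time-step goes to communication.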

at the same time, supercomputers are organized hierarchically into “nodes,” each of which contains some number of processors. on the stampede computer, each node contains 16 processors, so the total number we ask for needs to be an integer multiple of 16 (eg, 512 processors = 32 nodes * 16 processors/node).

(if you know the distinction between cores and processors, i’m really talking about cores here.)

running your simulation

to run your simulation, you need to launch it through MPI:

$ mpirun -np 4 ./athena -i athinput

again, it’s best not to go into the details of what happens here… you shouldn’t need them. this example runs on 4 processors; the number after -np must match the total number you ask for in the athinput file, ie NGrid_x1 * NGrid_x2 * NGrid_x3.

you’ll notice that athena makes four new directories: id0, id1, id2, and id3. each one contains vtk files with data from its corresponding processor. to read the data into mathematica, you first need to join these files.

joining VTK files

athena provides a small c program for joining vtk files. i’ve copied it into this directory. to use it, first compile it:

$ gcc -o join_vtk.x join_vtk.c -lm

this makes an executable called join_vtk.x. you can run it directly, but i’ve made a wrapper program called join-vtk.rb which should make it easier for you. in the simulation directory, run

$ ruby join-vtk.rb

and it will make a directory called merged containing all of the merged vtk files. you can go ahead and analyze these in the same way you’ve been working with the files from your single-processor (“serial”) simulations.
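(if you ever want to run join_vtk.x by hand instead, the invocation is something like this, though i’m quoting the flags from memory, so check the usage message it prints:

$ ./join_vtk.x -o merged/sim.0000.vtk id*/*.0000.vtk

this stitches the per-processor dumps for output 0000 into a single file.)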

restart files

supercomputers usually use a kind of time-share system… you don’t get to run your job directly on the computer. instead, you submit instructions for running it and request the resources to do so. it goes into what’s called a “queue,” and some software schedules all the jobs to run when the resources become available.

in order to keep you from hogging the machine, supercomputers usually only allow your job to run for some fixed length of time. after that, the system kills it, and you have to resubmit and return to the back of the line.

so you’ll need to be able to use “restart” files, which periodically save the simulation grid to the hard drive and enable you to pick up the simulation where you left off.

setting up the simulation

first, you need to ask athena to give you restart files. this is very similar to the VTK files you know about:

<output3>
out_fmt  =  rst
dt       =  1.0e3
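(in a parallel run, each processor writes its own restart file into its idN directory. if i remember the naming right, files from processors other than 0 also pick up an -idN tag, something like id1/sim-id1.0000.rst, which is why the commands later in this document glob over all the idN directories rather than assuming a single name.)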

fixing the code

remember the problem() function that you work on to define your simulation? it has a few jobs:

  1. read in parameters from the athinput file (eg, par_getd())
  2. set up the initial condition by writing values to the grid
  3. set up special boundary conditions, turn on gravity, history outputs, etc.

the function problem() gets called before the simulation starts. there’s an analogous function problem_read_restart() which gets called before a restarted simulation begins. you need to do all of the same things here as in problem() except for #2: since the simulation is already in progress, you don’t want to overwrite the data with your initial condition!

everything else needs to be exactly the same in both problem() and problem_read_restart(). (eg, if you change the order of the history functions, the order of the columns will change halfway through the file and it becomes really hard to read.) i will often make a function set_vars() and call it from both.
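here’s a minimal sketch of that pattern (the parameter vflow and the history function hst_vflow are made up for illustration):

static Real vflow;   /* example global, read from the athinput file */

/* hypothetical history column: just dumps the current vflow */
static Real hst_vflow(const GridS *pG, const int i, const int j, const int k)
{
  return vflow;
}

/* everything shared between a fresh start and a restart goes here */
static void set_vars(void)
{
  vflow = par_getd("problem", "vflow");        /* read parameters */
  dump_history_enroll(hst_vflow, "<vflow>");   /* history columns… keep the order fixed! */
  /* gravity, etc. would also go here.  (boundary functions need a
     Domain, so they stay in problem() / problem_read_restart(); see below) */
}

void problem(DomainS *pDomain)
{
  set_vars();
  /* ... then loop over the grid and write the initial condition */
}

void problem_read_restart(MeshS *pM, FILE *fp)
{
  set_vars();
  /* ... then fread() anything appended by problem_write_restart() */
}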

DANGER: one last thing to think about. if you have a global variable that you use in userwork_in_loop or anywhere else, it also needs to be communicated to the restarted simulation. to do this, you append it to the end of the restart file using problem_write_restart(). for example, if you use variables x_shift and vflow to keep track of the shock interface and follow it in your simulations, you need to save these as well. you simply write them to the end of the file:

void problem_write_restart(MeshS *pM, FILE *fp)
{
#ifdef FOLLOW_SHOCK
  fwrite(&x_shift, sizeof(Real), 1, fp);
  fwrite(&vflow,   sizeof(Real), 1, fp);
#endif

  return;
}

you then read these back in the problem_read_restart() function:

void problem_read_restart(MeshS *pM, FILE *fp)
{
  int nl,nd;

  /* set boundary conditions.  since this function takes a Mesh, you
     need to "find" the Domain */
  for (nl=0; nl<(pM->NLevels); nl++) {
    for (nd=0; nd<(pM->DomainsPerLevel[nl]); nd++) {
      if (pM->Domain[nl][nd].Disp[0] == 0)
        bvals_mhd_fun(&(pM->Domain[nl][nd]), left_x1, bc_ix1);
    }
  }

  /* par_getd statements copied from problem() */

  /* dump_history_enroll statements copied from problem() */

  /* gravity, etc. statements copied from problem() */

  /* DANGER: make sure the order here matches the order in problem_write_restart() */
#ifdef FOLLOW_SHOCK
  fread(&x_shift, sizeof(Real), 1, fp);
  fread(&vflow,   sizeof(Real), 1, fp);
#endif

  return;
}

running and restarting your simulations

once you have all this set up, you can run athena with a time constraint:

$ ./athena -t 00:02:00

the format is hh:mm:ss, so the above should quit after two minutes. in each of the idN/ directories, you should find restart files:

$ ls id0/*.rst
id0/sim.0000.rst id0/sim.0001.rst id0/sim.0002.rst

to restart the simulation, take the latest set of rst files and move them into the simulation directory:

$ find id* -name '*.0002.rst' | xargs -I % mv % ./

now you can restart the simulation, using the same number of processors as before:

$ mpirun -np 4 ./athena -t 00:02:00 -r sim.0002.rst

before running anything on the supercomputer, you should always test that the restarts work on your laptops. run one low-resolution simulation all the way through, then run another, restarting it once or twice. at the end, the outputs should be identical – if there’s any difference, it indicates a bug in the problem_read_restart() function that needs to be fixed.
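for the serial case, that test might look something like the following sketch (the output numbers are placeholders, i’m assuming the output basename is sim, and i’m using athena’s -d flag to send each run’s output to its own directory):

$ ./athena -i athinput -d full
$ ./athena -i athinput -d restarted -t 00:01:00
$ cd restarted && ../athena -r sim.0001.rst && cd ..
$ cmp full/sim.0010.vtk restarted/sim.0010.vtk    # no output means the files match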

you should also do this for a couple of different processor configurations (2x1, 1x2, 2x2, and serial) to make sure nothing funny happens at the internal MPI boundaries. it’s much easier to catch these errors running small tests on your laptops than to try and deal with them on the supercomputer.

batch system on supercomputers

once you know your code works with MPI and with restarts, you’re ready to run on the supercomputer!

as i mentioned earlier, you don’t get to use the computer directly. instead, you log into a separate computer called the “login node.” on this computer you prepare a little script with instructions for running your simulation and then submit it to the queue.

for example, to run a simulation on 512 processors for 12 hours, you might use:

#!/bin/bash
#SBATCH -J SOME_NAME_FOR_THE_JOB
#SBATCH -o myMPI.o%j
#SBATCH -n 512              # total number of processors requested
#SBATCH -p normal           # queue -- normal or development
#SBATCH -t 12:00:00         # run time (hh:mm:ss)
#SBATCH -A OUR_GRANT_NUMBER
#SBATCH --mail-user=YOUR_EMAIL_ADDRESS
#SBATCH --mail-type=begin   # email me when the job starts
#SBATCH --mail-type=end     # email me when the job finishes

# run athena here
ibrun ./athena -t 11:55:00

and save it in a file called stampede.pbs.

ibrun is stampede’s version of mpirun, and it automatically keeps track of how many processors to use.

note that i asked for 12 hours but told athena to quit after 11 hours and 55 minutes… that gives it 5 minutes to dump its last restart file and exit cleanly before the supercomputer kills the job, so we’re not left with any half-written files.

you then submit the job using sbatch stampede.pbs. you’ll get an e-mail when it starts, and again when it finishes.
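while you wait, you can check on the job with slurm’s squeue command:

$ squeue -u YOUR_USERNAME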

you really shouldn’t run anything on the login node unless it will only take a minute. so to join the vtk files, you’ll want to use the queue again:

#!/bin/bash
#SBATCH -J join-vtk
#SBATCH -o join.o%j
#SBATCH -n 1
#SBATCH -p development
#SBATCH -t 01:00:00
#SBATCH -A OUR_GRANT_NUMBER
#SBATCH --mail-user=YOUR_EMAIL_ADDRESS
#SBATCH --mail-type=begin
#SBATCH --mail-type=end

ruby ./join-vtk.rb

there’s more you need to know – how to log into the computer, how to compile your code, where to run the simulation, and how to copy data back to your laptop. but those details are things you just have to memorize, and it’s easier to go through them in person… they’re particular to whatever computer you use and how it’s managed. this document just covers the general principles you’d use on any machine.