fmihpc/analysator

Provide basic parallel capabilities

Closed this issue · 3 comments

It is rather easy to use multiprocessing for trivial parallelisation of tasks in Python scripts. At least for some operations, like the ones in #9, it is very useful. I suggest implementing this in the tools where it is useful.

Example:

import numpy as np

data = np.zeros([nBins, len(cellids)])

parallel_job = 1

def parallel_worker(id):
   print(id)
   # Read the velocity cells:
   velocity_cell_data = vlsvReader.read_velocity_cells(cellids[id])
   if len(velocity_cell_data) != 0:
      # Get cells:
      vcellids = list(velocity_cell_data.keys())
      # Get avgs data:
      avgs = list(velocity_cell_data.values())
      # Get a list of velocity coordinates:
      v = vlsvReader.get_velocity_cell_coordinates(vcellids)
      return [id, np.histogram(v[:,0], bins=nBins, weights=avgs, range=vRange)[0]]
   else:
      return [id, np.histogram([0], bins=nBins, weights=[1e-1], range=vRange)[0]]


if parallel_job == 1:
   from multiprocessing import Pool
   if __name__ == '__main__':
      pool = Pool(8)
      xdata = pool.map(parallel_worker, range(len(cellids)))
      for xid, xhist in xdata:
         data[:, xid] = xhist
else:
   for id in range(len(cellids)):
      xid, xhist = parallel_worker(id)
      data[:, xid] = xhist

For MayaVi, velocity space plotting is done via tasks, and the velocity space is pieced into chunks (and parallelized, if I remember correctly, although the main reason is to optimize memory usage).

Some of the NumPy operations can be parallelized with numexpr, but the syntax is a bit clumsy. Anyway, yep, we can implement parallelization in critical areas of the code!
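For reference, the clumsiness is that numexpr takes the expression as a string passed to `numexpr.evaluate`, which it then compiles and evaluates multithreaded without allocating intermediate arrays. A minimal sketch, assuming numexpr is installed:

```python
import numpy as np
import numexpr as ne

a = np.arange(1e6)
b = np.arange(1e6)

# NumPy evaluates this elementwise expression with temporary
# arrays, single-threaded:
c_np = 2.0 * a + 3.0 * b

# numexpr evaluates the same expression from a string, in
# parallel across cores and without the temporaries:
c_ne = ne.evaluate("2.0 * a + 3.0 * b")
```

The string-based syntax is the trade-off for the speedup, which is why it only makes sense in critical areas.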

Also: Nice work on the space-vel plotting, it looks really nice

Let's look into this when it becomes more of a bottleneck. There are some basic parallel capabilities within MayaVi.