optimize and parallelize processing

Question

Closed a year ago · 1 comment

Preprocessing is starting to look like the bottleneck in seismic imaging as numerical simulations get faster and faster. Although the processing step is small, we usually do it inefficiently, running everything in serial and performing many I/O operations.

I have taken a few stabs at parallel processing using Python's concurrent.futures, treating the work as embarrassingly parallel and spreading it across multiple processes. I got this working to some extent, but it has also led to computer-crashing runs on my Linux workstation, with almost no error handling. My thinking was that the futures overconsumed the available RAM and the OS and monitor were no longer able to keep up (eek).
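One possible way to keep memory in check with concurrent.futures is to cap how many futures exist at once and to catch each task's exception individually. The sketch below is only an illustration of that idea, not the code in Pyatoa: `process_event`, `run_bounded`, and the `max_in_flight` parameter are hypothetical names, and whether RAM was really the cause of the crashes is unconfirmed.

```python
from concurrent.futures import (
    FIRST_COMPLETED,
    ProcessPoolExecutor,
    ThreadPoolExecutor,
    wait,
)
from itertools import islice


def process_event(event_id):
    """Hypothetical stand-in for the real per-event preprocessing."""
    return event_id * event_id


def run_bounded(func, items, max_workers=2, max_in_flight=4, use_threads=False):
    """Run func over items, never holding more than max_in_flight futures.

    Items must be hashable because they are used as dictionary keys.
    Returns (results, errors), both keyed by item.
    """
    executor_cls = ThreadPoolExecutor if use_threads else ProcessPoolExecutor
    results, errors = {}, {}
    todo = iter(items)
    with executor_cls(max_workers=max_workers) as pool:
        # Submit a limited first batch instead of queueing every task up front.
        pending = {
            pool.submit(func, item): item for item in islice(todo, max_in_flight)
        }
        while pending:
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                item = pending.pop(future)
                try:
                    results[item] = future.result()
                except Exception as exc:
                    # Record the failure and keep going with the other items.
                    errors[item] = exc
                # Top up: one new task for each finished one, if any remain.
                for nxt in islice(todo, 1):
                    pending[pool.submit(func, nxt)] = nxt
    return results, errors


if __name__ == "__main__":
    results, errors = run_bounded(process_event, range(20))
    print(f"{len(results)} succeeded, {len(errors)} failed")
```

This only limits the number of queued and running tasks; it does not limit how much memory a single task uses, so `max_workers` still has to be chosen with the per-event footprint in mind. Setting `use_threads=True` swaps in a thread pool, which can be handy for debugging or for steps dominated by I/O.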

Parallel implementation was located here: https://github.com/adjtomo/pyatoa/blob/devel/pyatoa/core/executive.py

This is a bookmark to remind myself to get back to this, perhaps with a different approach (mpi4py, multiprocessing?), and to get things working on HPC systems.
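For the MPI route, a minimal sketch of how the work might be divided: each rank takes every `size`-th event, so no communication is needed for an embarrassingly parallel job. The helper names `get_rank_and_size` and `my_share` are hypothetical and not part of Pyatoa; the only mpi4py calls used are `MPI.COMM_WORLD`, `Get_rank()`, and `Get_size()`, with a serial fallback if mpi4py is not installed.

```python
def get_rank_and_size():
    """Return (rank, size) from mpi4py if available, else (0, 1) for a serial run."""
    try:
        from mpi4py import MPI
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def my_share(items, rank, size):
    """Round-robin split: rank r takes items r, r + size, r + 2 * size, ..."""
    return list(items)[rank::size]


if __name__ == "__main__":
    rank, size = get_rank_and_size()
    # Hypothetical event identifiers; real ones would come from the workflow.
    events = [f"event_{i:03d}" for i in range(10)]
    for event in my_share(events, rank, size):
        print(f"rank {rank}/{size} would process {event}")
```

Launched under an MPI launcher such as `mpirun -n 4 python script.py`, each process would pick up its own slice; run directly, it falls back to a single rank (or to a one-process MPI world when mpi4py is installed) and handles everything.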

Answer 1 · 2023-08-25T20:20:41.000Z

This was addressed with the Executive class and the MPI example. Responsibility for multiprocessing in larger workflow tools is left to the discretion of the tool calling Pyatoa.