pennmem/ptsa

Question about parallel processing

Closed this issue · 3 comments

Hi,

This is less of an issue and more of a question about how the sausage is made.

The documentation for some of the filtering methods notes:

"it utilizes a C++ thread pool to parallelize the computations."

How does this differ from, or improve upon, taking advantage of the Dask integration with xarray?

At the time we were writing the Morlet wavelet filter, the integration of xarray with Dask was not that advanced. So what are the differences? The C++ approach we took uses multiple threads and splits the workload between them. This is all done in C++. The Dask approach operates at the Python level, and you should be able to split work even across different computing nodes. In practice, though, the nice thing about having optimized code in C++ is that it bypasses the scripting-language layer for the most time-critical operations. I suspect that if we benchmarked the two approaches on the same number of CPUs, the C++ solution would be faster. By how much, I am not sure, but at the time of writing the code this was the right solution. We did try parallelization at the Python level back then, but it was always suboptimal compared to pure C++.
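To make the "split the workload between workers" idea concrete, here is a rough Python-level sketch of the same pattern using only the standard library. It is not how ptsa or Dask is implemented: `smooth` is a hypothetical stand-in for the real wavelet filter, and `filter_channels`, `channels`, and `n_workers` are names made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def smooth(signal, width=3):
    """Toy per-channel filter (trailing moving average).

    Hypothetical stand-in for an expensive filter such as a wavelet transform.
    """
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def filter_channels(channels, n_workers=4):
    """Hand each channel to a pool of worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(smooth, channels))


# Eight made-up channels with six samples each.
channels = [[float(c + t) for t in range(6)] for c in range(8)]
filtered = filter_channels(channels)
```

The structure is the same in both worlds: independent pieces of work (here, channels) are handed to a fixed pool of workers. The difference is where the workers run. In standard CPython builds, threads executing pure-Python code like `smooth` take turns holding the interpreter lock, so this sketch gains little for CPU-bound work, whereas threads inside a C++ extension can run the numeric loop without going through the interpreter.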

I see, thanks for the answer!!