GFDRR/CCDR-tools

Parallel code refinement

matamadio opened this issue · 2 comments

Parallelization WORKS on Linux and Win! Thanks @artessen and @ConnectedSystems for this magic!

Some issues remain to be solved:

  • It works for functions, but not for classes
  • The code can be more efficient: we don't need to calculate EAI as a raster. EAI is calculated on the table output, after zonal aggregation, and presented as an output chart. See #2
  • The zonal step seems to introduce an overestimation error ONLY for total Pop; when tested on DOM against the original code and QGIS zonal statistics, the error is between 1% and 50%. The raster creation has been checked and is correct according to the exported byproducts.
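    # Imports assumed by this excerpt
    import itertools as it
    import multiprocessing as mp
    from functools import partial
    import numpy as np

    with mp.Pool(cores) as p:
        # Get total exposure for each ADM region
        func = partial(zonal_stats_partial, raster=exp_ras, stats="sum")
        stats_parallel = p.map(func, np.array_split(adm_data.geometry, cores))

    exp_per_ADM = list(it.chain(*stats_parallel))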
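
To narrow down where the overestimation enters, one option is to check that the split/map/chain pattern above returns exactly what a serial run returns, and then compare per-ADM values against a reference. The sketch below is only an illustration of that check: `fake_zonal_sum` is a hypothetical stand-in for `zonal_stats_partial`, `split_evenly` mimics the chunk sizes of `np.array_split` in pure Python, and the zone values are made up.

```python
from itertools import chain


def split_evenly(items, n_chunks):
    """Split a list into n_chunks contiguous parts (same sizes as np.array_split)."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional item
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def fake_zonal_sum(zones):
    """Hypothetical stand-in for zonal_stats_partial: one {'sum': ...} dict per zone."""
    return [{"sum": sum(cells)} for cells in zones]


def relative_difference(test, reference):
    """Per-zone (test - reference) / reference; NaN where the reference is zero."""
    return [
        (t["sum"] - r["sum"]) / r["sum"] if r["sum"] else float("nan")
        for t, r in zip(test, reference)
    ]


# Made-up cell values for five zones
zones = [[1, 2], [3], [4, 5, 6], [7], [8, 9]]

serial = fake_zonal_sum(zones)
chunked = list(chain.from_iterable(fake_zonal_sum(c) for c in split_evenly(zones, 2)))

# If chunking and chaining preserve order and content, both runs agree
assert chunked == serial
```

If the same comparison holds with the real function and geometries, the chunking itself is probably not the source of the error, and `relative_difference` can then be pointed at the QGIS results to see which ADM units deviate.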

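
On the EAI point in the list above, here is a minimal sketch of computing EAI from the aggregated table instead of from a raster. It assumes the table holds one impact value per return period for each ADM unit, and it uses a simple lower-bound sum over exceedance-frequency intervals, which may differ from the formula used in the tool. The function name, ADM names, return periods, and values are all illustrative.

```python
def expected_annual_impact(impacts_by_rp):
    """Lower-bound EAI from a {return_period_years: impact} mapping.

    Each impact is weighted by the difference between its own exceedance
    frequency (1 / RP) and that of the next, rarer return period; the
    rarest event is weighted by its full frequency.
    """
    rps = sorted(impacts_by_rp)
    eai = 0.0
    for i, rp in enumerate(rps):
        freq = 1.0 / rp
        next_freq = 1.0 / rps[i + 1] if i + 1 < len(rps) else 0.0
        eai += impacts_by_rp[rp] * (freq - next_freq)
    return eai


# Illustrative table after zonal aggregation: ADM unit -> impact per return period
table = {
    "ADM_A": {10: 100.0, 100: 1000.0},
    "ADM_B": {10: 0.0, 100: 250.0},
}

eai_per_adm = {adm: expected_annual_impact(row) for adm, row in table.items()}
```

Working on the table this way needs only one zonal aggregation per return period and no intermediate EAI raster.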

Thanks Arthur for refining the code.