Efficiency of statistics
matamadio opened this issue · 1 comments
matamadio commented
Currently we have this loop:
for rp in valid_RPs:
    # Get total population for each ADM2 region
    pop_per_ADM = gen_zonal_stats(vectors=adm_data["geometry"], raster=pop_fn, stats=["sum"])
    result_df[f"{adm_name}_Pop"] = [x['sum'] for x in pop_per_ADM]
    # Load corresponding flood dataset
    flood_data = rxr.open_rasterio(os.path.join(flood_RP_data_loc, f"{country}_RP{rp}.tif"))
At the beginning, it runs the zonal statistics over total population. This should be outside the loop, since the total population does not depend on the RP: it gets extracted 3 times, but the value is always the same.
However, the code fails if I move the line before the loop :(
ConnectedSystems commented
You probably tried to move the last line indicated above, which loads NPL_RP{10, 100, 1000}.tif. That line uses rp in the file name, so it has to stay inside the loop, which is why it fails.
I have moved the population zonal statistics out of the loop now.
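For reference, a minimal sketch of the restructuring, not the actual commit. To keep it runnable without raster files, `fake_zonal_sum` and `fake_load_flood` are hypothetical stand-ins for `gen_zonal_stats` and `rxr.open_rasterio`, and a plain dict stands in for `result_df`. The `calls` counter is only there to show how often each step runs.

```python
# Hypothetical stand-ins for the real raster calls, so the sketch runs without data files.
calls = {"zonal": 0, "flood": 0}

def fake_zonal_sum(regions):
    """Stand-in for the RP-independent population zonal statistic."""
    calls["zonal"] += 1
    return [{"sum": 100.0 * (i + 1)} for i, _ in enumerate(regions)]

def fake_load_flood(country, rp):
    """Stand-in for opening the RP-specific flood raster; returns only the file name."""
    calls["flood"] += 1
    return f"{country}_RP{rp}.tif"

def run(country, regions, valid_rps, adm_name="ADM2"):
    result = {}  # plain dict standing in for result_df

    # Loop-invariant: total population does not depend on rp, so compute it once.
    pop_per_adm = fake_zonal_sum(regions)
    result[f"{adm_name}_Pop"] = [x["sum"] for x in pop_per_adm]

    flood_files = []
    for rp in valid_rps:
        # RP-dependent: stays inside the loop because the file name uses rp.
        flood_files.append(fake_load_flood(country, rp))
    return result, flood_files

result, flood_files = run("NPL", ["region_a", "region_b"], [10, 100, 1000])
print(calls)
```

With three return periods, the population step runs once instead of three times, while the flood raster is still loaded once per RP.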