OCHA-DAP/pa-anticipatory-action

Ensure correct computation observed terciles

Closed this issue · 4 comments

We are using CHIRPS data to compute historically observed below average precipitation.
This code was written some time ago, but never properly code reviewed.
Since we are now using it more often, it would be good to have a second pair of eyes looking at it.

The code can be found here (line 195-258).
The output is for example used for Chad here

In general it looks decent to me but why I want to double check is because

  1. in both bfa and tcd we see much more prevalence of below average in the 80's and 90's than in the last 20 years. Maybe precipitation patterns or technologies changed but want to be sure it is not us
  2. We don't see great correspondence between years with bavg precipitation and reported drought years. I am not perse surprised about this, but better be sure before we make claims on that.

As sanity checks I did simple things like making sure the occurence of below average is 1/3. I also manually compared a subset of the dates to the anomalies on CHC's early warning explorer but mainly for more recent dates.

afbeelding
Actually the prevalance of bavg precipiation in the 80s might be correct, see e.g. this messy graph from here

I've looked very carefully at this method (worked through an example myself) and it all looks good to me. Some minor comments but I don't think they will have any effect on the results:

  • for the quantile function:
    • it could be good to set skipna = True
    • instead of passing 0.33 I would pass 1/3 (more precise)
  • you don't need to change it now, but I think a cleaner way to indicate which pixels fall in each tercile could be to use a mask and add it as an additional variable to the dataset. Something like:
ds_lt = ds_season.groupby('month').apply(lambda x: 
                x.where(x < ds_season_climate_quantile.sel(month=x.time.dt.month)))
ds_season = ds_season.assign({'lower': ds_lt['precip']})

As for the notebook, I think it looks good too! I checked things like plotting all the rasters and calculating total number of pixels in the tercile, etc.

Sometimes there are weird edges in the boundaries:
image
But as far as I can tell they are correct.

This is a plot of the fraction of pixels with below-average rainfall over time:
image
so essentially the inverse of the plot you posted. I think the trends kind of match -- e.g. the rainfall peak in 1995 seems present.

Did I ever say before you are a genius? ;) Thanks a lot, that gives a lot more confidence! Working on implementing your excellent ideas, will get back to it later.

This has been inspected by #250 with the conclusion that methodologically it is correct.
Some code improvements have been implemented in #254