monocongo/climate_indices

SPEI notebook

Opened this issue · 4 comments

Hi,
I might be missing a resource, but I can't seem to locate a notebook where the SPEI is calculated, as an example. Having a notebook going through the calculation steps would be very useful, especially when the data is coming from another domain/source and will have to be pre-processed to fit the structure of the algorithm.

The README links to a publication (https://www.researchgate.net/publication/252361460_The_Standardized_Precipitation-Evapotranspiration_Index_SPEI_a_multiscalar_drought_index) that is not publicly available, so checking the function is extremely difficult without a notebook going through the calculation steps.

I checked the Palmer_Drought_Index notebook, which is well documented. I understand the effort that goes into producing these notebooks, but having examples makes the library much more user-friendly.

I look forward to your comments, and I am more than happy to contribute to a notebook that handles data outside the Continental USA.

Thanks for the impetus for this, @GvdDool

In a nutshell, there are core index functions in the indices.py module. These operate on a time series of data, typically monthly or daily values. Usually, datasets are grids of time series data, i.e. one time series at each lat/lon point. We've provided a processing script that applies the index functions to every time series in the grid; this code is contained in __main__.py and is invoked via the command process_climate_indices if the package is installed in a Python environment (conda or venv are recommended). So to create a notebook we'd emulate that script in notebook form.
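As a rough illustration of that pattern (a 1-D function applied to the time series at every grid point), here is a minimal NumPy sketch. The function standardize_series is a hypothetical stand-in, not one of the package's index functions, and the (lat, lon, time) array layout and the synthetic precipitation values are assumptions made for the example.

```python
import numpy as np


def standardize_series(values: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a 1-D index function.

    Returns the z-score of a single time series, ignoring NaNs.
    A real index function would go here instead.
    """
    mean = np.nanmean(values)
    std = np.nanstd(values)
    if not np.isfinite(std) or std == 0:
        # Nothing sensible to compute for constant or all-missing series.
        return np.full(values.shape, np.nan)
    return (values - mean) / std


def apply_to_grid(grid: np.ndarray, func) -> np.ndarray:
    """Apply func to the time series at each (lat, lon) point.

    Assumes grid has shape (lat, lon, time) and that func maps a
    1-D array to a 1-D array of the same length.
    """
    return np.apply_along_axis(func, axis=-1, arr=grid)


# Synthetic monthly precipitation: 3 x 4 grid points, 120 months each.
rng = np.random.default_rng(0)
precip = rng.gamma(shape=2.0, scale=30.0, size=(3, 4, 120))

result = apply_to_grid(precip, standardize_series)
print(result.shape)
```

This runs serially; it only shows the shape of the problem that the processing script solves with multiprocessing.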

Bear in mind that the package is not specific to any geographic region and can be used on any time series of precipitation/temperature data.

I have created an example notebook for SPI which leverages xarray for parallelization but it doesn't include the multiprocessing features included in our processing script. So we can do something similar for SPEI. I'd like to improve the processing script in favor of a more elegant approach in which tools such as xarray or ray are used for the parallelization and multiprocessing, but I've not yet worked out how to do that so we're still using our own Rube Goldberg machine that leans on shared data arrays.

If you have the time and inclination I'd suggest using the SPI notebook referenced above as a template for an SPEI notebook. If you can work out how to use xarray for multiprocessing in a better way then please let me know as I have wanted to crack that nut for years and it never quite worked (for example see this discussion from the last time I tried).

Thanks James (monocongo),
I have not thought about parallelization and multiprocessing yet. I am trying to understand the Palmer Drought Index notebook first, while using the data over my project area. This is already an interesting problem because I would like to use the NEX-GDDP data in the GEE for locations in an area with large elevation differences. This means I need to downscale the input before passing it to the PDSI function.

Parallelization and multiprocessing would help, for sure, but I think I can use a simple loop for now, going over the low-resolution area (X: 26, Y: 30) and working on 14K Uber H3-res10 cells. I know a loop over 14K elements is not elegant either, but I will need to correct for the temperature difference over the 0.11 degree grid, and I don't know how to do this in a vectorised structure (this is above my skill level).
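One possible vectorised form of such an elevation correction is sketched below with NumPy indexing and broadcasting. Everything in it is an assumption made for illustration: the constant lapse rate of 6.5 °C per km, the function name downscale_temperature, the array layout, and the idea that each fine cell already knows the index of the coarse pixel it falls in. It is not the method used in this thread or in the package.

```python
import numpy as np

# Assumed constant environmental lapse rate (degrees C per metre).
LAPSE_RATE_C_PER_M = 0.0065


def downscale_temperature(coarse_temp, coarse_elev, cell_index, cell_elev,
                          lapse_rate=LAPSE_RATE_C_PER_M):
    """Elevation-correct coarse temperatures for many fine cells at once.

    coarse_temp: (n_coarse, n_time) temperatures of the coarse pixels
    coarse_elev: (n_coarse,) mean elevation of each coarse pixel, metres
    cell_index:  (n_cells,) index of the coarse pixel containing each cell
    cell_elev:   (n_cells,) elevation of each fine cell, metres
    Returns an array of shape (n_cells, n_time).
    """
    # Elevation difference between each cell and its parent pixel.
    elev_diff = cell_elev - coarse_elev[cell_index]
    # Integer-array indexing picks each cell's parent series; broadcasting
    # applies the per-cell correction across the time axis.
    return coarse_temp[cell_index] - lapse_rate * elev_diff[:, np.newaxis]


# Two coarse pixels, two time steps, three fine cells.
coarse_temp = np.array([[10.0, 12.0], [20.0, 22.0]])
coarse_elev = np.array([1000.0, 500.0])
cell_index = np.array([0, 0, 1])
cell_elev = np.array([1000.0, 1200.0, 300.0])

fine_temp = downscale_temperature(coarse_temp, coarse_elev, cell_index, cell_elev)
print(fine_temp)
```

The same indexing works unchanged for 14K cells, since no Python-level loop over cells is involved.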

I will keep you posted on my progress, and when possible, I will share a notebook with my solution.

I caution you not to use the PDSI code herein. It is not yet fully vetted and is only included as a "first draft" for someone to use as the start of a proper implementation of the Palmer indices. The results from this code match those of the code used by NOAA, but not in every case, and without a better understanding of why this or any other PDSI implementation is valid, I don't recommend using this one. It was derived from the NOAA PDSI Fortran plus a few other implementations I managed to find (for example, there is no self-calibration in the NOAA version). I am not a climate scientist myself and leaned heavily on the scientific staff at NOAA to vet the results and methodology. The resident expert on Palmers there, Richard Heim, just didn't have the bandwidth to hammer out the last details, so here we are with what's essentially an unfinished project. The SPI, SPEI, PET, and others in this package are solid, but the Palmers are only included for completeness and, hopefully, as a launching point for others to continue or finish the work.

Thanks for the heads-up. The underlying functions in the notebook looked very complete (I am not an expert in drought myself and only have a basic understanding of soil properties, coming from a groundwater background). As I said in my previous post, I have to deal with the downscaling first, and PDSI was a nice extra, as I selected the KBDI as the main index, supported by SPI/SPEI.