Missing Pixel or Grid Value in the output while Calculating Standard Precipitation Index (SPI) using climate_indices library in Python
Opened this issue · 3 comments
I am using NetCDF Gridded dataset of IMD Rainfall (NetCDF can be downloaded from here), for calculating the Standard Precipitation Index (SPI), using climate_indices library of Python
I have changed all the null values to Zero in the Input file.
still the output is giving null values,
even though I have used quite long term data of around 72 years from 1951 to 2022;
import xarray as xr
import pandas as pd
import numpy as np
import climate_indices
from climate_indices import indices
import matplotlib.pyplot as plt
# ########### Open Dataset ##########################################
p = xr.open_dataset(r"C:\Climdex\RF_NC\RF_Single\RF_Monthly_India.nc")
da_precip = p.rf
# replace all values equal to -999 with np.nan
da_precip = da_precip.da_precip(monthly['rf'] != -999.0)
# replace all np.nan values equal to 0
da_precip.fillna(0)
da_precip_groupby = da_precip.stack(point=('lat', 'lon')).groupby('point')
# apply SPI to each `point`
scale = 1
distribution = climate_indices.indices.Distribution.gamma
data_start_year = 1951
calibration_year_initial = 1951
calibration_year_final = 2022
periodicity = climate_indices.compute.Periodicity.monthly
da_spi = xr.apply_ufunc(indices.spi,
da_precip_groupby,
scale,
distribution,
data_start_year,
calibration_year_initial,
calibration_year_final,
periodicity)
da_spi
# unstack the array back into original dimensions
da_spi = da_spi.unstack('point')
# da_spi
legend_levels = [-3,-2,-1,0,1,2,3]
da_spi.sel(time='2015').plot(cmap='RdBu', col='time', col_wrap=4, levels=legend_levels)
How to correct it ?
I have run a slightly modified version of your code using the latest version of this package and I see similar results:
import climate_indices
from climate_indices import indices
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import xarray as xr
# open the rainfall dataset
dataset_path = "/Users/jadams/data/climate/RF_Monthly_India.nc"
ds = xr.open_dataset(dataset_path)
da_precip = ds.rf
# replace all values equal to -999 with np.nan, then all np.nans to 0
da_precip = xr.where(da_precip == -999.0, np.nan, da_precip)
da_precip = da_precip.fillna(0.0)
# group lat/lon cells to facilitate parallelization via xarray's apply_ufunc()
da_precip_groupby = da_precip.stack(point=('lat', 'lon')).groupby('point')
# apply SPI to each `point` in parallel
scale = 9
distribution = climate_indices.indices.Distribution.gamma
data_start_year = 1951
calibration_year_initial = 1951
calibration_year_final = 2022
periodicity = climate_indices.compute.Periodicity.monthly
da_spi = xr.apply_ufunc(indices.spi,
da_precip_groupby,
scale,
distribution,
data_start_year,
calibration_year_initial,
calibration_year_final,
periodicity)
# unstack the array back into original dimensions
da_spi = da_spi.unstack('point')
# plot the results
legend_levels = [-3,-2,-1,0,1,2,3]
da_spi.sel(time='2022').plot(cmap='RdBu', col='time', col_wrap=4, levels=legend_levels)
What I've noticed is that this seems to disappear as the number of"scale" months go up. For 1, 2, 3-month SPI etc. we maybe don't have enough non-zero data for the calculation to give meaningful results, so it returns with null values for those pixels? That's just speculation and we'll need to isolate a pixel where this is happening and follow the code with a debugger to really work it out, at least that's what this kind of thing has involved for me in the past.
BTW thanks @abhilashsinghimd for the detailed issue report with good code and example error results, it makes looking into these things much easier.