monocongo/climate_indices

cannot reshape array when calculating the SPEI from precipitation and temperature

Opened this issue · 3 comments

Describe the bug
I use monthly precipitation and air temperature from Jan 1995 to June 2021 to calculate SPEI. process_climate_indices raises a ValueError: cannot reshape array of size 416697660 into shape (465,705,318). The error does not occur when the intermediate pet_thornwaite.nc is used as input (although the output netcdfs sadly contain only NaN).

To Reproduce
The original netCDF are from EOBS (variables tg and rr) and aggregated to monthly mean temperature tg and total precipitation rr using cdo.

The original dimensions and variables longitude and latitude were renamed to lon and lat:

ncrename -d longitude,lon -d latitude,lat -v longitude,lon -v latitude,lat rr_monthly_1995-2021.nc spei/rr_monthly_1995-2021.nc
ncrename -d longitude,lon -d latitude,lat -v longitude,lon -v latitude,lat tg_monthly_1995-2021.nc spei/tg_monthly_1995-2021.nc

The dimensions were reordered according to the example files, i.e. (lat, lon, time) and the variable "time_bnds" removed.

ncpdq -a "lat,lon,time" rr_monthly_1995-2021.nc rr_monthly_1995-2021_reordered.nc
ncks --fix_rec_dmn lat rr_monthly_1995-2021_reordered.nc -o outfixed.nc ; mv outfixed.nc rr_monthly_1995-2021_reordered.nc
ncks --mk_rec_dmn time rr_monthly_1995-2021_reordered.nc -o outunlim.nc ; mv outunlim.nc rr_monthly_1995-2021_reordered.nc
ncks -C -O -x -v time_bnds rr_monthly_1995-2021_reordered.nc outfixed.nc ; mv outfixed.nc rr_monthly_1995-2021_reordered.nc
ncpdq -a "lat,lon,time" tg_monthly_1995-2021.nc tg_monthly_1995-2021_reordered.nc
ncks --fix_rec_dmn lat tg_monthly_1995-2021_reordered.nc -o outfixed.nc ; mv outfixed.nc tg_monthly_1995-2021_reordered.nc
ncks --mk_rec_dmn time tg_monthly_1995-2021_reordered.nc -o outunlim.nc ; mv outunlim.nc tg_monthly_1995-2021_reordered.nc
ncks -C -O -x -v time_bnds tg_monthly_1995-2021_reordered.nc outfixed.nc ; mv outfixed.nc tg_monthly_1995-2021_reordered.nc

_validate_args() has to be modified (#465, climate_indices.main.revised.py.zip) before running...
process_climate_indices --index spi --periodicity monthly --var_name_precip rr --netcdf_precip rr_monthly_1995-2021_reordered.nc --var_name_temp tg --netcdf_temp tg_monthly_1995-2021_reordered.nc --output_file_base eobs_spi_ --scales 12 --multiprocessing all_but_one --calibration_start_year 1995 --calibration_end_year 2021

Expected behavior
The SPEI should be calculated, or a meaningful error message shown.

Desktop (please complete the following information):

  • OS: Ubuntu
  • Version: 18.04

Additional context
The file eobs_spei__pet_thornthwaite.nc is created. Replacing SPEI by SPI works.

The reported array size is 2 * time * lat * (lon * 2 - 1) = 416697660. The shape of my netCDFs is (465,705,318) = (lat, lon, time). I don't know where the -1 and the *2's are coming from. The error occurs in function _apply_along_axis_double: first_np_array = np.frombuffer(first_array.get_obj()).reshape(shape)

I'm not sure if this is exactly the same issue you described, but I was getting similar error messages and in my situation it seems like it might be a result of how the netcdfs are being loaded with xr.open_mfdataset() in _compute_write_index(). Specifically, it looks like the latitude dimension is being doubled (maybe since there are two data variables that are being merged/combined?).

I tried a bunch of different input parameter options to open_mfdataset() and eventually found that setting join='override' made it so the latitude dimension didn't change. The data variable arrays have the correct shape now and don't end up all nodata.

Sorry it's taken me so long to get around to this. Have you tried the fix suggested here, @itati01 ? What was necessary to change in the arguments validation function?

Hi,
I used another tool to achieve the calculation, so forgot to test the workaround. It works (although I did not examined the output file)! So, in addition to PR #465 for the validation function, we seemingly need _dataset = xr.open_mfdataset(files, chunks=chunks, join='override') in _compute_write_index().