monocongo/climate_indices

Getting error: conflicting sizes for dimension 'lat' for SPI calculation from NetCDF

Closed this issue · 10 comments

Describe the bug
I'm trying to calculate SPI from CHIRPS data in NetCDF format and I'm getting

"ValueError: conflicting sizes for dimension 'lat': length 12 on the data but length 170 on coordinate 'lat'" error from XArray. I've been using the 1.0.9 version, but also tried the 1.0.8 and 1.0.7 versions with the same error.

Any ideas to solve the issue?

To Reproduce
I'm trying to compute SPI with the following command:

spi_chirps_2021_v2 % spi --periodicity monthly --scales 12 --calibration_start_year 1981 --calibration_end_year 2020 --netcdf_precip /_temp/spi_chirps_2021_v2/input_nc/tr_clip_chirps_1months_1980_2021.nc --var_name_precip precip --output_file_base /_temp/spi_chirps_2021_v2/output_nc/tr_chirps --multiprocessing all --save_params /_temp/spi_chirps_2021_v2/Fitting/tr_chirps_fitting.nc --overwrite

Desktop (please complete the following information):

  • OS: MacOS X 10.15.7
  • Python Version : 3.7.11

Please post a link to your data source so I can download it and try this myself. Hopefully, I will be able to spot your problem.

Dear @monocongo,

You can find the data at the following URL: https://www.dropbox.com/s/8paei26pkdzcflp/tr_chirps-v2.0.monthly.nc?dl=0

Thanks for your attention.

Best regards.

Hi @alperdincer, I think the issue may be with the process you used to create the input files for the processing command. You've given me a link to the original CHIRPS dataset (tr_chirps-v2.0.monthly.nc), but it seems you ran the command on a different file extracted from it (tr_clip_chirps_1months_1980_2021.nc). I'd say the ordering of your dimensions may be off: the time and lat dimensions look swapped, since the error shows 12 being used as the size of the lat dimension, whereas 12 is most likely the size of the time dimension given that we're dealing with monthly data. My hunch is that the issue started in the conversion from the original CHIRPS data to the input file you used. Please post a link to the actual input dataset (tr_clip_chirps_1months_1980_2021.nc) rather than the original source data. I'll bet we'll find the issue there; perhaps the file even carries a history of the NCO commands used to extract it, which may help us resolve this.
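A quick way to test the "swapped dimensions" hunch is to print the dimension order and sizes of the precipitation variable itself. The sketch below is only an illustration: `dim_report` is a hypothetical helper, and the small in-memory dataset stands in for a real file that would normally be loaded with `xr.open_dataset(path)`.

```python
import numpy as np
import xarray as xr


def dim_report(ds, var_name):
    """Return (dimension name, length) pairs in the variable's own order."""
    return list(ds[var_name].sizes.items())


# Stand-in for the real input file: two years of monthly data on a tiny grid.
ds = xr.Dataset(
    {"precip": (("time", "lat", "lon"), np.zeros((24, 3, 5), dtype="float32"))},
    coords={
        "time": np.arange(24),
        "lat": np.linspace(34.2, 42.7, 3),
        "lon": np.linspace(24.3, 46.1, 5),
    },
)

report = dim_report(ds, "precip")
print(report)
```

If a dimension named `lat` turned up here with a length that looks like a number of months, that would support the swapped-dimension explanation.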

hi @monocongo

I've gone through the tutorial at the following link:
https://wfpidn.github.io/SPI/chirpsnc/

  • First I downloaded the original CHIRPS data with :

$ wget -c https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_monthly/netcdf/chirps-v2.0.monthly.nc

  • and cut it with the following coordinates :

$ cdo sellonlatbox,24.342613,46.051598,34.201155,42.685085 chirps-v2.0.monthly.nc tr_chirps-v2.0.monthly.nc

  • The ncdump result of the file is as follows :

netcdf tr_chirps-v2.0.monthly.nc {
dimensions:
time = UNLIMITED ; // (488 currently)
longitude = 434 ;
latitude = 170 ;
variables:
float time(time) ;
time:standard_name = "time" ;
time:units = "days since 1980-1-1 0:0:0" ;
time:calendar = "gregorian" ;
time:axis = "T" ;
float longitude(longitude) ;
longitude:standard_name = "longitude" ;
longitude:long_name = "longitude" ;
longitude:units = "degrees_east" ;
longitude:axis = "X" ;
float latitude(latitude) ;
latitude:standard_name = "latitude" ;
latitude:long_name = "latitude" ;
latitude:units = "degrees_north" ;
latitude:axis = "Y" ;
float precip(time, latitude, longitude) ;
precip:standard_name = "convective precipitation rate" ;
precip:long_name = "Climate Hazards group InfraRed Precipitation with Stations" ;
precip:units = "mm/month" ;
precip:_FillValue = -9999.f ;
precip:missing_value = -9999.f ;
precip:time_step = "month" ;
precip:geostatial_lat_min = -50.f ;
precip:geostatial_lat_max = 50.f ;
precip:geostatial_lon_min = -180.f ;
precip:geostatial_lon_max = 180.f ;

  • Then I changed the variable names to match the climate_indices package's rules:

$ cdo chname,longitude,lon tr_chirps-v2.0.monthly.nc tr_chirps-v2.0.monthly_1.nc
$ cdo chname,latitude,lat tr_chirps-v2.0.monthly_1.nc tr_chirps-v2.0.monthly_2.nc
$ cdo -setattribute,precip@units="mm" tr_chirps-v2.0.monthly_2.nc tr_chirps-v2.0.monthly_3.nc

  • I also tested with the NCO commands:
    $ ncrename -d longitude,lon -d latitude,lat -v longitude,lon -v latitude,lat tr_chirps-v2.0.monthly.nc -O tr_chirps-v2.0.monthly_2.nc

$ ncatted -a units,precip,modify,c,'mm' turkey_chirps-v2.0.monthly_2.nc turkey_chirps-v2.0.monthly_3.nc

  • The ncdump output of the new file (from the CDO workflow) is as follows:
    netcdf turkey_chirps-v2.0.monthly_3 {
    dimensions:
    time = UNLIMITED ; // (488 currently)
    lon = 434 ;
    lat = 170 ;
    variables:
    float time(time) ;
    time:standard_name = "time" ;
    time:units = "days since 1980-1-1 0:0:0" ;
    time:calendar = "gregorian" ;
    time:axis = "T" ;
    float precip(time, lat, lon) ;
    precip:standard_name = "convective precipitation rate" ;
    precip:long_name = "Climate Hazards group InfraRed Precipitation with Stations" ;
    precip:_FillValue = -9999.f ;
    precip:missing_value = -9999.f ;
    precip:time_step = "month" ;
    precip:geostatial_lat_min = -50.f ;
    precip:geostatial_lat_max = 50.f ;
    precip:geostatial_lon_min = -180.f ;
    precip:geostatial_lon_max = 180.f ;
    precip:units = "mm" ;
    float lon(lon) ;
    lon:standard_name = "longitude" ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
    lon:axis = "X" ;
    float lat(lat) ;
    lat:standard_name = "latitude" ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;
    lat:axis = "Y" ;

  • Afterwards, I ran the command from the first post to start the index calculation.
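For comparison, the renaming and unit change from the steps above can be sketched in Python with xarray. This is a minimal sketch on a small in-memory dataset that stands in for the clipped CHIRPS file; it is not part of the tutorial or of the climate_indices package, and the grid values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Stand-in for the clipped CHIRPS file, using the original coordinate names.
ds = xr.Dataset(
    {"precip": (("time", "latitude", "longitude"),
                np.zeros((3, 4, 5), dtype="float32"))},
    coords={
        "time": np.arange(3),
        "latitude": np.linspace(34.2, 42.7, 4),
        "longitude": np.linspace(24.3, 46.1, 5),
    },
)
ds["precip"] = ds["precip"].assign_attrs(units="mm/month")

# Rename both the dimensions and their coordinate variables in one step.
renamed = ds.rename({"latitude": "lat", "longitude": "lon"})

# Replace the units attribute on the precipitation variable.
renamed["precip"] = renamed["precip"].assign_attrs(units="mm")

print(renamed["precip"].dims, renamed["precip"].attrs["units"])
```

The rename leaves the variable's dimension order as (time, lat, lon); it does not reorder anything.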

Please let me know if you need any of the data used in this process.

Best regards.

Alper.

This makes more sense now. Once before, the NASA/ARSET folks put together a training using this package and didn't check their work -- it sent quite a few students here with issues and questions. This seems to be another example of that happening. I'm not sure why they can't work with me to make something that actually works as advertised. Sorry for the rant.

My guess is that the issue is not with this package but with the data preparation outlined in the training. But I don't have the time to look into this now to confirm. My recommendation is to contact the creators of the training and ask them if they can conclusively reproduce the results they're showing in the tutorial. If so then you've gone off the rails somewhere, and if not then the tutorial is in error and should be corrected so users aren't continually frustrated and blame the issues on this package.

I apologize for not being able to help much with this now, deadlines at my day job, etc. I might at some point be able to go through the training myself to find any errors that are causing this to happen for you but I don't want you to be disappointed if it takes me weeks to get around to it, so be aware of that.

Thanks for the feedback. At least I know where to start checking.

If I find the problem before you do, I'll let you know about the progress.

Yes, please do. If there's a way to update their materials if we find an issue there then let's try to help with that as well, because otherwise it's very well put together and obviously a lot of work has gone into it. However, if this works out to be an actual bug with this code then of course we are eager to find and fix the problem. This package has fallen out of date and some other things are not working as before so anything is possible. I hope to be able to take a weekend and really clean house soon!

I ran into this bug today and figured out the cause. The issue is that my input files had the dims and the coords in different orders from each other, which is perfectly valid for netcdf. When building the fitting params and the final spi product DataArrays, the dims list confuses xarray because it implies a data shape that is different from the input numpy array. I have an initial patch for this. I am double-checking it, locally, but I should have a PR submitted soon.

"dims and coords in different orders from each other" -- Erm, probably better described as the dataset's coords being in a different order from the individual variable's dimension order.
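To illustrate the kind of mismatch described above, here is a minimal sketch of how xarray reacts when a `dims` list does not match the layout of the numpy array it labels. It is an assumption about the general mechanism, not a copy of the package's code: the array is laid out as (lat, lon, month), using the lat/lon sizes from the ncdump above, while the second `dims` tuple follows a different order.

```python
import numpy as np
import xarray as xr

n_lat, n_lon, n_month = 170, 434, 12  # lat/lon sizes taken from the ncdump above

coords = {
    "lat": np.linspace(34.2, 42.7, n_lat),
    "lon": np.linspace(24.3, 46.1, n_lon),
    "month": np.arange(n_month),
}

# Array laid out as (lat, lon, month), e.g. one fitted value per calendar month.
values = np.zeros((n_lat, n_lon, n_month), dtype="float32")

# dims match the array layout: this works.
ok = xr.DataArray(values, coords=coords, dims=("lat", "lon", "month"))

# dims follow a different order than the array layout: xarray refuses,
# because the implied length of each dimension no longer matches its coordinate.
error_message = None
try:
    xr.DataArray(values, coords=coords, dims=("month", "lon", "lat"))
except ValueError as err:
    error_message = str(err)

print(error_message)
```

In the second call the `lat` label lands on the axis of length 12, which is the same shape of complaint as in the original report.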

@alperdincer Are you able to try out my pull request #448 to see if it solves your problem? My guess is that for you, NCO is being triggered internally to change the dimension order of the precip data, but this doesn't change the coordinate order, so there is a mismatch happening for you. My case is similar since my own code performs the transpose before sending the data into spi, and the coordinate order of the input data is completely different than the expected dimension order.
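As a rough illustration of the idea (not the contents of the pull request), one way to avoid this kind of mismatch is to transpose the input explicitly and then label the output with dims that are written out to match the array layout, instead of deriving them from the dataset's coordinate order. The per-month mean below is only a stand-in for the real fitting computation, and the grid values are made up.

```python
import numpy as np
import xarray as xr

# Two years of monthly data, stored as (time, lat, lon).
precip = xr.DataArray(
    np.random.default_rng(0).random((24, 4, 6)),
    dims=("time", "lat", "lon"),
    coords={
        "time": np.arange(24),
        "lat": np.linspace(34.2, 42.7, 4),
        "lon": np.linspace(24.3, 46.1, 6),
    },
    name="precip",
)

# Put the data into a known layout before handing it to numpy code.
work = precip.transpose("lat", "lon", "time")
values = work.values  # shape (lat, lon, time)

# Stand-in computation: mean per calendar month -> shape (lat, lon, 12).
monthly = values.reshape(4, 6, 2, 12).mean(axis=2)

# Label the result with dims that match the array layout explicitly.
params = xr.DataArray(
    monthly,
    dims=("lat", "lon", "month"),
    coords={
        "lat": work["lat"].values,
        "lon": work["lon"].values,
        "month": np.arange(12),
    },
)

print(params.dims, params.shape)
```

Because the dims tuple is spelled out next to the array that it describes, the order of coordinates in the source dataset no longer matters.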