ecmwf/cfgrib

cfgrib not reading multiple accumulated precipitation fields

blaylockbk opened this issue · 9 comments

I have a GRIB2 file from the High-Resolution Rapid Refresh model that includes two accumulated precipitation fields. Here is the output of wgrib2:

1:0:d=2019070100:APCP Total Precipitation [kg/m^2]:surface:0-5 hour acc fcst:
2:320019:d=2019070100:APCP Total Precipitation [kg/m^2]:surface:4-5 hour acc fcst:

As you can see, this file has accumulated precipitation for two different hour ranges: 0-5 hours, and 4-5 hours.

When I read this file with cfgrib (cfgrib.open_datasets), it only shows one of the fields. I wonder if that is because both GRIB messages have the same name, so the second message overwrites the first when the file is read. It appears that cfgrib doesn't recognize that the two fields should be two different xarray variables.

Is there any way to overcome this, or am I missing something? Thanks for any help.

@blaylockbk this looks interesting and you are right, cfgrib probably overwrites the content of one variable with the other.

To suggest a workaround, I need to know the name of a key that distinguishes the two. Can you point me to a sample of the data?

Thanks for looking into this @alexamici. I appreciate it.

This is a subset of a HRRR (high-resolution rapid refresh) file, with just the two accumulated precipitation fields I am working with:
https://github.com/blaylockbk/HRRR_archive_download/raw/master/sample_data/hrrr/20201214/subset_20201214_hrrr.t00z.wrfsfcf12.grib2

For additional context, here is the full HRRR file that the above subset comes from:
https://storage.googleapis.com/high-resolution-rapid-refresh/hrrr.20201214/conus/hrrr.t00z.wrfsfcf12.grib2

The "stepRange" key is different. Running grib_ls shows this:

% grib_ls subset_20201214_hrrr.t00z.wrfsfcf12.grib2
... stepRange    shortName
... 0-12         tp
... 11-12        tp

I'm not sure what cfgrib does with step ranges, but I guess what we want is a single tp variable with a step dimension with 2 steps. Normally, I'd suggest to just use the end step to denote this, but in this case, it's the end steps that are the same!
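To make the point about the end steps concrete, here is a minimal Python sketch that splits the two stepRange strings reported by grib_ls into start and end hours. The helper name split_step_range is hypothetical, and it assumes the "start-end" string form shown above (a bare value such as "12" is treated as start equal to end).

```python
def split_step_range(step_range):
    """Split a 'start-end' stepRange string into integer (start, end) hours."""
    start, sep, end = step_range.partition("-")
    if not sep:
        # Assumption: a bare value such as "12" means start == end.
        end = start
    return int(start), int(end)


# The two stepRange values from the grib_ls output above.
step_ranges = ["0-12", "11-12"]
parsed = [split_step_range(s) for s in step_ranges]

# Collect the distinct end steps across both messages.
end_steps = {end for _, end in parsed}

print(parsed)
print(end_steps)
```

Both messages share a single end step, so the end step alone cannot serve as a coordinate that tells them apart.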

ah, I see.

I tried to filter based on the stepRange when loading it with xarray, but I get a ValueError:

xr.open_dataset(FILE, engine='cfgrib', 
                backend_kwargs={'filter_by_keys': {'stepRange': '0-12'}})
ValueError: 'stepRange' is not in list

@blaylockbk good guess! Unfortunately, stepRange is not normally read. Try adding "read_keys": ["stepRange"] to backend_kwargs.

Thanks! The workaround for now is to open each GRIB message separately:

xr.open_dataset(FILE, engine='cfgrib', 
                backend_kwargs=dict(read_keys=['stepRange'],
                                    filter_by_keys={'stepRange': '11-12'}))

and

xr.open_dataset(FILE, engine='cfgrib', 
                backend_kwargs=dict(read_keys=['stepRange'],
                                    filter_by_keys={'stepRange': '0-12'}))

would read both variables.
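The two calls above can be wrapped in a small helper so that each stepRange is opened with the same keyword arguments. This is only a sketch of the workaround, not something cfgrib provides: the function names step_range_kwargs and open_by_step_range are hypothetical, and the datasets are returned in a dict rather than merged, since both would carry a variable with the same name.

```python
def step_range_kwargs(step_range):
    """Build backend_kwargs that select one stepRange, as in the workaround above."""
    return {
        "read_keys": ["stepRange"],  # so that stepRange can be used as a filter key
        "filter_by_keys": {"stepRange": step_range},
    }


def open_by_step_range(path, step_ranges):
    """Open one xarray Dataset per stepRange and return them keyed by stepRange."""
    import xarray as xr  # imported here so step_range_kwargs works without xarray

    return {
        sr: xr.open_dataset(path, engine="cfgrib", backend_kwargs=step_range_kwargs(sr))
        for sr in step_ranges
    }


# Example (requires xarray, cfgrib and the sample file from this thread):
# datasets = open_by_step_range(FILE, ["0-12", "11-12"])
# datasets["11-12"] would then hold the 11-12 hour accumulation.
```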

It would be nice if cfgrib could read both in the same open call, as Iain suggested, by extending the dimension.

@blaylockbk the upcoming version 0.9.9.0 will not require adding read_keys. Thanks for your report!

thank you for this!