SciTools/iris-grib

iris-grib choking on zero arrays

Opened this issue · 4 comments

Hi, we're having a problem with occasional failure loading the HRRR data because iris-grib removes the data when it's a zero-array and then claims it never existed in the first place:

bug.grib2.zip

import gribapi
bug_example_input = "bug.grib2"
with open(bug_example_input, 'rb') as f:
    message_id = gribapi.grib_new_from_file(f)
coded_values = gribapi.grib_get_double_array(message_id, "codedValues")
print("codedValues [type={}]: {}".format(type(coded_values), coded_values))

codedValues [type=<class 'numpy.ndarray'>]: [0. 0. 0. ... 0. 0. 0.]

^-- The data is there and valid, but all zeroes.

But iris-grib can't load it:

import iris_grib

cube = next(iris_grib.load_cubes(bug_example_input))
cube.data

<stacktrace removed for brevity> KeyError: "'codedValues' not defined in section 7"

You might wonder if the data's corrupt, that "codedValues" is present but not in section 7. I'll spare you the details but I confirmed that it was in fact in the right place. Unfortunately, there's an upstream call in iris_grib.message._RawGribMessage._get_message_keys to gribapi.grib_skip_computed(keys_itr), which removes the zero array, as demonstrated here:

def summarize_keys(message_id, skip=False):
    keys_itr = gribapi.grib_keys_iterator_new(message_id)
    keys = []
    if skip:
        gribapi.grib_skip_computed(keys_itr)
    while gribapi.grib_keys_iterator_next(keys_itr):
        key = gribapi.grib_keys_iterator_get_name(keys_itr)
        keys.append(key)   
    print("With skip {}, there are {} keys and it {} contain codedValues.".format(
        skip, len(keys), "does" if "codedValues" in keys else "does not"))
    gribapi.grib_keys_iterator_delete(keys_itr)

    
summarize_keys(message_id, skip=False)
summarize_keys(message_id, skip=True)

With skip False, there are 243 keys and it does contain codedValues.
With skip True, there are 96 keys and it does not contain codedValues.
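
To see exactly which keys the skip removes, rather than just the counts, the same iterator calls can be wrapped to return the difference. This is only a sketch: `keys_dropped_by_skip` and its `api` parameter are hypothetical names, and the module is passed in as an argument (an assumption made so the helper can also be exercised with a stand-in object).

```python
def keys_dropped_by_skip(api, message_id):
    """Return the sorted key names that vanish once grib_skip_computed is applied.

    `api` is expected to be the gribapi module (or any object exposing the
    same grib_keys_iterator_* functions used in the snippet above).
    """

    def list_keys(skip):
        keys_itr = api.grib_keys_iterator_new(message_id)
        try:
            if skip:
                api.grib_skip_computed(keys_itr)
            names = []
            while api.grib_keys_iterator_next(keys_itr):
                names.append(api.grib_keys_iterator_get_name(keys_itr))
            return names
        finally:
            # Always free the iterator, even if a call above raises.
            api.grib_keys_iterator_delete(keys_itr)

    # Keys present without the skip but missing with it.
    return sorted(set(list_keys(False)) - set(list_keys(True)))
```

Calling `keys_dropped_by_skip(gribapi, message_id)` on the message above should list `codedValues` among the dropped keys, if the counts shown here hold.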

Arguably this is a bug in gribapi, but that seems to have been end-of-lifed a while ago and the trail disappears into the C code from my perspective.

I'll probably try just removing the grib_skip_computed call and see what happens, but given that it removes the majority of keys, that seems like quite a large surface for error. I'm hoping someone here knows a good fix.

Edit: This is also happening in another grib message where the data has a nonzero fill value of -50.

pp-mo commented

Thanks for this.
Unfortunately, we are just about to release a "v0.17" ...
... and I don't think we have time to investigate/fix this just yet so it won't make it in 😞
Hope that will be ok. We can always do another release soon (it isn't hard).

Meanwhile...

this is a bug in gribapi, but that seems to have been end-of-lifed a while ago

Actually iris-grib is using ecCodes, and that is what we test against here (CI).
They retained the name 'gribapi' as an optional alias for the package, and we are still using that.
(perhaps we should not).

Thanks for clarifying about the ecCodes, I'm new to grib and it's been kind of confusing. Updating the package name at some point would probably help people desperately googling for answers in the future.

From our perspective, having this data missing is acceptable: in most cases the variable being absent would mean the same thing as the zero array, I think, and the commonly used variables aren't affected. So we don't need an urgent fix. I do wonder, though, whether this was the real problem behind some of the issues reported in the past that gave the "'codedValues' not defined in section 7" error message, where the bug reporter just kind of wandered off mumbling that maybe the data was corrupt.

Are there any plans to address this in an upcoming release? We are seeing this issue in a number of files containing all-zero messages.

mpkay commented

Hi -

This issue is biting us as well. Am attaching a sample GRIB2 message that can be read "correctly" by ecCodes (and wgrib2, FWIW) to return an ndarray of correct size and populated with all zeros as expected. From what we've seen (but haven't confirmed), NCEP's g2clib has an optimization for constant fields that leaves the values section (7) empty.

Here's a simple example program to confirm the behavior:

import eccodes as ec

filename = "/tmp/empty_section7.grib2"

with open(filename, "rb") as fp:
    msg = ec.codes_grib_new_from_file(fp)
    ny = ec.codes_get(msg, 'Nj')
    nx = ec.codes_get(msg, 'Ni')
    data = ec.codes_get_values(msg).reshape((ny, nx))
    ec.codes_release(msg)
    print(type(data))
    print(data.shape)
    print(data.min(), data.max())

This yields:
<class 'numpy.ndarray'>
(721, 1440)
0.0 0.0

Here's a data file (zipped for upload) that demonstrates one of the messages in question.
empty_section7.zip
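
For context on why an empty section 7 can still decode to a full array: under GRIB2 simple packing, each value is recovered as Y = (R + X * 2^E) / 10^D, so a constant field needs zero bits per point and every value is simply R / 10^D. The sketch below illustrates that relation in plain Python. It assumes simple packing (whether these particular messages use that template has not been checked here), and `unpack_simple` with its parameter names is hypothetical, not an ecCodes or iris-grib function.

```python
def unpack_simple(reference_value, binary_scale, decimal_scale, packed_ints, n_points):
    """Decode GRIB2 simple packing: Y = (R + X * 2**E) / 10**D.

    An empty `packed_ints` models a constant field: section 7 carries no
    data, so every one of the n_points values equals R / 10**D.
    """
    if not packed_ints:
        packed_ints = [0] * n_points
    scale = 2.0 ** binary_scale
    divisor = 10.0 ** decimal_scale
    return [(reference_value + x * scale) / divisor for x in packed_ints]


# Constant fields: nothing packed, values come from the reference value alone.
all_zero = unpack_simple(0.0, 0, 0, [], 4)
fill_minus_50 = unpack_simple(-50.0, 0, 0, [], 3)
```

The second call mirrors the non-zero case mentioned earlier in the thread (a constant fill value of -50): the data section is still empty, only the reference value differs.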