ecmwf/cfgrib

ValueError: multiple values for unique key... on GRIB files with more than one value of a key per variable.

bbonenfant opened this issue · 24 comments

I am able to successfully load the test GRIB file suggested in the README. However, when I try to read a GRIB file such as the one below, I get the following error output.

> import cfgrib
> ds = cfgrib.Dataset.frompath('nam.t00z.awip1200.tm00.grib2')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/cfgrib/dataset.py", line 482, in frompath
return cls(stream=messages.Stream(path, mode=mode, message_class=CfMessage), **kwargs)
File "<attrs generated init baa5906ed7dcdc8b722f343b3fe827a76110eccb>", line 7, in __init__
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/cfgrib/dataset.py", line 485, in __attrs_post_init__
dims, vars, attrs = build_dataset_components(**self.__dict__)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/cfgrib/dataset.py", line 457, in build_dataset_components
var_index, encode_parameter, encode_time, encode_geography, encode_vertical,
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/cfgrib/dataset.py", line 372, in build_data_var_components
data_var_attrs = enforce_unique_attributes(index, data_var_attrs_keys)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/cfgrib/dataset.py", line 150, in enforce_unique_attributes
raise ValueError("multiple values for unique attribute %r: %r" % (key, values))
ValueError: multiple values for unique attribute 'typeOfLevel': ['hybrid', 'cloudBase', 'unknown', 'cloudTop']

I've tried with both grib1 and grib2 file types and it seems the formatting is incorrect for all the files I've tried. Any suggestions?

@bbonenfant we only support GRIB files with a single typeOfLevel for now. Can you filter all messages with a given typeOfLevel, e.g. cloudBase, and then save them to a new GRIB file? That should be simple enough for cfgrib to open.

To be more precise, we support only one typeOfLevel per data variable identified by paramId, so we don't support files that look like this description.
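To make the constraint concrete, here is a minimal sketch of how one could spot the offending variables. It does not read a GRIB file: the messages list is hypothetical metadata standing in for what ecCodes would report, and levels_per_param and conflicting_params are illustrative helper names, not part of cfgrib.

```python
from collections import defaultdict

# Hypothetical message metadata; a real file would supply these keys via ecCodes.
messages = [
    {"paramId": 54, "shortName": "pres", "typeOfLevel": "cloudBase"},
    {"paramId": 54, "shortName": "pres", "typeOfLevel": "cloudTop"},
    {"paramId": 130, "shortName": "t", "typeOfLevel": "isobaricInhPa"},
]


def levels_per_param(msgs):
    """Map each paramId to the sorted typeOfLevel values it appears with."""
    found = defaultdict(set)
    for msg in msgs:
        found[msg["paramId"]].add(msg["typeOfLevel"])
    return {param: sorted(levels) for param, levels in found.items()}


def conflicting_params(msgs):
    """Keep only the paramIds that occur with more than one typeOfLevel."""
    return {p: lv for p, lv in levels_per_param(msgs).items() if len(lv) > 1}


print(conflicting_params(messages))
```

Any paramId reported here would need to be split with a filter before the file can be opened as a single dataset.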

@bbonenfant I added a filter_by_keys keyword argument to open_dataset in the master branch, so you can do some basic filtering of the GRIB file before cfgrib attempts to build the CDM hypercubes. Working with complex files is now cumbersome, but doable.

Some examples:

>>> from cfgrib.xarray_store import open_dataset
>>> open_dataset('nam.t00z.awip1200.tm00.grib2',
...     filter_by_keys={'typeOfLevel': 'cloudBase'})
<xarray.Dataset>
Dimensions:     (x: 614, y: 428)
Coordinates:
    time        datetime64[ns] ...
    step        timedelta64[ns] ...
    cloudBase   int64 ...
    latitude    (y, x) float64 ...
    longitude   (y, x) float64 ...
    valid_time  datetime64[ns] ...
Dimensions without coordinates: x, y
Data variables:
    pres        (y, x) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP 
    GRIB_subCentre:          0
    history:                 GRIB to CDM+CF via cfgrib-0.8.2.dev0/ecCodes-2.7.0
>>> open_dataset('nam.t00z.awip1200.tm00.grib2',
...     filter_by_keys={'typeOfLevel': 'surface', 'stepType': 'instant'})
<xarray.Dataset>
Dimensions:     (x: 614, y: 428)
Coordinates:
    time        datetime64[ns] ...
    step        timedelta64[ns] ...
    surface     int64 ...
    latitude    (y, x) float64 ...
    longitude   (y, x) float64 ...
    valid_time  datetime64[ns] ...
Dimensions without coordinates: x, y
Data variables:
    vis         (y, x) float32 ...
    gust        (y, x) float32 ...
    hindex      (y, x) float32 ...
    sp          (y, x) float32 ...
    orog        (y, x) float32 ...
    t           (y, x) float32 ...
    unknown     (y, x) float32 ...
    sdwe        (y, x) float32 ...
    sde         (y, x) float32 ...
    prate       (y, x) float32 ...
    sr          (y, x) float32 ...
    veg         (y, x) float32 ...
    slt         (y, x) float32 ...
    lsm         (y, x) float32 ...
    ci          (y, x) float32 ...
    al          (y, x) float32 ...
    sst         (y, x) float32 ...
    shtfl       (y, x) float32 ...
    lhtfl       (y, x) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP 
    GRIB_subCentre:          0
    history:                 GRIB to CDM+CF via cfgrib-0.8.2.dev0/ecCodes-2.7.0
>>> open_dataset('nam.t00z.awip1200.tm00.grib2',
...     filter_by_keys={'typeOfLevel': 'isobaricInhPa', 'shortName': 'absv'})
<xarray.Dataset>
Dimensions:       (air_pressure: 5, x: 614, y: 428)
Coordinates:
    time          datetime64[ns] ...
    step          timedelta64[ns] ...
  * air_pressure  (air_pressure) float64 1e+03 850.0 700.0 500.0 250.0
    latitude      (y, x) float64 ...
    longitude     (y, x) float64 ...
    valid_time    datetime64[ns] ...
Dimensions without coordinates: x, y
Data variables:
    absv          (air_pressure, y, x) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP 
    GRIB_subCentre:          0
    history:                 GRIB to CDM+CF via cfgrib-0.8.2.dev0/ecCodes-2.7.0

Not all variables are accessible, yet.
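Conceptually, the filter_by_keys calls above keep only the messages whose keys all match the requested values. The sketch below illustrates that idea on plain dictionaries. It is an assumption about the semantics, not cfgrib's implementation, and the sample messages (including the 'tp' entry with stepType 'accum') are made up for illustration.

```python
def matches(message, filter_by_keys):
    """True when the message carries every requested key with the requested value."""
    return all(message.get(key) == value for key, value in filter_by_keys.items())


def filter_messages(messages, filter_by_keys):
    """Return the messages that satisfy all entries of filter_by_keys."""
    return [msg for msg in messages if matches(msg, filter_by_keys)]


# Hypothetical message metadata, not read from a real file.
messages = [
    {"shortName": "pres", "typeOfLevel": "cloudBase", "stepType": "instant"},
    {"shortName": "sp", "typeOfLevel": "surface", "stepType": "instant"},
    {"shortName": "tp", "typeOfLevel": "surface", "stepType": "accum"},
]

selected = filter_messages(
    messages, {"typeOfLevel": "surface", "stepType": "instant"}
)
print([msg["shortName"] for msg in selected])
```

Adding more keys to the dictionary can only narrow the selection, which is why a second key such as stepType or shortName is often enough to get a consistent hypercube.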

Other useful filter_by_keys combinations:

>>> open_dataset('nam.t00z.awip1200.tm00.grib2',
...     filter_by_keys={'typeOfLevel': 'heightAboveGround', 'topLevel': 2})
<xarray.Dataset>
Dimensions:            (x: 614, y: 428)
Coordinates:
    time               datetime64[ns] ...
    step               timedelta64[ns] ...
    heightAboveGround  int64 ...
    latitude           (y, x) float64 ...
    longitude          (y, x) float64 ...
    valid_time         datetime64[ns] ...
Dimensions without coordinates: x, y
Data variables:
    t2m                (y, x) float32 ...
    q                  (y, x) float32 ...
    d2m                (y, x) float32 ...
    r2                 (y, x) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP 
    GRIB_subCentre:          0
    history:                 GRIB to CDM+CF via cfgrib-0.8.2.dev0/ecCodes-2.7.0
>>> open_dataset('nam.t00z.awip1200.tm00.grib2',
...     filter_by_keys={'typeOfLevel': 'heightAboveGround', 'topLevel': 10})
<xarray.Dataset>
Dimensions:            (x: 614, y: 428)
Coordinates:
    time               datetime64[ns] ...
    step               timedelta64[ns] ...
    heightAboveGround  int64 ...
    latitude           (y, x) float64 ...
    longitude          (y, x) float64 ...
    valid_time         datetime64[ns] ...
Dimensions without coordinates: x, y
Data variables:
    u10                (y, x) float32 ...
    pt                 (y, x) float32 ...
    q                  (y, x) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP 
    GRIB_subCentre:          0
    history:                 GRIB to CDM+CF via cfgrib-0.8.2.dev0/ecCodes-2.7.0

Thank you for such a substantial response to my issue. I look forward to seeing how this project progresses.

Thanks, also bumped into this while trying to read a GFS grib2 file, e.g.

import cfgrib
ds = cfgrib.Dataset.frompath('gfs_4_20110807_0000_000.grb2')
# snipped traceback
~/anaconda3/lib/python3.6/site-packages/cfgrib/dataset.py in enforce_unique_attributes(index, attributes_keys)
    113         values = index[key]
    114         if len(values) > 1:
--> 115             raise ValueError("multiple values for unique attribute %r: %r" % (key, values))
    116         if values and values[0] not in ('undef', 'unknown'):
    117             attributes['GRIB_' + key] = values[0]

ValueError: multiple values for unique attribute 'typeOfLevel': ['isobaricInhPa', 'tropopause', 'maxWind', 'isothermZero', 'unknown', 'potentialVorticity']

The workaround seems to work, but hits another snag for this particular data example, i.e.

ds = cfgrib.Dataset.frompath('gfs_4_20110807_0000_000.grb2', filter_by_keys={'typeOfLevel': 'isobaricInhPa'})
# snipped traceback
~/anaconda3/lib/python3.6/site-packages/cfgrib/dataset.py in build_dataset_components(stream, encode_parameter, encode_time, encode_vertical, encode_geography, filter_by_keys)
    374         vars = collections.OrderedDict([(short_name, data_var)])
    375         vars.update(coord_vars)
--> 376         dict_merge(dimensions, dims)
    377         dict_merge(variables, vars)
    378     attributes = enforce_unique_attributes(index, GLOBAL_ATTRIBUTES_KEYS)

~/anaconda3/lib/python3.6/site-packages/cfgrib/dataset.py in dict_merge(master, update)
    353         else:
    354             raise ValueError("key present and new value is different: "
--> 355                              "key=%r value=%r new_value=%r" % (key, master[key], value))
    356 
    357 

ValueError: key present and new value is different: key='air_pressure' value=26 new_value=25

It's not easy to figure out if this is cfgrib or the data is not conforming.

@darrenleeweber from your report I think the GRIB file has two variables with a pressure dimension but on different sets of pressure levels, and we don't support that at the moment. This is a different issue from this one and is better tracked separately, see #13.
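The conflict can be reproduced without any GRIB file. The sketch below is a reconstruction of dict_merge based only on the traceback fragments quoted above, so it may differ from the actual cfgrib source; the dimension sizes 26 and 25 are the ones from the reported error.

```python
def dict_merge(master, update):
    """Merge update into master in place, refusing to overwrite a different value."""
    for key, value in update.items():
        if key not in master:
            master[key] = value
        elif master[key] != value:
            raise ValueError(
                "key present and new value is different: "
                "key=%r value=%r new_value=%r" % (key, master[key], value)
            )


dimensions = {}
# First variable: 26 pressure levels.
dict_merge(dimensions, {"air_pressure": 26})

conflict = None
try:
    # Second variable: same dimension name, but only 25 levels.
    dict_merge(dimensions, {"air_pressure": 25})
except ValueError as exc:
    conflict = str(exc)

print(conflict)
```

Because every variable in a dataset shares one air_pressure dimension, two variables defined on different numbers of levels cannot both be merged into it.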

In master we now have the experimental cfgrib.open_datasets entry point, which returns a list of xr.Dataset objects built by selecting appropriate filter_by_keys values using a simple heuristic. All the examples of complex GRIB files that I have work and return all the variables, except for variables that trigger #13, which are skipped.

For example:

>>> cfgrib.open_datasets('nam.t00z.awp21100.tm00.grib2')

/src/cfgrib/xarray_store.py:177: FutureWarning: open_datasets is experimental. It may be removed.
  warnings.warn("open_datasets is experimental. It may be removed.", FutureWarning)
skipping variable with paramId==3041 shortName='absv'
Traceback (most recent call last):
  File "/src/cfgrib/dataset.py", line 385, in build_dataset_components
    dict_merge(dimensions, dims)
  File "/src/cfgrib/dataset.py", line 362, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value))
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='air_pressure' value=19 new_value=5

[<xarray.Dataset>
 Dimensions:       (air_pressure: 19, x: 93, y: 65)
 Coordinates:
     time          datetime64[ns] ...
     step          timedelta64[ns] ...
   * air_pressure  (air_pressure) float64 1e+03 950.0 900.0 ... 200.0 150.0 100.0
     latitude      (y, x) float64 ...
     longitude     (y, x) float64 ...
     valid_time    datetime64[ns] ...
 Dimensions without coordinates: x, y
 Data variables:
     gh            (air_pressure, y, x) float32 ...
     t             (air_pressure, y, x) float32 ...
     r             (air_pressure, y, x) float32 ...
     w             (air_pressure, y, x) float32 ...
     u             (air_pressure, y, x) float32 ...
 Attributes:
     GRIB_edition:            2
     GRIB_centre:             kwbc
     GRIB_centreDescription:  US National Weather Service - NCEP 
     GRIB_subCentre:          0
     history:                 GRIB to CDM+CF via cfgrib-0.8.5.2.dev0/ecCodes-2...,
 <xarray.Dataset>
 Dimensions:     (x: 93, y: 65)
 Coordinates:
     time        datetime64[ns] ...
     step        timedelta64[ns] ...
     cloudBase   int64 ...
     latitude    (y, x) float64 ...
     longitude   (y, x) float64 ...
     valid_time  datetime64[ns] ...
 Dimensions without coordinates: x, y
 Data variables:
     pres        (y, x) float32 ...
     gh          (y, x) float32 ...
 Attributes:
     GRIB_edition:            2
     GRIB_centre:             kwbc
     GRIB_centreDescription:  US National Weather Service - NCEP 
     GRIB_subCentre:          0
     history:                 GRIB to CDM+CF via cfgrib-0.8.5.2.dev0/ecCodes-2...,
 <xarray.Dataset>
 Dimensions:     (x: 93, y: 65)
 Coordinates:
     time        datetime64[ns] ...
     step        timedelta64[ns] ...
     cloudTop    int64 ...
     latitude    (y, x) float64 ...
     longitude   (y, x) float64 ...
     valid_time  datetime64[ns] ...
 Dimensions without coordinates: x, y
 Data variables:
     pres        (y, x) float32 ...
     gh          (y, x) float32 ...
     t           (y, x) float32 ...
 Attributes:
     GRIB_edition:            2
     GRIB_centre:             kwbc
     GRIB_centreDescription:  US National Weather Service - NCEP 
     GRIB_subCentre:          0
     history:                 GRIB to CDM+CF via cfgrib-0.8.5.2.dev0/ecCodes-2...,
 <xarray.Dataset>
 Dimensions:     (x: 93, y: 65)
 Coordinates:
     time        datetime64[ns] ...
     step        timedelta64[ns] ...
     maxWind     int64 ...
     latitude    (y, x) float64 ...
     longitude   (y, x) float64 ...
     valid_time  datetime64[ns] ...
 Dimensions without coordinates: x, y
 Data variables:
     pres        (y, x) float32 ...
     gh          (y, x) float32 ...
     u           (y, x) float32 ...
 Attributes:
     GRIB_edition:            2
     GRIB_centre:             kwbc
     GRIB_centreDescription:  US National Weather Service - NCEP 
     GRIB_subCentre:          0
     history:                 GRIB to CDM+CF via cfgrib-0.8.5.2.dev0/ecCodes-2...,
 <xarray.Dataset>
 Dimensions:       (x: 93, y: 65)
 Coordinates:
     time          datetime64[ns] ...
     step          timedelta64[ns] ...
     isothermZero  int64 ...
     latitude      (y, x) float64 ...
     longitude     (y, x) float64 ...
     valid_time    datetime64[ns] ...
 Dimensions without coordinates: x, y
 Data variables:
     gh            (y, x) float32 ...
     r             (y, x) float32 ...
 Attributes:
     GRIB_edition:            2
     GRIB_centre:             kwbc
     GRIB_centreDescription:  US National Weather Service - NCEP 
     GRIB_subCentre:          0
     history:                 GRIB to CDM+CF via cfgrib-0.8.5.2.dev0/ecCodes-2...]

As soon as I find the time to sync the Advanced usage section of the README I'll publish release 0.8.5.1 with the update.
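Since the result is a list, a small helper can locate the datasets that hold a given variable. This is only a sketch: datasets_with_variable is a hypothetical name, and the stand-in objects below mimic just the data_vars attribute of an xarray.Dataset so the example runs without cfgrib or a GRIB file.

```python
from types import SimpleNamespace


def datasets_with_variable(datasets, name):
    """Return the positions of the datasets whose data_vars contain name."""
    return [i for i, ds in enumerate(datasets) if name in ds.data_vars]


# Stand-ins for the first three datasets in the output above; only the
# variable names are reproduced, not the data.
fake_datasets = [
    SimpleNamespace(data_vars={"gh": None, "t": None, "r": None, "w": None, "u": None}),
    SimpleNamespace(data_vars={"pres": None, "gh": None}),
    SimpleNamespace(data_vars={"pres": None, "gh": None, "t": None}),
]

print(datasets_with_variable(fake_datasets, "t"))
```

With real output, the same function should work on the list returned by cfgrib.open_datasets, because membership tests on an xarray.Dataset's data_vars check variable names.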

Just a quick question about your approach...why do you return a list of xarray datasets? Why not combine them into a single dataset?

@rabernat an xarray.Dataset cannot represent a GRIB file that contains more than one hypercube of the same variable, in the example above look at t that has both air_pressure and cloudTop as vertical coordinates.

Note that a NetCDF Dataset can represent such a GRIB file according to the NetCDF Data Model, as long as you place different hypercubes in different Groups. The fact that an xarray.Dataset really maps to one of the NetCDF Groups inside a NetCDF Dataset (a file) is probably a bit confusing, but that is how it is. Note how the group argument of xarray.open_dataset works.

I prefer not to rename variables arbitrarily (t, t1, ...). If you keep the same variable name for all hypercubes and try to xarray.merge the datasets, you get:
MergeError: conflicting values for variable 't' on objects to be combined

Understood. So the different items in this list correspond to datasets with different vertical sampling schemes?

Not just that. You may have a GRIB file containing a variable with one message with gridType equal to regular_ll and the following message with regular_gg; you will get two datasets in this case as well.

Basically we have a list of GRIB keys that are required to be identical on all messages of a hypercube.
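As a rough illustration of that rule, messages can be bucketed by the tuple of values they carry for those keys, with each bucket becoming a candidate hypercube. The key list below is an assumed, illustrative subset (only typeOfLevel and gridType are named in this thread), and the messages are hypothetical.

```python
from collections import defaultdict

# Assumed subset of the keys that must be identical within one hypercube.
HYPERCUBE_KEYS = ("paramId", "typeOfLevel", "gridType")


def split_into_hypercubes(messages, keys=HYPERCUBE_KEYS):
    """Group messages by the values of the keys that must not vary."""
    groups = defaultdict(list)
    for msg in messages:
        groups[tuple(msg.get(key) for key in keys)].append(msg)
    return dict(groups)


# Hypothetical metadata: the same variable on two different grid types.
messages = [
    {"paramId": 130, "typeOfLevel": "isobaricInhPa", "gridType": "regular_ll", "level": 850},
    {"paramId": 130, "typeOfLevel": "isobaricInhPa", "gridType": "regular_ll", "level": 500},
    {"paramId": 130, "typeOfLevel": "isobaricInhPa", "gridType": "regular_gg", "level": 850},
]

cubes = split_into_hypercubes(messages)
print({key: len(msgs) for key, msgs in cubes.items()})
```

Keys that are allowed to vary, such as level here, become coordinates inside a group instead of splitting it.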

Would it make sense to return something more like a dict of Dataset objects? It's hard to predictably program with lists where the order of entries is arbitrary.

I've tried with GRIB2 files for MSG CloudMask products (which you can download from the EUMETSAT DataCentre at the following link https://www.eumetsat.int/website/home/Data/DataDelivery/EUMETSATDataCentre/index.html), but I got an error because there is no attribute "step".

@shoyer I see your point, and that would be consistent with the idea that the different xr.Dataset objects correspond to different "NetCDF Groups" in the GRIB file. But the heuristic I use to define the xr.Dataset objects is a bit arbitrary: a given file is always opened the same way, but you may end up with different datasets if you change the order of the messages in the GRIB file.

At the moment the unique identifier of an xr.Dataset is the filter_by_keys dictionary, which can contain several keys and would make for a very long group name. I'll try to come up with a proposal.
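One possible shape for such an identifier, purely as a sketch and not a proposal that cfgrib adopted: serialise the filter dictionary with sorted keys so that the same filter always yields the same label. group_name and as_named_datasets are hypothetical helpers.

```python
def group_name(filter_by_keys):
    """Build a deterministic label from a filter_by_keys dictionary."""
    return ",".join(
        "%s=%s" % (key, filter_by_keys[key]) for key in sorted(filter_by_keys)
    )


def as_named_datasets(pairs):
    """Turn (filter_by_keys, dataset) pairs into a dict keyed by group_name."""
    return {group_name(filters): dataset for filters, dataset in pairs}


label = group_name({"typeOfLevel": "surface", "stepType": "instant"})
print(label)
```

Sorting the keys makes the label independent of dictionary insertion order, but it does nothing about the underlying concern that the grouping itself depends on message order in the file.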

@mdbmdb74 woops! Those GRIB files crash cfgrib in several ways!

It should not be too hard to fix, though. We just need to relax the assumptions about which coordinates need to be present. Apparently both forecast_period/step and the vertical coordinate need to be made optional.

@mdbmdb74 current master treats vertical and forecast_period coordinates as optional and can open the EUMETSAT GRIB2 files:

>>> ds = cfgrib.open_dataset('MSG4-SEVI-MSGCLMK-0100-0100-20180930100000.000000000Z-20180930101421-1296606.grb')
No latitudes/longitudes provided by ecCodes for gridType = 'space_view'
>>> ds
<xarray.Dataset>
Dimensions:     (i: 13778944)
Coordinates:
    time        datetime64[ns] ...
    valid_time  datetime64[ns] ...
Dimensions without coordinates: i
Data variables:
    p260537     (i) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             eums
    GRIB_centreDescription:  EUMETSAT Operation Centre
    GRIB_subCentre:          0
    history:                 GRIB to CDM+CF via cfgrib-0.8.5.2.dev0/ecCodes-2...
>>> ds['p260537']
<xarray.DataArray 'p260537' (i: 13778944)>
[13778944 values with dtype=float32]
Coordinates:
    time        datetime64[ns] ...
    valid_time  datetime64[ns] ...
Dimensions without coordinates: i
Attributes:
    GRIB_paramId:                    260537
    GRIB_shortName:                  ~
    GRIB_units:                      Code table 4.217
    GRIB_name:                       Cloud mask
    GRIB_cfVarName:                  p260537
    GRIB_dataType:                   sa
    GRIB_missingValue:               9999
    GRIB_numberOfPoints:             13778944
    GRIB_NV:                         0
    GRIB_stepType:                   instant
    GRIB_gridType:                   space_view
    GRIB_gridDefinitionDescription:  Space view perspective orthographic
    long_name:                       Cloud mask
    units:                           Code table 4.217

Unfortunately ecCodes does not seem to support the space_view gridType (so we cannot represent the values on a 2D grid) nor the cloud mask parameter (units == Code table 4.217?!)
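As a side note on the flat i dimension: its size, 13778944, is a perfect square, which the short check below confirms. Whether the field is really stored as a square image, and in which scan order, is an assumption that would have to be verified against the grid definition keys of the message before reshaping anything.

```python
import math


def square_side(n_points):
    """Return the side length if n_points is a perfect square, else None."""
    side = math.isqrt(n_points)
    return side if side * side == n_points else None


side = square_side(13778944)
print(side)
```

If the assumption holds, the 1-D values could be reshaped to (side, side) for plotting, though still without latitude and longitude coordinates.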

I'm closing this issue with the release of version 0.9.0.

Sorry to reopen this issue -- feel free to move this somewhere else, but I think I may have found a bug or at least something I do not understand in the implementation of the filter_by_keys argument.

Here is some output I receive when trying to open one of those NAM grib files:

>>> xr.open_dataset('nam.t06z.awip3d00.tm00.grib2',
                    engine='cfgrib',
                    backend_kwargs={
                        'filter_by_keys': {'typeOfLevel': 'isobaricInhPa'},
                        'errors': 'ignore'
                    })

skipping variable: paramId==3041 shortName='absv'
Traceback (most recent call last):
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 429, in build_dataset_components
    dict_merge(dimensions, dims)
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 407, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value))
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='isobaricInhPa' value=39 new_value=7
skipping variable: paramId==1 shortName='strf'
Traceback (most recent call last):
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 430, in build_dataset_components
    dict_merge(variables, vars)
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 407, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value))
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='isobaricInhPa' value=Variable(dimensions=('isobaricInhPa',), data=array([1000,  975,  950,  925,  900,  875,  850,  825,  800,  775,  750,
        725,  700,  675,  650,  625,  600,  575,  550,  525,  500,  475,
        450,  425,  400,  375,  350,  325,  300,  275,  250,  225,  200,
        175,  150,  125,  100,   75,   50])) new_value=Variable(dimensions=(), data=250)
skipping variable: paramId==3017 shortName='dpt'
Traceback (most recent call last):
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 429, in build_dataset_components
    dict_merge(dimensions, dims)
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 407, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value))
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='isobaricInhPa' value=39 new_value=6
skipping variable: paramId==260022 shortName='mconv'
Traceback (most recent call last):
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 429, in build_dataset_components
    dict_merge(dimensions, dims)
  File "/Users/bbonenfant/PycharmProjects/wrf2wxl/venv/lib/python3.6/site-packages/cfgrib/dataset.py", line 407, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value))
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='isobaricInhPa' value=39 new_value=2

<xarray.Dataset>
Dimensions:        (isobaricInhPa: 39, x: 185, y: 129)
Coordinates:
    time           datetime64[ns] ...
    step           timedelta64[ns] ...
  * isobaricInhPa  (isobaricInhPa) int64 1000 975 950 925 900 ... 125 100 75 50
    latitude       (y, x) float64 ...
    longitude      (y, x) float64 ...
    valid_time     datetime64[ns] ...
Dimensions without coordinates: x, y
Data variables:
    gh             (isobaricInhPa, y, x) float32 ...
    t              (isobaricInhPa, y, x) float32 ...
    r              (isobaricInhPa, y, x) float32 ...
    q              (isobaricInhPa, y, x) float32 ...
    w              (isobaricInhPa, y, x) float32 ...
    u              (isobaricInhPa, y, x) float32 ...
    tke            (isobaricInhPa, y, x) float32 ...
    clwmr          (isobaricInhPa, y, x) float32 ...
    cice           (isobaricInhPa, y, x) float32 ...
    snmr           (isobaricInhPa, y, x) float32 ...
    strf           (y, x) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP 
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             US National Weather Service - NCEP 
    history:                 GRIB to CDM+CF via cfgrib-0.9.5.1/ecCodes-2.8.0 ...

You can see that it successfully returns a Dataset, but among the variables it returns there is the U component of the wind and not the V component. I'm not sure why this is the case, since I've inspected the GRIB file and found nothing apparently wrong with the v-winds. I've also tried this on other NAM GRIB files with similar results (this was the case even in your comment above from Sept. 30).

I am unsure if this is an error on my part or if there is a way around this.
Thank you.

@bbonenfant thanks for your help! I opened a new issue with your comment and will leave this one closed, as I consider the general problem solved by filter_by_keys.

@darrenleeweber I have started learning the GRIB data format and ran into the same issue. Use it like this:

import os

import xarray as xr

# Directory that contains the downloaded GRIB file.
path = "gfsanl_4_2019101000.g2"
os.chdir(path)

# Open only the messages defined on hybrid levels.
ds = xr.open_dataset(
    'gfs_4_20191010_0000_000.grb2',
    engine='cfgrib',
    backend_kwargs=dict(filter_by_keys={'typeOfLevel': 'hybrid'}),
)
print(ds)

The v component of GRIB files that use the MULTI-FIELD feature is read correctly only starting with version 0.9.8.2, see: https://github.com/ecmwf/cfgrib/blob/master/CHANGELOG.rst#0982-2020-05-22
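A quick way to check whether an installed version is recent enough is to compare version tuples. The sketch below works on version strings only; feeding it cfgrib.__version__ is an assumption about how you would use it, and reads_multi_field is a hypothetical helper name.

```python
# First release reported above to read the MULTI-FIELD v component correctly.
MULTI_FIELD_FIX = (0, 9, 8, 2)


def version_tuple(version):
    """Turn '0.9.8.5' or '0.8.5.2.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev0'
        parts.append(int(piece))
    return tuple(parts)


def reads_multi_field(version):
    """True when the version is at least 0.9.8.2."""
    return version_tuple(version) >= MULTI_FIELD_FIX


print(reads_multi_field("0.9.5.1"), reads_multi_field("0.9.8.5"))
```

Version 0.9.5.1 is the one shown in the history attribute of the report with the missing V component, which is consistent with the changelog entry.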

Sorry to bring this up, but I think this is a usage question about this functionality. I was trying this today to get u10 from GFS via FTP. If you would like me to ask this somewhere else, please let me know.

(xr 0.16.2 and cfgrib 0.9.8.5)

$ wget ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.20210212/00/gfs.t00z.pgrb2.0p25.f000
import xarray as xr
ds = xr.open_mfdataset(
    "gfs.t00z.pgrb2.0p25.f000",
    engine="cfgrib",
    backend_kwargs=dict(filter_by_keys={"typeOfLevel": "heightAboveGround"}),
)

output is

skipping variable: paramId==165 shortName='u10'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=(), data=10)
skipping variable: paramId==166 shortName='v10'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=(), data=10)
skipping variable: paramId==131 shortName='u'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=('heightAboveGround',), data=array([20, 30, 40, 50, 80]))
skipping variable: paramId==132 shortName='v'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=('heightAboveGround',), data=array([20, 30, 40, 50, 80]))
skipping variable: paramId==130 shortName='t'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=('heightAboveGround',), data=array([ 80, 100]))
skipping variable: paramId==133 shortName='q'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=(), data=80)
skipping variable: paramId==54 shortName='pres'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=(), data=80)
skipping variable: paramId==228246 shortName='u100'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=(), data=100)
skipping variable: paramId==228247 shortName='v100'
Traceback (most recent call last):
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 602, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/Users/ray.bell/miniconda/envs/test_env/lib/python3.8/site-packages/cfgrib/dataset.py", line 536, in dict_merge
    raise DatasetBuildError(
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=(), data=2) new_value=Variable(dimensions=(), data=100)

and see

>>> ds
<xarray.Dataset>
Dimensions:            (latitude: 721, longitude: 1440)
Coordinates:
    time               datetime64[ns] ...
    step               timedelta64[ns] ...
    heightAboveGround  int64 ...
  * latitude           (latitude) float64 90.0 89.75 89.5 ... -89.5 -89.75 -90.0
  * longitude          (longitude) float64 0.0 0.25 0.5 ... 359.2 359.5 359.8
    valid_time         datetime64[ns] ...
Data variables:
    t2m                (latitude, longitude) float32 dask.array<chunksize=(721, 1440), meta=np.ndarray>
    sh2                (latitude, longitude) float32 dask.array<chunksize=(721, 1440), meta=np.ndarray>
    d2m                (latitude, longitude) float32 dask.array<chunksize=(721, 1440), meta=np.ndarray>
    r2                 (latitude, longitude) float32 dask.array<chunksize=(721, 1440), meta=np.ndarray>
    aptmp              (latitude, longitude) float32 dask.array<chunksize=(721, 1440), meta=np.ndarray>
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP 
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             US National Weather Service - NCEP 
    history:                 2021-02-12T14:02:09 GRIB to CDM+CF via cfgrib-0....

Great! Worked for me.
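For readers hitting the same skipped 10 m winds: one option, in line with the topLevel examples earlier in this thread, is to narrow the filter to a single height so that only one heightAboveGround value remains. This is a sketch under that assumption; topLevel was shown above on a NAM file, and its behaviour on this GFS file is not confirmed here. Both helper names are hypothetical.

```python
def height_level_kwargs(height_m):
    """Build backend_kwargs selecting one heightAboveGround level."""
    return {
        "filter_by_keys": {
            "typeOfLevel": "heightAboveGround",
            "topLevel": height_m,  # key taken from the earlier examples
        }
    }


def open_height_level(path, height_m):
    """Open a GRIB file restricted to a single height above ground."""
    # Imported here so height_level_kwargs stays usable without xarray installed.
    import xarray as xr

    return xr.open_dataset(
        path, engine="cfgrib", backend_kwargs=height_level_kwargs(height_m)
    )


print(height_level_kwargs(10))
```

Calling open_height_level("gfs.t00z.pgrb2.0p25.f000", 10) would then request only the 10 m messages, and a second call with 2 would request the 2 m ones.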