Unable to merge multiple GRIB files when a variable name is specified
meteoDaniel opened this issue · 0 comments
meteoDaniel commented
What happened?
I am opening a list of GRIB files (AROME Météo-France SP2 GRIB packages). When I specify a shortName or the name of the variable in filter_by_keys, I receive an xarray.MergeError. But when I open multiple variables by specifying only {'stepType': 'instant'}, everything works fine.
This behaviour is puzzling, and I do not know how to debug it.
What are the steps to reproduce the bug?
In [2]: self.files_per_grib_package[grib_package_to_use]
Out[2]:
[PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_0.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_1.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_2.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_3.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_4.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_5.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_6.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_7.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_8.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_9.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_10.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_11.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_12.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_13.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_14.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_15.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_16.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_17.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_18.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_19.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_20.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_21.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_22.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_23.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_24.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_25.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_26.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_27.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_28.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_29.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_30.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_31.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_32.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_33.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_34.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_35.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_36.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_37.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_38.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_39.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_40.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_41.grib2'),
PosixPath('/app/data/arome_meteo_france/20240612_00/arome_meteo_france_20240612_00_SP2_42.grib2')]
In [3]: data = xarray.open_mfdataset(
   ...:     self.files_per_grib_package[grib_package_to_use],
   ...:     engine="cfgrib",
   ...:     parallel=True,
   ...:     concat_dim="step",
   ...:     combine="nested",
   ...:     backend_kwargs={
   ...:         "indexpath": "",
   ...:         "errors": "ignore",
   ...:         "filter_by_keys": {'shortName': 'lcc', 'stepType': 'instant'}
   ...:         # "filter_by_keys": FILTER_ARGUMENT[variable],
   ...:     },
   ...: )
---------------------------------------------------------------------------
MergeError Traceback (most recent call last)
Cell In[3], line 1
----> 1 data = xarray.open_mfdataset(
2 self.files_per_grib_package[grib_package_to_use],
3 engine="cfgrib",
4 parallel=True,
5 concat_dim="step",
6 combine="nested",
7 backend_kwargs={
8 "indexpath": "",
9 "errors": "ignore",
10 "filter_by_keys": {'shortName': 'lcc', 'stepType': 'instant'}
11 # "filter_by_keys": FILTER_ARGUMENT[variable],
12 },
13 )
File /usr/local/lib/python3.10/site-packages/xarray/backends/api.py:1071, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
1067 try:
1068 if combine == "nested":
1069 # Combined nested list by successive concat and merge operations
1070 # along each dimension, using structure given by "ids"
-> 1071 combined = _nested_combine(
1072 datasets,
1073 concat_dims=concat_dim,
1074 compat=compat,
1075 data_vars=data_vars,
1076 coords=coords,
1077 ids=ids,
1078 join=join,
1079 combine_attrs=combine_attrs,
1080 )
1081 elif combine == "by_coords":
1082 # Redo ordering from coordinates, ignoring how they were ordered
1083 # previously
1084 combined = combine_by_coords(
1085 datasets,
1086 compat=compat,
(...)
1090 combine_attrs=combine_attrs,
1091 )
File /usr/local/lib/python3.10/site-packages/xarray/core/combine.py:356, in _nested_combine(datasets, concat_dims, compat, data_vars, coords, ids, fill_value, join, combine_attrs)
353 _check_shape_tile_ids(combined_ids)
355 # Apply series of concatenate or merge operations along each dimension
--> 356 combined = _combine_nd(
357 combined_ids,
358 concat_dims,
359 compat=compat,
360 data_vars=data_vars,
361 coords=coords,
362 fill_value=fill_value,
363 join=join,
364 combine_attrs=combine_attrs,
365 )
366 return combined
File /usr/local/lib/python3.10/site-packages/xarray/core/combine.py:232, in _combine_nd(combined_ids, concat_dims, data_vars, coords, compat, fill_value, join, combine_attrs)
228 # Each iteration of this loop reduces the length of the tile_ids tuples
229 # by one. It always combines along the first dimension, removing the first
230 # element of the tuple
231 for concat_dim in concat_dims:
--> 232 combined_ids = _combine_all_along_first_dim(
233 combined_ids,
234 dim=concat_dim,
235 data_vars=data_vars,
236 coords=coords,
237 compat=compat,
238 fill_value=fill_value,
239 join=join,
240 combine_attrs=combine_attrs,
241 )
242 (combined_ds,) = combined_ids.values()
243 return combined_ds
File /usr/local/lib/python3.10/site-packages/xarray/core/combine.py:267, in _combine_all_along_first_dim(combined_ids, dim, data_vars, coords, compat, fill_value, join, combine_attrs)
265 combined_ids = dict(sorted(group))
266 datasets = combined_ids.values()
--> 267 new_combined_ids[new_id] = _combine_1d(
268 datasets, dim, compat, data_vars, coords, fill_value, join, combine_attrs
269 )
270 return new_combined_ids
File /usr/local/lib/python3.10/site-packages/xarray/core/combine.py:290, in _combine_1d(datasets, concat_dim, compat, data_vars, coords, fill_value, join, combine_attrs)
288 if concat_dim is not None:
289 try:
--> 290 combined = concat(
291 datasets,
292 dim=concat_dim,
293 data_vars=data_vars,
294 coords=coords,
295 compat=compat,
296 fill_value=fill_value,
297 join=join,
298 combine_attrs=combine_attrs,
299 )
300 except ValueError as err:
301 if "encountered unexpected variable" in str(err):
File /usr/local/lib/python3.10/site-packages/xarray/core/concat.py:250, in concat(objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs)
238 return _dataarray_concat(
239 objs,
240 dim=dim,
(...)
247 combine_attrs=combine_attrs,
248 )
249 elif isinstance(first_obj, Dataset):
--> 250 return _dataset_concat(
251 objs,
252 dim=dim,
253 data_vars=data_vars,
254 coords=coords,
255 compat=compat,
256 positions=positions,
257 fill_value=fill_value,
258 join=join,
259 combine_attrs=combine_attrs,
260 )
261 else:
262 raise TypeError(
263 "can only concatenate xarray Dataset and DataArray "
264 f"objects, got {type(first_obj)}"
265 )
File /usr/local/lib/python3.10/site-packages/xarray/core/concat.py:524, in _dataset_concat(datasets, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs)
518 if variables_to_merge:
519 grouped = {
520 k: v
521 for k, v in collect_variables_and_indexes(datasets).items()
522 if k in variables_to_merge
523 }
--> 524 merged_vars, merged_indexes = merge_collected(
525 grouped, compat=compat, equals=equals
526 )
527 result_vars.update(merged_vars)
528 result_indexes.update(merged_indexes)
File /usr/local/lib/python3.10/site-packages/xarray/core/merge.py:290, in merge_collected(grouped, prioritized, compat, combine_attrs, equals)
288 variables = [variable for variable, _ in elements_list]
289 try:
--> 290 merged_vars[name] = unique_variable(
291 name, variables, compat, equals.get(name, None)
292 )
293 except MergeError:
294 if compat != "minimal":
295 # we need more than "minimal" compatibility (for which
296 # we drop conflicting coordinates)
File /usr/local/lib/python3.10/site-packages/xarray/core/merge.py:144, in unique_variable(name, variables, compat, equals)
141 break
143 if not equals:
--> 144 raise MergeError(
145 f"conflicting values for variable {name!r} on objects to be combined. "
146 "You can skip this check by specifying compat='override'."
147 )
149 if combine_method:
150 for var in variables[1:]:
MergeError: conflicting values for variable 'valid_time' on objects to be combined. You can skip this check by specifying compat='override'.
In [4]: data = xarray.open_mfdataset(
   ...:     self.files_per_grib_package[grib_package_to_use],
   ...:     engine="cfgrib",
   ...:     parallel=True,
   ...:     concat_dim="step",
   ...:     combine="nested",
   ...:     backend_kwargs={
   ...:         "indexpath": "",
   ...:         "errors": "ignore",
   ...:         "filter_by_keys": {'stepType': 'instant'}
   ...:         # "filter_by_keys": FILTER_ARGUMENT[variable],
   ...:     },
   ...: )
In [5]: data
Out[5]:
<xarray.Dataset> Size: 5GB
Dimensions: (step: 43, latitude: 1791, longitude: 2801)
Coordinates:
time datetime64[ns] 8B 2024-06-12
* step (step) timedelta64[ns] 344B 00:00:00 ... 1 days 18:00:00
surface float64 8B 0.0
* latitude (latitude) float64 14kB 55.4 55.39 55.38 ... 37.52 37.51 37.5
* longitude (longitude) float64 22kB -12.0 -11.99 -11.98 ... 15.99 16.0
valid_time (step) datetime64[ns] 344B 2024-06-12 ... 2024-06-13T18:00:00
level float64 8B 0.0
Data variables:
sp (step, latitude, longitude) float32 863MB dask.array<chunksize=(1, 1791, 2801), meta=np.ndarray>
CAPE_INS (step, latitude, longitude) float32 863MB dask.array<chunksize=(1, 1791, 2801), meta=np.ndarray>
lcc (step, latitude, longitude) float32 863MB dask.array<chunksize=(2, 1791, 2801), meta=np.ndarray>
hcc (step, latitude, longitude) float32 863MB dask.array<chunksize=(2, 1791, 2801), meta=np.ndarray>
mcc (step, latitude, longitude) float32 863MB dask.array<chunksize=(2, 1791, 2801), meta=np.ndarray>
unknown (step, latitude, longitude) float32 863MB dask.array<chunksize=(2, 1791, 2801), meta=np.ndarray>
Attributes:
GRIB_edition: 2
GRIB_centre: lfpw
GRIB_centreDescription: French Weather Service - Toulouse
GRIB_subCentre: 0
Conventions: CF-1.7
institution: French Weather Service - Toulouse
history: 2024-06-12T05:53 GRIB to CDM+CF via cfgrib-0.9.1...
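One possible way to narrow this down is to open each file on its own with the same filter and compare what comes back, since the MergeError above only says that 'valid_time' conflicts, not which files disagree. The following is a minimal sketch, not a confirmed diagnosis: the names FILTER, open_single, coord_dims and summarize are illustrative and not part of xarray or cfgrib, and it assumes that a file yielding no data variables, or a different shape for 'step' or 'valid_time', is what makes the combine step fail.

```python
from pathlib import Path

# Same filter as in the failing open_mfdataset call.
FILTER = {"shortName": "lcc", "stepType": "instant"}


def open_single(path):
    """Open one GRIB file with the backend options used in the failing call."""
    # Imported lazily so summarize() can be exercised without cfgrib installed.
    import xarray

    return xarray.open_dataset(
        path,
        engine="cfgrib",
        backend_kwargs={
            "indexpath": "",
            "errors": "ignore",
            "filter_by_keys": FILTER,
        },
    )


def coord_dims(ds, name):
    """Return the dims of a coordinate: () if scalar, None if absent."""
    return ds[name].dims if name in ds.coords else None


def summarize(paths, opener=open_single):
    """Return one row per file describing what the filter selected."""
    rows = []
    for path in paths:
        ds = opener(path)
        rows.append(
            {
                "file": Path(path).name,
                # An empty list means the filter matched nothing in this file.
                "data_vars": sorted(ds.data_vars),
                "step_dims": coord_dims(ds, "step"),
                "valid_time_dims": coord_dims(ds, "valid_time"),
            }
        )
    return rows
```

Printing the rows returned by summarize(self.files_per_grib_package[grib_package_to_use]) should show whether every file yields the same structure. Any file whose row differs from the others (no data variables, a missing 'valid_time', or 'step' appearing as a dimension instead of a scalar) would be a candidate for closer inspection.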
Version
0.9.12.0
Platform (OS and architecture)
python3.10-slim Docker image
Relevant log output
No response
Accompanying data
https://mf-models-on-aws.org/#arome-france-hd/v1/2024-06-12/00/SP2/
Organisation
No response