ecmwf/anemoi-datasets

Specify groups when reading NetCDF files

Closed this issue · 4 comments

Is your feature request related to a problem? Please describe.

I have a NetCDF file in which the important information is stored in a group called "measurements". Would it be possible to specify this group in the recipe.yml?
If the group is not specified, anemoi fails and reports that it cannot find any data. That is to be expected, since it does not search inside groups.

Describe the solution you'd like

I would like to be able to define it in the recipe.yml files, something like:

dates:
  start: 2022-01-30T00:00:00Z
  end: 2022-02-31T19:00:00Z

input:
  netcdf:
    path: sample_groups.nc
    groups: measurements

Describe alternatives you've considered

No response

Additional context

This is how I would access the data using xarray in Python:

import xarray as xr
path = "sample_groups.nc"
# Open the "measurements" group of the NetCDF file as an xarray Dataset
ds = xr.open_dataset(path, group="measurements")

ds

Here is a sample dataset you can try it with:
https://wekeo-files.prod.wekeo2.eu/index.php/s/caTNZZXR2GF6pJY

Organisation

No response

Should it be group or groups? Do you expect to have data from several groups simultaneously?

Thanks for your message.
All the information relevant to normal users is in the group "measurements", so I think it could be simplified to reading from a single group.

I had a look at the "measurements" group. The "time" variable is not a coordinate, and it is also missing attributes.
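
For anyone who wants to run the same check on their own file, here is a minimal sketch of how it could look in xarray. The dataset is a small in-memory stand-in, not the sample file: the "obs" dimension, the "value" variable and the timestamps are all assumptions made for illustration. Promoting "time" with set_coords and swap_dims is one possible fix, not necessarily what anemoi expects.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the "measurements" group: "time" is stored as a
# plain data variable along an assumed "obs" dimension.
ds = xr.Dataset(
    {
        "time": (
            "obs",
            np.array(
                ["2022-01-30T00:00", "2022-01-30T01:00"],
                dtype="datetime64[ns]",
            ),
        ),
        "value": ("obs", np.array([1.0, 2.0])),
    }
)

# Check whether "time" is registered as a coordinate (here it is not).
time_is_coord_before = "time" in ds.coords

# Promote "time" to a coordinate, then use it as the dimension in place of "obs".
fixed = ds.set_coords("time").swap_dims({"obs": "time"})

time_is_coord_after = "time" in fixed.coords
print(time_is_coord_before, time_is_coord_after, fixed.sizes)
```

With a real file, ds would come from xr.open_dataset(path, group="measurements") instead of being built by hand.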

This is possible in the latest version:

input:
  netcdf:
    path: ...
    group: ...
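
To connect the recipe entry above with the xarray call from the original request, here is a minimal sketch of how a `group` key could be turned into the `group=` argument of xr.open_dataset. The helper open_kwargs_from_recipe is hypothetical and written only for this illustration. It is not taken from the anemoi-datasets code base, which may handle the option differently.

```python
def open_kwargs_from_recipe(netcdf_cfg):
    """Split a `netcdf` recipe entry into a path and open_dataset keyword arguments.

    Hypothetical helper for illustration only, not anemoi-datasets code.
    """
    path = netcdf_cfg["path"]
    kwargs = {}
    # Only pass `group` through when the recipe sets it, so that files
    # without groups are opened exactly as before.
    group = netcdf_cfg.get("group")
    if group is not None:
        kwargs["group"] = group
    return path, kwargs


# The recipe entry from this issue, written as a plain dict.
recipe_entry = {"path": "sample_groups.nc", "group": "measurements"}
path, kwargs = open_kwargs_from_recipe(recipe_entry)
print(path, kwargs)
```

The result would then be used as xr.open_dataset(path, **kwargs), which matches the call shown in the original request.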