DHI/mikeio

Smart extraction / interpolation of points from large dfsu

Hendrik1987 opened this issue · 4 comments

I still struggle to extract data from larger-than-memory .dfsu files.

Or maybe I am just not aware of the newest way to do this in mikeio?

What I have tried / done so far:

  1. read only first timestep of .dfsu
  2. identify n nearest elements
  3. read all data only for the identified elements
```python
import numpy as np
import mikeio

# fn is the path to the large .dfsu file; pts holds the query coordinates
dfs = mikeio.open(fn)
mds0 = dfs.read(time=0)  # first timestep only, to get the geometry cheaply
elements = mds0.geometry.find_nearest_elements(x=pts.x, y=pts.y, n_nearest=3)
elements_load = np.unique(elements)  # drop duplicate elements before reading
mds = dfs.read(elements=elements_load)  # all timesteps, selected elements only
mds[0].plot()
```
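The index bookkeeping in steps 2 and 3 can be sketched with plain NumPy (the element ids below are invented): `np.unique` avoids reading overlapping neighbourhoods twice, and, assuming `dfs.read` returns the data in `elements_load` order, `np.searchsorted` recovers each element's position inside the loaded subset:

```python
import numpy as np

# Hypothetical output of find_nearest_elements for 3 query points with
# n_nearest=3 (shape: n_points x n_nearest); the element ids are invented
elements = np.array([[10, 11, 42],
                     [11, 42, 43],
                     [100, 101, 102]])

# Deduplicate before reading, so overlapping neighbourhoods are loaded once
elements_load = np.unique(elements)  # sorted: [10, 11, 42, 43, 100, 101, 102]

# After dfs.read(elements=elements_load) the data follows elements_load
# order, so map each original element id back to its position in the subset
subset_pos = np.searchsorted(elements_load, elements)
```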


  4. loop over the query points and interpolate from n nearest elements:

```python
mdsi = [mds.interp(x=x, y=y, n_nearest=3) for x, y in zip(pts.x, pts.y)]
```
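For reference, interpolating from the n nearest element centres is typically done with inverse-distance weighting; the sketch below shows that kind of weighting in plain NumPy (it is not mikeio's actual implementation, and the distances and values are invented):

```python
import numpy as np

def idw_weights(dist, p=2):
    # Inverse-distance weights for one query point; dist holds distances to
    # the n_nearest element centres. A (near-)zero distance gets all weight.
    dist = np.asarray(dist, dtype=float)
    exact = dist < 1e-12
    if exact.any():
        w = exact.astype(float)
    else:
        w = 1.0 / dist**p
    return w / w.sum()

# The interpolated value is the weighted sum of the n_nearest element values
vals = np.array([1.0, 2.0, 4.0])  # invented element values
w = idw_weights([1.0, 2.0, 4.0])  # invented distances
estimate = float((w * vals).sum())
```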


What I think could be enhanced:

  1. Make this (or a better) workflow easily available, maybe via mikeio.read(x, y, n_nearest = 3) or dfs.extract_stations() as sister to dfs.extract_track()?
  2. Is there a way to then concat the mikeio.Datasets in the list mdsi, maybe resulting in a GeometryPoint2DCollection geometry?
  3. How do I write the extracted points to .dfs, while maintaining geometry information?

Possibly related to fmskill enhancement

  • Here I also struggle to extract from large .dfsu model results. Maybe some of the same logic applies here?
  • ModelResult should be able to use a mikeio.Dataset/DataArray (DHI/modelskill#132)
  1. No, and I'm not sure it is a good idea to create such a geometry, since the existing geometries Grid1D, Grid2D, Grid3D, GeometryFM, GeometryFM3D correspond to different types of dfs files. I don't think we have an appropriate file format for a GeometryPoint2DCollection.
  2. It is possible to store the location for each item in a Dfs0 file (see mikecore example), but it is still not as flexible as a NetCDF, where you could have two dimensions (station, time) since there is no way of storing the name of the station other than in the name of the item e.g. "Buoy 2: Sign. Wave Height" or other station specific metadata.
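The item-name workaround mentioned above can be illustrated with plain string handling (the "station: quantity" name format is only the convention from the example, not a mikeio API):

```python
def split_item_name(item_name):
    # Split a "station: quantity" item name, e.g. "Buoy 2: Sign. Wave Height".
    # Returns ("", item_name) when no station prefix is present. Illustrative
    # only: in a Dfs0 the item name is the only place this metadata can live.
    if ": " in item_name:
        station, quantity = item_name.split(": ", 1)
        return station, quantity
    return "", item_name

station, quantity = split_item_name("Buoy 2: Sign. Wave Height")
# station -> "Buoy 2", quantity -> "Sign. Wave Height"
```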
  1. And what about this point... ? :-)
  2. I see and understand the motivation.
  3. Yes, the "Meta_data: Parameter" is also the workaround used when outputting .dfs0 directly from the MIKE model. I don't think it is a nice solution; maybe the .nc, or the .dfs0 + .json, or similar is the better choice.
    However, at the moment the GeometryPoint2D coordinates seem to be lost on .to_xarray(). So even after conversion I currently cannot easily concatenate the xarray.Datasets.
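Until the coordinates survive .to_xarray(), one workaround is to re-attach them manually and concatenate along a new points dimension; a plain-xarray sketch, where the names (points, x, y, wl) and all values are purely illustrative:

```python
import numpy as np
import xarray as xr

# Re-attach the lost point coordinates by hand, one dataset per station,
# then concatenate along a new "points" dimension
dss = []
for i, (x, y) in enumerate([(1.0, 2.0), (3.0, 4.0)]):
    ds = xr.Dataset({"wl": ("time", np.array([0.1 * i, 0.2 * i]))},
                    coords={"time": [0, 1]})
    ds = ds.assign_coords(x=x, y=y).expand_dims(points=[i])
    dss.append(ds)

# Differing scalar coords (x, y) become (points,)-coordinates on concat
combined = xr.concat(dss, dim="points")
```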

3. See PR #483, which should take you one step closer to being able to concatenate the data

Thanks, that is a fix for 3, @ecomodeller

Taking a step back here, what I generally try to achieve is an interpolation along a new dimension. To do the same on a structured NetCDF file I would use this xarray equivalent.

In my special case I have a list of x, y coordinates, and my new dimension would be points. All the steps above are just my current workaround, which also handles a larger-than-memory dfsu file.

I understand the motivation to keep geometry types and .dfs files aligned.

Is the conclusion then to work on a dfsu.extract_points((x, y), n_nearest) -> xr.Dataset, where (x, y) can be a list or array of coordinates? (Or maybe a DataFrame, which would also allow specifying further coordinates such as station_name.)
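As a sketch of what such a helper could look like on the user side, built from the loop earlier in this thread (the function name mirrors the proposal; that the interpolated result exposes `.to_numpy()` is an assumption of the sketch, not a confirmed mikeio API):

```python
import numpy as np

def extract_points(mds, xs, ys, n_nearest=3):
    # Interpolate an in-memory (subset) dataset to a list of coordinates and
    # stack the results along a new leading "points" dimension.
    # `mds` is anything with an .interp(x=..., y=..., n_nearest=...) method
    # whose result has .to_numpy() -- an assumption in this sketch.
    columns = [mds.interp(x=x, y=y, n_nearest=n_nearest).to_numpy()
               for x, y in zip(xs, ys)]
    data = np.stack(columns, axis=0)  # shape: (points, time, ...)
    coords = {"x": np.asarray(xs), "y": np.asarray(ys)}
    return data, coords
```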