Smart extraction / interpolation of points from large dfsu
Hendrik1987 opened this issue · 4 comments
I still struggle to extract data from larger-than-memory .dfsu files.
Or maybe I am just not aware of the newest way to do this in mikeio?
What I have tried / done so far:
- read only first timestep of .dfsu
- identify n nearest elements
- read all data only for the identified elements
```python
import numpy as np
import mikeio

dfs = mikeio.open(fn)  # fn: path to the large .dfsu file
mds0 = dfs.read(time=0)  # read only the first timestep
elements = mds0.geometry.find_nearest_elements(x=pts.x, y=pts.y, n_nearest=3)
elements_load = np.unique(elements)
mds = dfs.read(elements=elements_load)  # all timesteps, identified elements only
mds[0].plot()
```
- loop over the query points and interpolate from the n nearest elements:

```python
mdsi = [mds.interp(x=x, y=y, n_nearest=3) for x, y in zip(pts.x, pts.y)]
```
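A minimal sketch of combining the per-point results into one (time, points) table, assuming each entry of `mdsi` is converted to a pandas DataFrame first (e.g. via `to_dataframe()`); the dummy frames below just stand in for those conversions:

```python
import numpy as np
import pandas as pd

# Stand-ins for the per-point results: one DataFrame per (x, y) query
# point, each with a time index and one value column.
time = pd.date_range("2020-01-01", periods=4, freq="h")
rng = np.random.default_rng(0)
frames = [pd.DataFrame({"Surface elevation": rng.normal(size=4)}, index=time)
          for _ in range(3)]

# Concatenate along a new "points" level in the columns
wide = pd.concat(frames, axis=1, keys=range(len(frames)), names=["points"])
print(wide.shape)  # 4 timesteps x 3 points
```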
What I think could be enhanced:
- Make this (or a better) workflow easily available, maybe via `mikeio.read(x, y, n_nearest=3)` or `dfs.extract_stations()` as a sister to `dfs.extract_track()`?
- Is there a way to then concat the `mikeio.Dataset`s in the list `mdsi`, maybe resulting in a `GeometryPoint2DCollection` geometry?
- How do I write the extracted points to `.dfs` while maintaining geometry information?
Possibly related to fmskill enhancement:
- Here I also struggle to extract from large `.dfsu` model results. Maybe some of the same logic applies here?
- ModelResult should be able to mikeio.Dataset/DataArray (DHI/modelskill#132)
- No, and I'm not sure it is a good idea to create such a geometry, since the existing geometries Grid1D, Grid2D, Grid3D, GeometryFM, GeometryFM3D correspond to different types of dfs files. I don't think we have an appropriate file format for a `GeometryPoint2DCollection`.
- It is possible to store the location for each item in a Dfs0 file (see the mikecore example), but it is still not as flexible as NetCDF, where you could have two dimensions (station, time), since there is no way of storing the name of the station other than in the name of the item, e.g. `"Buoy 2: Sign. Wave Height"`, or other station-specific metadata.
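To illustrate the NetCDF layout described above (dummy data, not a mikeio API): in xarray the station name is a proper coordinate on a (station, time) variable rather than part of the item name:

```python
import numpy as np
import pandas as pd
import xarray as xr

# (station, time) layout: the station name lives as a coordinate,
# not inside the item name as in dfs0.
time = pd.date_range("2020-01-01", periods=4, freq="h")
ds = xr.Dataset(
    {"Sign. Wave Height": (("station", "time"), np.zeros((2, 4)))},
    coords={"station": ["Buoy 1", "Buoy 2"], "time": time},
)
one = ds["Sign. Wave Height"].sel(station="Buoy 2")
print(one.shape)  # (4,)
```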
- And what about this point... ? :-)
- I see and understand the motivation.
- Yes, the `"Meta_data: Parameter"` item name is also the workaround used when outputting `.dfs0` directly from the MIKE model. I don't think it is a nice solution; maybe the .nc or the .dfs0 + .json (or similar) is the better choice.
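A minimal sketch of the ".dfs0 + .json" idea; the sidecar file name and schema here are my own assumption, not an established mikeio convention:

```python
import json

# Hypothetical sidecar schema (assumption): map each dfs0 item name to
# its station coordinates and metadata, stored next to the .dfs0 file.
sidecar = {
    "Buoy 2: Sign. Wave Height": {
        "station": "Buoy 2", "x": 12.34, "y": 55.67, "crs": "EPSG:4326"},
}
text = json.dumps(sidecar, indent=2)  # would be written alongside the .dfs0
meta = json.loads(text)["Buoy 2: Sign. Wave Height"]
print(meta["station"], meta["x"], meta["y"])
```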
However, at the moment the `GeometryPoint2D` coordinates seem to be lost on `.to_xarray()`. So even after conversion I currently cannot easily concat the `xarray.Dataset`s.
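One possible workaround for the lost coordinates, sketched with dummy stand-ins for the converted Datasets, and assuming the query x, y values are still at hand: re-attach them, add a points dimension, then concatenate:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Dummy stand-ins for per-point Datasets as returned by .to_xarray(),
# which currently lack the x/y coordinates.
time = pd.date_range("2020-01-01", periods=3, freq="h")
pts = [(1.0, 2.0), (3.0, 4.0)]
dss = [xr.Dataset({"wl": ("time", np.zeros(3))}, coords={"time": time})
       for _ in pts]

# Re-attach the known query coordinates, then concat along a new dim
dss = [ds.assign_coords(x=x, y=y).expand_dims("points")
       for ds, (x, y) in zip(dss, pts)]
combined = xr.concat(dss, dim="points")
print(combined.sizes)  # points: 2, time: 3
```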
Re 3: PR #483 should take you one step closer to being able to concatenate data.
Thanks, that is a fix for 3, @ecomodeller
Taking a step back here, what I generally try to achieve is an interpolation along a new dimension. To do the same on a structured NetCDF file I would use this xarray equivalent. In my special case I have a list of x, y coordinates, and my new dimension would be `points`. All the steps above are only my current solution, which also handles a larger-than-memory dfsu file.
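The xarray pattern referred to above, with illustrative data: vectorized selection using DataArray indexers that share a new `points` dimension (shown with nearest-neighbour `sel`; `interp` works the same way):

```python
import numpy as np
import xarray as xr

# A 2D field on a structured grid
da = xr.DataArray(np.arange(16.0).reshape(4, 4),
                  coords={"y": [0.0, 1.0, 2.0, 3.0],
                          "x": [0.0, 1.0, 2.0, 3.0]},
                  dims=("y", "x"))

# DataArray indexers sharing the dim "points" -> vectorized selection,
# returning one value per query point along the new dimension
x = xr.DataArray([0.9, 2.1], dims="points")
y = xr.DataArray([1.0, 3.0], dims="points")
out = da.sel(x=x, y=y, method="nearest")
print(out.dims, out.values)  # ('points',) [ 5. 14.]
```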
I understand the motivation to keep `geometry` types and `.dfs` files aligned.
Is the conclusion then to work on a `dfsu.extract_points((x, y), n_nearest) -> xr.Dataset`, where `(x, y)` can be a list or array of coordinates? (Or maybe a `df`, allowing further coordinates such as `station_name` to be specified.)
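A rough sketch of what such a helper could look like (hypothetical API, not an existing mikeio function; it just wraps the workflow from the top of this issue, with `dfs` being the object returned by `mikeio.open`):

```python
import numpy as np

def extract_points(dfs, xy, n_nearest=3):
    """Hypothetical helper: xy is an iterable of (x, y) pairs.

    Reads only the elements needed for interpolation, then interpolates
    each query point; returns one interpolated dataset per point.
    """
    geometry = dfs.read(time=0).geometry  # cheap: first timestep only
    xs = np.array([p[0] for p in xy])
    ys = np.array([p[1] for p in xy])
    elems = geometry.find_nearest_elements(x=xs, y=ys, n_nearest=n_nearest)
    mds = dfs.read(elements=np.unique(elems))  # subset fits in memory
    return [mds.interp(x=x, y=y, n_nearest=n_nearest) for x, y in xy]
```

Returning a list keeps it close to the current workaround; concatenating into a single object with a `points` dimension would be the follow-up step discussed above.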