Extracting data from a large .dfs2 file
Hi there
I don't know if there is a solution to my problem, but I will describe it anyway.
I have a large DFS2 file containing 1440 time steps at a 1-minute resolution, which adds up to one day. The spatial grid is nx = 401, ny = 401 with dx = dy = 250. That amounts to 401 x 401 x 1440 = 231,553,440 cell values and a file size of around 1 GB. Reading this with `mikeio` takes around 25 seconds:
```python
import mikeio

dfs2 = mikeio.Dfs2(filename='data.dfs2')
ds = dfs2.read()
```
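(Not part of the original post: for reference, the read time can be measured with a small timing harness like this, using the same hypothetical filename.)

```python
import time
import mikeio

t0 = time.perf_counter()
ds = mikeio.Dfs2('data.dfs2').read()  # full read: all items, time steps, cells
print(f"full read took {time.perf_counter() - t0:.1f} s")
```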
Now, I know I want to spatially trim this file so it covers only a smaller region, containing just nx = 20, ny = 43 cells. This will decrease the amount of data and the file size significantly, and I know a way to do it. I noticed the `area` parameter in the `read(...)` method, which lets me choose the proper bounding box coordinates:

```python
ds_small = dfs2.read(area=(left, lower, right, upper))
```
However, "reading" this decreased amount of data (`ds_small`) takes just as long as "reading" the original file (`ds`). How come? I thought the specified bounding box would reduce the reading/loading time, but obviously it does not.
Despite my disappointment: am I doing it right, or is there another way to decrease the reading time?
Thanks in advance.
Hi @ecomodeller. Are you saying that the `area` argument doesn't work as intended?
The `area` argument allows you to read a subset, even from a file that wouldn't fit in memory, so it works as intended, but it is far from an optimal solution. The problem is here (line 243 in 877c996): we can subset items and time while reading, but not space. Spatial subsetting would have to be added in `mikecore`, and ultimately in the underlying ufs C library.
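To make the distinction concrete, here is a minimal sketch (not from the original thread; the filename and bounding-box coordinates are made up, and the `time=` keyword and `Dataset.to_dfs` are assumed from recent mikeio versions):

```python
import mikeio

# Hypothetical bounding box for the 20 x 43 cell target region (map units)
left, lower, right, upper = 10_000.0, 20_000.0, 15_000.0, 30_750.0

dfs2 = mikeio.Dfs2('data.dfs2')

# Item and time subsetting happens while reading, so keeping only every
# 60th time step genuinely reduces the read time:
ds_hourly = dfs2.read(time=list(range(0, 1440, 60)))

# Spatial subsetting with `area` still reads every full 2D layer from disk
# and crops it in memory afterwards, so it saves memory but not read time:
ds_small = dfs2.read(area=(left, lower, right, upper))

# Workaround: pay the full read cost once, write the cropped dataset to a
# new, much smaller file, and read that file in later sessions:
ds_small.to_dfs('data_small.dfs2')
```

After the one-time trim, a read of `data_small.dfs2` only touches 20 x 43 x 1440 = 1,238,400 values, roughly 0.5% of the original data volume.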
Okay, I understand.
I am glad, though, that the current solution works and uses less memory than loading the original file.
I guess you can close the issue now.