Performance of reading result files

Question

The performance of reading result files could be considerably improved.

Here are two examples of reading files on my machine (16 GB of RAM, 4 cores at 2.7 GHz, Samsung MZVLW512HMJP drive):

A ~1.1 GB network result file with ~58000 time series and ~4300 time steps currently takes:
1. Loading the data into a ResultData object: ~20 seconds and ~1.1 GB of memory
2. Reading from the in-memory ResultData into a data frame: ~220 seconds and ~2.2 GB of memory

A ~1.5 GB catchment result file with ~27000 time series and ~15000 time steps currently takes:
1. Loading the data into a ResultData object: ~12 seconds and ~1.5 GB of memory
2. Reading from the in-memory ResultData into a data frame: ~210 seconds and ~3.0 GB of memory

I think step 2 (reading from ResultData into a data frame) should be at least a factor of 10 faster, because it only copies data that is already in memory. I suspect the bottleneck is the Python to C# interop.
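To illustrate the kind of overhead I have in mind, here is a rough sketch in pure Python. It does not use the actual interop layer or any ResultData API; the `array` buffer is only a hypothetical stand-in for a block of result values, and the per-element loop stands in for a pattern where every value is fetched across the Python/C# boundary separately. Whether this is really what happens in the library is an assumption on my part.

```python
from array import array
import time


def copy_per_element(source, n):
    """Copy values one at a time, paying one indexing call per value.

    Stands in for an interop pattern where each element access crosses
    the language boundary separately (an assumption, not confirmed).
    """
    target = array("f")
    for i in range(n):
        target.append(source[i])
    return target


def copy_bulk(source):
    """Copy the whole buffer in a single operation via the buffer protocol."""
    target = array("f")
    target.frombytes(memoryview(source).tobytes())
    return target


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Hypothetical stand-in for one block of result values.
values = array("f", (float(i % 1000) for i in range(200_000)))

slow_copy, slow_seconds = time_call(copy_per_element, values, len(values))
fast_copy, fast_seconds = time_call(copy_bulk, values)

print(f"per-element: {slow_seconds:.4f} s, bulk: {fast_seconds:.4f} s")
```

Both functions produce the same data; only the number of calls differs. If the conversion to a data frame does something similar to the per-element variant, moving to a bulk copy would be the obvious thing to try.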

Answer 1 · 2023-05-09T13:33:51.000Z

The upcoming pull request improves step 2 (reading from ResultData into a data frame) to the following:

Network result file: ~10 seconds and 1.1 GB of memory
Catchment result file: ~6 seconds and 1.5 GB of memory
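
For reference, the speedup relative to the timings reported in the question can be worked out as below. This is only arithmetic on the approximate figures quoted in this thread, so the factors are approximate as well.

```python
# Timings in seconds for reading from ResultData into a data frame,
# taken from the approximate figures quoted in this thread.
timings = {
    "network": {"before": 220, "after": 10},
    "catchment": {"before": 210, "after": 6},
}

# Speedup factor = old time / new time.
speedups = {name: t["before"] / t["after"] for name, t in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.0f}x faster")
# prints:
# network: 22x faster
# catchment: 35x faster
```

Both files end up above the factor of 10 asked for in the question.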