Investigate performance on long timeseries
BSchilperoort opened this issue · 2 comments
Your work in reimplementing regridding methods is interesting! I often find ESMF to be a bit heavy on dependencies side indeed. But one crucial feature that makes it better than xr.interp is that it computes weights first and then applies them in parallel for all spatial slices. This is a major dealbreaker when dealing with long timeseries! Do you have plans on implementing something like that in pure-xarray?
Originally posted by @aulemahal in pangeo-data/xESMF#282 (comment)
It should be easy to compute weights and apply them, the same way that the conservative method currently does. This way we could possibly improve performance for long timeseries.
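To make the "compute weights once, apply to every time step" idea concrete, here is a minimal sketch using plain NumPy. It is not the code of xESMF or of the conservative method in this package; the helper name `linear_weights`, the grid sizes, and the random data are all made up for illustration, and it assumes a strictly increasing source coordinate with all target points inside the source range.

```python
import numpy as np


def linear_weights(src, dst):
    """Return a (len(dst), len(src)) matrix W so that W @ values interpolates linearly.

    Hypothetical helper: assumes src is strictly increasing and every
    dst value lies within [src[0], src[-1]].
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Index of the left neighbour of each target point, clipped so idx + 1 stays valid.
    idx = np.clip(np.searchsorted(src, dst, side="right") - 1, 0, len(src) - 2)
    # Fractional distance of each target point between its two neighbours.
    frac = (dst - src[idx]) / (src[idx + 1] - src[idx])
    weights = np.zeros((len(dst), len(src)))
    rows = np.arange(len(dst))
    weights[rows, idx] = 1.0 - frac
    weights[rows, idx + 1] = frac
    return weights


# Made-up example: 1000 time steps on a coarse latitude axis.
src_lat = np.linspace(-90.0, 90.0, 19)
dst_lat = np.linspace(-85.0, 85.0, 35)
data = np.random.default_rng(0).random((1000, src_lat.size))  # (time, lat)

W = linear_weights(src_lat, dst_lat)  # computed once, independent of time
regridded = data @ W.T                # one matrix product covers all time steps
```

The weights depend only on the two coordinate axes, so their cost does not grow with the length of the time dimension. With xarray objects, the last line would roughly correspond to a `.dot()` over the source dimension.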
I don't think this is actually an issue due to the way interpolation is applied independently across the dimensions. In some quick benchmarks on a longer timeseries, regrid.linear is anywhere from 1.2x to 4x faster than regrid.conservative depending on the chunk scheme. So I doubt we would get any improvement by instead generating our own weights and doing .dot().
Plus it's pretty cool that xarray_regrid/methods/interp.py is only 52 lines, most of which are type overloads 😄
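On the point about interpolation being applied independently across the dimensions: one way to read it is that, on a rectilinear grid, interpolating along latitude and then longitude gives the same numbers as applying precomputed per-dimension weights. The sketch below checks that equivalence with NumPy only. It is an illustration under made-up grid sizes, not the package's implementation, and it says nothing about which route is faster; `interp_along_axis` is a hypothetical helper.

```python
import numpy as np


def interp_along_axis(values, src, dst, axis):
    """Linearly interpolate along one axis with np.interp, leaving other axes untouched."""
    return np.apply_along_axis(lambda v: np.interp(dst, src, v), axis, values)


# Made-up rectilinear grids and a random (time, lat, lon) field.
src_lat, src_lon = np.linspace(-90.0, 90.0, 10), np.linspace(0.0, 350.0, 12)
dst_lat, dst_lon = np.linspace(-80.0, 80.0, 7), np.linspace(10.0, 340.0, 9)
field = np.random.default_rng(1).random((50, src_lat.size, src_lon.size))

# Route 1: interpolate one dimension at a time.
step = interp_along_axis(field, src_lat, dst_lat, axis=1)
sequential = interp_along_axis(step, src_lon, dst_lon, axis=2)

# Route 2: build a weight matrix per dimension by interpolating the identity
# matrix (each column is a unit vector), then contract both in one einsum.
w_lat = interp_along_axis(np.eye(src_lat.size), src_lat, dst_lat, axis=0)  # (new_lat, lat)
w_lon = interp_along_axis(np.eye(src_lon.size), src_lon, dst_lon, axis=0)  # (new_lon, lon)
weighted = np.einsum("tij,ai,bj->tab", field, w_lat, w_lon)

# Both routes produce the same regridded field.
assert np.allclose(sequential, weighted)
```

Since both routes do the same arithmetic, any speed difference would come from how the work is scheduled and chunked, which is consistent with the benchmark depending on the chunk scheme.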