xarray-contrib/cf-xarray

order in `bounds_to_vertices`

Closed this issue · 4 comments

Hey all,
thank for this, i greatly appreciate cf_xarray! I have one thing, since i work a lot with CORDEX curvilinear grids and 2D coordinates (and bounds). There is some cases in which computation of boundaries might have inprecise results, simply due to floating point limitations, e.g., very basic case i encountered:

21.615 + 0.055 == 21.725 - 0.055  # e.g. right boundary of i-1 should be the same as left boundary of i

This would result in False which means that i can have neighbouring grid cells that do not share excactly the same boundary coordinate. In my case, i even transform the boundaries into a different CRS when i have 2D coordinates and when i order them counterclockwise, bounds_to_vertices with order=None will assum a clockwise order (the check for counterclockwise simply failes due to precision issues). For me this becomes important when using xesmf with conservative remapping, since xesmf relies on "detecting" the order of 4D vertices when using bounds_to_vertices.

I guess this could happen a lot if you have curvilinear CF datasets with boundary shapes of (N, M, 4). So i would suggest to change

order = "counterclockwise" if np.all(calc_bnds == values) else "clockwise"

to something like this:

order = "counterclockwise" if np.allclose(calc_bnds, values) else "clockwise" 

to allow for some tolerance... i have a more detailled gist of what i really do here. I am aware that this "order" is quite tricky to define, but I think it would be more a cf_xarray related issue than xesmf?

I could also try and implement a more sophisticated check for "order" of boundaries, it's really tricky topic..

The change to allclose seems OK to me, but I defer to @aulemahal here. It seems like xesmf should let you explicitly specify order though. Autoguessing should really just be for convenience and should be easy to opt out from.

This change seems ok to me too! The default tolerance values are small enough that it shouldn't cause problems, but still detect floating-point precision issues.

However, xESMF auto-guesses the order in the case where one did not explicitly pass the bounds. Instead of adding a new argument to control the order xESMF uses, I would first suggest to call this conversion before and store the results in the lon_b and lat_b names that xESMF recognizes:

ds['lon_b'] = cfxr.bounds_to_vertices(
    ds.lon_bounds, bounds_dim="vertices", order="counterclockwise"
)
ds['lat_b'] = cfxr.bounds_to_vertices(
    ds.lat_bounds, bounds_dim="vertices", order="counterclockwise"
)
reg = Regridder(ds, ds_out, ...)

See:
https://github.com/pangeo-data/xESMF/blob/d48285b5f65bd97820154f0253386ef69f65a36b/xesmf/frontend.py#L62-L64
It's commented "old way". but still that's the least-guessing way ;).

Thanks @aulemahal! That obvious solution really solves my problem, didn't think of that. Now my regridding works again. I'll do a PR on this issue anyway!