tsutterley/pyTMD

nearest_extrap.py cuts model grid independent of cutoff distance

Closed this issue · 3 comments

Hi,
I found a bug in nearest_extrap.py. This program shall find the nearest model data for locations where the model grid itself is NaN. To specify the search range, you specify a "cutoff" distance. However, before searching for nearest neighbors, the model data is pre-filtered to the vicinity of the data points. Here, however, you filter the points to the data locations plus 2x the model grid resolution. If you have a relatively high resolution grid, 2 x dlon/dlat is much smaller than "cutoff". This leads to NaN-values if there is model data within "cutoff" but only in distances larger than 2 x dlon/dlat. (I found this issue e.g. for FES2014 at Bergen/Norway, 5.3°E 60.4°N)

Instead, I suggest to modify nearest_extrap.py as follows:

    #-- create combined valid mask
    valid_bounds = (~idata.mask) & np.isfinite(idata.data)
    if(np.isfinite(cutoff)):
        #-- reduce to model points within bounds of input points
        # convert cutoff [km] into distance [°] on sphere (simplified)
        lat_max = np.max(np.abs(lat))
        R = 6371
        dlat = np.rad2deg( cutoff/R )
        dlon = np.rad2deg( cutoff/(R*np.cos(np.deg2rad(lat_max))) )
        
        valid_bounds &= (gridlon >= (xmin-dlon))
        valid_bounds &= (gridlon <= (xmax+dlon))
        valid_bounds &= (gridlat >= (ymin-dlat))
        valid_bounds &= (gridlat <= (ymax+dlat))

+ remove the definition of dlon/dlat earlier from the model grid

Great find @DerLude this is totally a bug and I will get it fixed.

Ok, you changed this module quite fundamentally. However, I'm sorry to say that also in the new version, there is a bug.
In both EPSG cases in nearest_extrap.py you want to build the the cKDTree from valid points within bounds (as the comment says #-- find where input grid is valid and close to output points). However, what you do is to build the tree from any points within the bounds, no matter if valid or not. So (for both EPSG cases) the lines
indy,indx = np.nonzero(valid_bounds)
should be
indy,indx = np.nonzero(valid_bounds & valid_mask).
Otherwise, it will just "find" the closest cell, which is likely to be NaN. Do you agree?

Huh. Yeah looking at the code it looks like I meant to do that but didn't include the mask when finding the indices. This may be the root of a problem I've been having when trying to test TPXO8-atlas in CI