ec-jrc/Thalassa

Visualizing Global Meshes

pmav99 opened this issue · 5 comments

pmav99 commented

This is something that came up after investigating why the STOFS 2D Global model could not be visualized (#54). STOFS is an ADCIRC model, but the same problem should exist for SCHISM models, too, therefore I think that we should have a dedicated issue.

So, visualizing the STOFS model gives this output:

image

The problem is that there are elements crossing over the international time line and this messes up the visualization. The white line at ~65 latitude is because at that particular range the international line crosses the region North of Kamtchatka (not sure of the actual name).

This problem is something that we have encountered before while trying to visualize global meshes and pyposeidon does provide a fix for this and the docs do explain in a bit more detail. Applying the fix though takes significant time. So you need to do it a post-processing step. You can't do it dynamically in thalassa. At least not for these 12 million nodes files.

@brey can provide more details if necessary.

pinging @saeed-moghimi-noaa

brey commented

FYI, the tool that pyposeidon offers is described in the docs and in the Tutorial. Note however that the script is SCHISM specific and it needs to be generalised/modified.

pmav99 commented

A pure-python implementation for dropping the elements that cross the IDL (i.e. Joseph's suggestion) is the following:

def drop_elements_crossing_idl(ds: xr.Dataset, max_lon: float = 100) -> xr.Dataset:
    """
    Drop triface elements crossing the International Date Line (IDL).
    
    ``max_lon`` is the maximum longitudinal "distance" in degrees for an element.
    """
    if max_lon <= 0:
        raise ValueError(f"Maximum longitudinal \"distance\" must be positive: {max_lon}")
    a, b, c = ds.triface_nodes.T
    indices_of_triface_nodes_crossing_idl = np.where(
          (np.abs(ds.lon[a] - ds.lon[b]) > max_lon)
        | (np.abs(ds.lon[a] - ds.lon[c]) > max_lon)
        | (np.abs(ds.lon[b] - ds.lon[c]) > max_lon)
    )[0]
    ds = ds.drop_isel(triface=indices_of_triface_nodes_crossing_idl)
    return ds

It can be used like this:

from thalassa import api

ds  = api.open_dataset("/path/to/output.nc")
ds = drop_elements_corssing_idl(ds=ds)

# use the normal api functions ...

We could add this functionality in thalassa but is relatively slow for big meshes (30+ seconds on my laptop). I don't think it is something that we want to be using interactively. IMHV this belongs to the post-processing steps of a model.

pmav99 commented

For the record, this is a faster, but also more memory intensive implementation (my laptop run out of memory when I run this on the STOFS output):

def drop_elements_crossing_idl2(ds: xr.Dataset, max_lon: float = 100) -> xr.Dataset:
    """
    Drop triface elements crossing the International Date Line (IDL).
    
    ``max_lon`` is the maximum longitudinal "distance" in degrees for an element.
    """
    if max_lon <= 0:
        raise ValueError(f"Maximum longitudinal \"distance\" must be positive: {max_lon}")
    indices_of_triface_nodes_crossing_idl = np.where(
        np.abs(ds.lon[ds.triface_nodes] - ds.lon[ds.triface_nodes.roll(three=1)]) > max_lon
    )[0]
    ds = ds.drop_isel(triface=indices_of_triface_nodes_crossing_idl)
    return ds
pmav99 commented

In the end, I added in utils.py a function that drops elements that cross the IDL:

Thalassa/thalassa/utils.py

Lines 119 to 159 in 3dcc1b4

def drop_elements_crossing_idl(
ds: xr.Dataset,
max_lon: float = 10,
) -> xr.Dataset:
"""
Drop triface elements crossing the International Date Line (IDL).
``max_lon`` is the maximum longitudinal "distance" in degrees for an element.
What we are actually trying to do in this function is to identify mesh triangles that cross
the IDL. The truth is that when you have a triplet of nodes you can't really know if the
tirangle they create is the one in `[-180, 180]` or the one that crosses the IDL.
So here we make one assumption: That we are dealing with a global mesh with a lot of elements.
Therefore we assume that the elements that cross the IDL are the ones that:
1. have 2 nodes with different longitudinal sign, e.g. -179 and 179
2. the absolute of the difference of the longitudes is greater than a user defined threshold
e.g. `|179 - (-179)| >= threshold`
The second rule exists to remove false positives close to Greenwich (e.g. 0.1 and -0.1)
These rules can lead to false positives close to the poles (e.g. latitudes > 89) especially
if a small value for `max_lon` is used. Nevertheless, the main purpose of this function is
to visualize data, so some false positives are not the end of the wold.
"""
if max_lon <= 0:
raise ValueError(f'Maximum longitudinal "distance" must be positive: {max_lon}')
a, b, c = ds.triface_nodes.data.T
lon = ds.lon.data
lon_a = lon[a]
lon_b = lon[b]
lon_c = lon[c]
# `np.asarray(condition).nonzero()` is equivalent to `np.where(condition)`
# For more info check the help of `np.where()`
condition = (
# fmt: off
((lon_a * lon_b < 0) & (np.abs(lon_a - lon_b) >= max_lon))
| ((lon_a * lon_c < 0) & (np.abs(lon_a - lon_c) >= max_lon))
| ((lon_b * lon_c < 0) & (np.abs(lon_b - lon_c) >= max_lon))
# fmt: on
)
indices_of_triface_nodes_crossing_idl = np.asarray(condition).nonzero()[0]
ds = ds.drop_isel(triface=indices_of_triface_nodes_crossing_idl)
return ds

With the current implementation there are false positives, i.e. close to the poles, even elements that don't cross the IDL may be dropped. Nevertheless, it is quite fast and it should be good enough for a quick check of the results.

Closing