mllam/neural-lam

Adapt graph generation script to general limited areas

Opened this issue · 6 comments

The graph generation script create_mesh.py is currently written for the MEPS area. Without huge changes it should be possible to turn this into a generic script that works for general areas (rectangular in their projection). Some additional arguments would probably have to be introduced.

It might be a good idea to turn the graph generation into a function, and let create_mesh.py only be a utility script for calling this function with some command line arguments.

The area definition object proposed in #2 could be useful as input to such a generic script.
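A rough sketch of how that split could look (the module layout, function name, and arguments below are placeholders I made up for illustration, not existing code):

import argparse

import numpy as np


def create_graph(xy, hierarchical=False, levels=3):
    """Build the mesh graph for a general (rectangular in its projection) area.

    xy: array of shape (2, N_x, N_y) with projection coordinates of the grid
    nodes, i.e. the same format as nwp_xy.npy.
    """
    # ... the graph construction currently inlined in create_mesh.py ...
    raise NotImplementedError


def main():
    # Thin CLI wrapper, roughly what create_mesh.py would be reduced to
    parser = argparse.ArgumentParser(description="Create mesh graph for a dataset")
    parser.add_argument("--dataset", type=str, default="meps_example")
    parser.add_argument("--levels", type=int, default=3)
    parser.add_argument("--hierarchical", action="store_true")
    args = parser.parse_args()

    # Path layout assumed here: one static directory per dataset
    xy = np.load(f"data/{args.dataset}/static/nwp_xy.npy")
    create_graph(xy, hierarchical=args.hierarchical, levels=args.levels)


if __name__ == "__main__":
    main()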

I am trying out LAM for an academic project on one specific state of India. It looks like creating a mesh for the state will require me to provide a static nwp_xy.npy. Could you please help me out with how to generate this file for the area I am concerned with? Does it contain Lambert projection coordinates? The values in the example nwp_xy.npy file clearly do not look like lat/lon coordinates.

Sounds like a nice use case! Indeed, the coordinates in nwp_xy.npy are not lat/lon, but coordinates in the Lambert projection. The example file has coordinates corresponding to the projection currently specified at https://github.com/joeloskarsson/neural-lam/blob/e5f1ad3ed0dff308eb4df720f9846d0ad61bd5fc/neural_lam/constants.py#L75-L83. So you would want to change this to the projection you are using for your area.

nwp_xy.npy should then be a saved numpy array of shape (2, N_x, N_y), for an area with N_x $\times$ N_y grid cells/nodes. Entry nwp_xy[0,i,j] should be the x-coordinate of grid node (i,j) in the coordinate system of your projection, and entry nwp_xy[1,i,j] the corresponding y-coordinate.
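For a grid defined directly in the projection, one way to assemble an array in that format is sketched below (the grid sizes, spacing, and origin are made-up placeholders, substitute your own):

import numpy as np

# Made-up example grid specification, replace with your own
N_x, N_y = 100, 120              # number of grid nodes in x and y
dx = dy = 10_000.0               # grid spacing in projection units (m)
x0, y0 = -500_000.0, -600_000.0  # projection coordinates of the first node

x = x0 + dx * np.arange(N_x)
y = y0 + dy * np.arange(N_y)

# indexing="ij" so that axis 1 runs over x and axis 2 over y
xx, yy = np.meshgrid(x, y, indexing="ij")

nwp_xy = np.stack([xx, yy])  # nwp_xy[0] = x-coords, nwp_xy[1] = y-coords
print(nwp_xy.shape)          # -> (2, 100, 120)
np.save("nwp_xy.npy", nwp_xy)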

Hi, I am absolutely new to geographical projections. Could you direct me to resources to understand which projection I should use for epsg:7779?

(Small caveat: I am probably not the right person to advise you on this, as I am far from an expert in map projections or GIS. You should probably ask someone who knows these kinds of details about your dataset.) I guess you could just feed your EPSG code to cartopy: https://scitools.org.uk/cartopy/docs/latest/reference/generated/cartopy.crs.epsg.html#cartopy.crs.epsg and get the right projection. A quick google gave me this: https://epsg.io/7779, which seems to indicate that it is a transverse Mercator projection. Note however that the projection specified in constants.py is only used for plotting. The actual coordinate values to put in nwp_xy.npy are something you need to get together with your data; these should correspond to where your data is actually located in the projection.
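For quickly inspecting that projection, something like this should work (a minimal sketch; depending on your cartopy version the EPSG lookup may need the optional pyepsg package):

import cartopy.crs as ccrs
import matplotlib.pyplot as plt

# Look up the projection from the EPSG code
proj = ccrs.epsg(7779)

# e.g. use it directly as a plotting projection to eyeball the area
ax = plt.axes(projection=proj)
ax.coastlines()
plt.show()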

Hi @nandini-21, I had to create this file too when trying Neural-LAM on my data. The steps to feed your data into Neural-LAM would be:

  1. Edit the constants.py with the description of the projection for your data.
  2. Create the data/static/nwp_xy.npy file with the coordinates of your grid in this projection.

For my part, step 1 was easy because I also use a Lambert projection, so I just had to edit the lambert_proj_params variable. If you are using a different CRS, make sure it is also set correctly in the lambert_proj variable.
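For concreteness, the relevant part of constants.py could end up looking roughly like this after such an edit (the parameter values below are made up, not the MEPS ones; put in the values for your own area):

import cartopy.crs as ccrs

# Illustrative Lambert conformal parameters, replace with your own
lambert_proj_params = {
    "proj": "lcc",
    "lat_0": 23.0,
    "lat_1": 23.0,
    "lat_2": 23.0,
    "lon_0": 77.0,
}

# Projection object used for plotting
lambert_proj = ccrs.LambertConformal(
    central_longitude=lambert_proj_params["lon_0"],
    central_latitude=lambert_proj_params["lat_0"],
    standard_parallels=(
        lambert_proj_params["lat_1"],
        lambert_proj_params["lat_2"],
    ),
)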

For step 2, here is the snippet of code I used (inspired by this) to convert my grid coordinates from regular lon/lat (EPSG:4326) to the appropriate Lambert projection coordinates. It relies on easydict and pyproj, which are not listed in the dependencies.

import numpy as np
import easydict
from pyproj import Transformer
from neural_lam import constants

lon = [ ... 2d-array with longitudes for all my grid ... ]  # shape = (n_lon, n_lat)
lat = [ ... 2d-array with latitudes for all my grid ... ]   # shape = (n_lon, n_lat)

# Build a proj string from the Lambert parameters defined in constants.py
pp = easydict.EasyDict(constants.lambert_proj_params)
mycrs = f"+proj=lcc +lat_0={pp.lat_0} +lon_0={pp.lon_0} +lat_1={pp.lat_1} +lat_2={pp.lat_2} +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"

# Transform from regular lon/lat (EPSG:4326) into the Lambert projection
crstrans = Transformer.from_crs("EPSG:4326", mycrs, always_xy=True)
x, y = crstrans.transform(lon, lat)

xy = np.array([x, y])  # shape = (2, n_lon, n_lat)
np.save("nwp_xy.npy", xy)
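If you prefer not to pull in easydict just for this, pyproj can also build the target CRS directly from the parameter dict (a variant sketch, assuming lambert_proj_params only contains standard PROJ keys):

import numpy as np
from pyproj import CRS, Transformer
from neural_lam import constants

# Build the Lambert CRS straight from the dict of PROJ parameters
target_crs = CRS.from_dict(constants.lambert_proj_params)
crstrans = Transformer.from_crs("EPSG:4326", target_crs, always_xy=True)

x, y = crstrans.transform(lon, lat)  # lon/lat arrays as in the snippet above
np.save("nwp_xy.npy", np.array([x, y]))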

For the g2m and m2g graphs we could use vector quantization (with an initial guess of known topology) or a Kohonen SOM, as suggested by @leifdenby.
Below is an example where I used scipy.cluster.vq.kmeans2 to distribute 15x15 mesh nodes for g2m.
(figure: g2m, the resulting grid-to-mesh connections after k-means clustering)
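For reference, the clustering step itself can be as simple as the sketch below (the grid coordinates here are random placeholders; building the actual g2m edges from the assignment is a separate step):

import numpy as np
from scipy.cluster.vq import kmeans2

# Placeholder grid node coordinates, shape (N, 2) in projection units
rng = np.random.default_rng(0)
grid_xy = rng.uniform(0.0, 1.0e6, size=(50_000, 2))

# Regular 15x15 initial mesh positions ("initial guess with known topology")
mx, my = np.meshgrid(
    np.linspace(grid_xy[:, 0].min(), grid_xy[:, 0].max(), 15),
    np.linspace(grid_xy[:, 1].min(), grid_xy[:, 1].max(), 15),
    indexing="ij",
)
init_mesh = np.stack([mx.ravel(), my.ravel()], axis=1)  # shape (225, 2)

# k-means moves each mesh node to the centroid of its assigned grid points;
# label[i] is the mesh node that grid point i is assigned to (candidate g2m edges)
mesh_xy, label = kmeans2(grid_xy, init_mesh, minit="matrix", seed=0)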
Looking at this, it seems as if the solution is somewhat hexagonal, which points me towards the idea we had about a hexagonal grid. I'll try to come up with some first attempt at this to start the discussion.
Hm, it didn't work out the way I wanted. The topology seems to be broken by the clustering. Maybe we need a SOM...

Starting out with this regular mesh node distribution and topology:
(figure: g2m_init, the regular initial mesh node distribution)
I ended up with this:
(figure: g2m_result, the mesh node distribution after clustering)