Variable names and weights as part of static dataset files

Question

Closed this issue 4 months ago · 1 comment

Currently the variables in the dataset are listed in constants.py. This makes it hard to use the code with other datasets, since the variable list is tied to the code rather than to the data.

Create a file variables.json in data/my_dataset/static that describes all variables. This includes:

Weather state variables (e.g. u_65)
Forcing variables for the full grid
Batch-static forcing variables (static during one forecast, but changing throughout the dataset; currently this is only open water)

All of these should be listed in order with names. For the weather state variables, their weighting (as in parameter_weights.npy currently) should also be listed with them. We can then remove the lines https://github.com/joeloskarsson/neural-lam/blob/89a4c63370201c9ea1a5f04d4cf1e5e75b7cc83e/create_parameter_weights.py#L26-L31 that generate this weighting file. It is better to let this be something that is set manually when preparing a dataset.
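
As a rough sketch of what such a file might contain, the snippet below builds one in Python and writes it to the static directory. The key names (`state`, `forcing`, `batch_static`, `weight`), the helper `write_variables_file`, every variable name other than `u_65`, and all weight values are assumptions made for illustration; none of this is an agreed format.

```python
import json
from pathlib import Path

# Hypothetical layout: three ordered lists, one per variable group.
# Key names, most variable names and all weights are placeholders.
variables = {
    "state": [
        {"name": "u_65", "weight": 0.1},  # name taken from the issue text
        {"name": "example_state_var", "weight": 0.2},  # hypothetical
    ],
    "forcing": [
        {"name": "example_forcing_var"},  # hypothetical
    ],
    "batch_static": [
        {"name": "open_water"},  # hypothetical spelling of the open water field
    ],
}


def write_variables_file(static_dir, description):
    """Write the description as variables.json inside static_dir."""
    path = Path(static_dir) / "variables.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(description, indent=2))
    return path
```

Calling `write_variables_file("data/my_dataset/static", variables)` would produce the file by hand-edited values rather than by a generation script, which matches the idea of setting the weights manually when preparing a dataset.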

Such a variables.json file could then be loaded into a VariableDescription object and used in the models. The variable dimensions https://github.com/joeloskarsson/neural-lam/blob/89a4c63370201c9ea1a5f04d4cf1e5e75b7cc83e/neural_lam/models/ar_model.py#L22-L24 should then be read from this object rather than hard-coded in a model definition.
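
A minimal sketch of how a VariableDescription object could look is given here. It assumes a JSON file with three ordered lists under the hypothetical keys `state`, `forcing` and `batch_static`, where each entry has a `name` and state entries also carry a `weight`. The attribute and method names are likewise illustrative and not taken from the repository.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VariableDescription:
    """Ordered variable names and state weights (hypothetical schema)."""

    state_names: List[str]
    state_weights: List[float]
    forcing_names: List[str]
    batch_static_names: List[str]

    @classmethod
    def from_dict(cls, raw):
        # List order in the file defines the feature order in the data.
        return cls(
            state_names=[v["name"] for v in raw["state"]],
            state_weights=[float(v["weight"]) for v in raw["state"]],
            forcing_names=[v["name"] for v in raw["forcing"]],
            batch_static_names=[v["name"] for v in raw["batch_static"]],
        )

    @classmethod
    def from_json(cls, path):
        with open(path) as f:
            return cls.from_dict(json.load(f))

    # Dimensions a model could read instead of hard-coding them.
    @property
    def num_state(self):
        return len(self.state_names)

    @property
    def num_forcing(self):
        return len(self.forcing_names)

    @property
    def num_batch_static(self):
        return len(self.batch_static_names)
```

A model could then take such an object in its constructor and size its input and output layers from `num_state`, `num_forcing` and `num_batch_static`, so that switching datasets only requires a different variables.json.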

Answer 1 · 2024-05-03T08:32:41.000Z

Superseded by #23