graphcast missing in dataset

Question

Closed this issue a year ago · 8 comments

akhtarvision commented a year ago

Hi. Thanks for the great initiative.

I see that the GraphCast dataset folder is missing from the bucket "weatherbench2/datasets". @shoyer @raspstephan

Answer 1 · 2024-01-02T12:35:15.000Z

Hey, we currently don't have GraphCast forecasts on the public cloud bucket. Hopefully this will change soon.

Answer 2 · 2024-01-02T12:49:35.000Z

Okay, thanks for the information. I tried to get predictions from the GraphCast model, but there seem to be some key mismatches that I have been trying to resolve. If you have any resource that produces the exact format WeatherBench requires, do let me know. Thanks.

Answer 3 · 2024-01-05T13:52:28.000Z

Could you be more specific about those mismatches? Maybe I can help then.

Answer 4 · 2024-01-09T09:38:28.000Z

KeyError: "'init_time' is not a valid dimension or coordinate"

The predictions I got did not have a time coordinate, only time deltas. So I had to insert a coordinate to keep the evaluation running, but the error still appeared after inserting it.

Complete log:

Traceback (most recent call last):
  File "////miniconda3/envs/weatherv2/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "////miniconda3/envs/weatherv2/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "////.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "////.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "////.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "////.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "////.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "////.vscode/extensions/ms-python.python-2022.16.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "////fcfa0911-ffd4-49d3-9003-efe40cfa8f8b/wbench/weatherbench_eval.py", line 312, in <module>
    evaluate_in_memory(data_config, eval_configs) # Takes around 5 minutes
  File "////miniconda3/envs/weatherv2/lib/python3.10/site-packages/weatherbench2/evaluation.py", line 496, in evaluate_in_memory
    _evaluate_all_metrics(eval_name, eval_config, data_config)
  File "////miniconda3/envs/weatherv2/lib/python3.10/site-packages/weatherbench2/evaluation.py", line 430, in _evaluate_all_metrics
    forecast, truth, climatology = open_forecast_and_truth_datasets(
  File "////miniconda3/envs/weatherv2/lib/python3.10/site-packages/weatherbench2/evaluation.py", line 328, in open_forecast_and_truth_datasets
    forecast = _impose_data_selection(
  File "////miniconda3/envs/weatherv2/lib/python3.10/site-packages/weatherbench2/evaluation.py", line 152, in _impose_data_selection
    dataset = dataset.sel({time_dim: selection.time_slice})
  File "////miniconda3/envs/weatherv2/lib/python3.10/site-packages/xarray/core/dataset.py", line 2794, in sel
    query_results = map_index_queries(
  File "////miniconda3/envs/weatherv2/lib/python3.10/site-packages/xarray/core/indexing.py", line 186, in map_index_queries
    grouped_indexers = group_indexers_by_index(obj, indexers, options)
  File "////miniconda3/envs/weatherv2/lib/python3.10/site-packages/xarray/core/indexing.py", line 150, in group_indexers_by_index
    raise KeyError(f"{key!r} is not a valid dimension or coordinate")
KeyError: "'init_time' is not a valid dimension or coordinate"

Answer 5 · 2024-01-11T20:25:25.000Z

Can you give some pointers for getting predictions from GraphCast or any other weather prediction model and converting them into the exact format required for WeatherBench 2? A quick reply would be great. @raspstephan

Answer 6 · 2024-01-12T08:17:05.000Z

Can you share what the dataset looks like and what command you ran that gave you the error?

Answer 7 · 2024-01-12T18:52:35.000Z

And this command gives the error: "evaluate_in_memory(data_config, eval_configs) # Takes around 5 minutes"

The issue seems to be the exact format, like the one you use for all the other methods in the bucket.
Running the evaluation on data in the required format would solve the issue. I have seen that you have some GraphCast results (in the results folder), but they cannot be restricted to particular regions the way the selection config in the WeatherBench code allows.

Answer 8 · 2024-01-15T09:08:29.000Z

Hey, your forecast is missing a dimension. The dataset only has a timedelta; it doesn't specify when the forecast was initialized. Check the datasets here https://weatherbench2.readthedocs.io/en/latest/data-guide.html to see what the standard format for forecast files is.
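
To illustrate the missing dimension, here is a minimal xarray sketch on a toy dataset. It is not taken from the WeatherBench 2 code. The helper name `add_init_time` is hypothetical, and the dimension names `time` (initialization time) and `prediction_timedelta` (lead time) are assumptions that should be checked against the data guide linked above. The toy input mimics a forecast whose only time-like axis holds timedeltas.

```python
import numpy as np
import xarray as xr


def add_init_time(forecast: xr.Dataset, init_time: np.datetime64) -> xr.Dataset:
    """Hypothetical helper: attach an initialization-time dimension.

    Assumes the lead times are stored as timedeltas under the name "time".
    """
    if np.issubdtype(forecast["time"].dtype, np.timedelta64):
        # The timedelta axis is really a lead time, so rename it
        # (target name is an assumption, see the data guide).
        forecast = forecast.rename({"time": "prediction_timedelta"})
    # Add a new length-1 "time" dimension holding the initialization time.
    return forecast.expand_dims(time=[init_time])


# Toy forecast with two lead times on a tiny grid (made-up values).
lead_times = np.array([6, 12], dtype="timedelta64[h]").astype("timedelta64[ns]")
toy = xr.Dataset(
    {
        "2m_temperature": (
            ("time", "latitude", "longitude"),
            np.zeros((2, 3, 4), dtype="float32"),
        )
    },
    coords={
        "time": lead_times,
        "latitude": [-10.0, 0.0, 10.0],
        "longitude": [0.0, 90.0, 180.0, 270.0],
    },
)

fixed = add_init_time(toy, np.datetime64("2020-01-01T00:00:00", "ns"))
print(fixed)
```

With an initialization-time dimension in place, a selection such as `fixed.sel(time=slice("2020-01-01", "2020-01-02"))` has a datetime index to work on. Forecasts from several initializations could then be combined with `xr.concat(..., dim="time")`.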

Regarding out of memory: Depending on how much RAM you have, evaluation can be hard to do in memory. In this case, you probably want to look at distributed evaluation: https://weatherbench2.readthedocs.io/en/latest/beam-in-the-cloud.html or choose a smaller time slice.
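
As a rough illustration of the "smaller time slice" option, the sketch below uses plain xarray on a toy dataset to compare the in-memory size before and after restricting the initialization times. The dataset, its dimension names, and the dates are made up for the example. In WeatherBench 2 itself the slice is passed through the selection config (the traceback above shows `selection.time_slice` being applied), so treat this only as a way to estimate how much data a given slice would load.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy forecast: 8 daily initializations, 2 lead times, tiny grid (made-up values).
init_times = pd.date_range("2020-01-01", periods=8, freq="D")
lead_times = pd.to_timedelta([6, 12], unit="h")
full = xr.Dataset(
    {
        "2m_temperature": (
            ("time", "prediction_timedelta", "latitude", "longitude"),
            np.zeros((8, 2, 3, 4), dtype="float32"),
        )
    },
    coords={
        "time": init_times,
        "prediction_timedelta": lead_times,
        "latitude": [-10.0, 0.0, 10.0],
        "longitude": [0.0, 90.0, 180.0, 270.0],
    },
)

# Keep only the first four initialization dates (label slices are inclusive).
subset = full.sel(time=slice("2020-01-01", "2020-01-04"))

# nbytes reports the size of the arrays the dataset would occupy in memory.
print(f"full: {full.nbytes} bytes, subset: {subset.nbytes} bytes")
```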