google-deepmind/graphcast

Testing GenCast with Operational Data

Closed this issue · 2 comments

Hey Guys,

How do I go about using or formatting current ERA5 data (Dec 7th, 2024) from CDS to test GenCast?

I noticed that the current pressure-level ERA5 data have fewer variables than the example dataset in the Google bucket.

Hey,

You might find https://github.com/google-research/arco-era5 useful.

The repo contains scripts for converting CDS-downloaded data to Zarr format, and the associated cloud buckets are kept quite up to date.

> I noticed that the current pressure-level ERA5 data have fewer variables than the example dataset in the Google bucket.

I'm not sure I follow what you mean here, could you elaborate?

Best,

Andrew

Hi Andrew,

I think I see my misunderstanding. For example, if I want to run a 15-day forecast from Dec 7th, 2024 using the pre-trained model, I would need to download both reanalysis datasets from CDS, "ERA5 hourly data on single levels from 1940 to present" and "ERA5 hourly data on pressure levels from 1940 to present", and merge them for the period I want to start from.
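A minimal sketch of that merge step, assuming xarray is used. The two small datasets are synthetic stand-ins for the single-level and pressure-level downloads, and the variable and dimension names (`2m_temperature`, `temperature`, `time`, `level`, `lat`, `lon`) follow the example dataset's naming as an assumption; real CDS files may use different names and would need renaming first.

```python
import numpy as np
import xarray as xr

# Shared coordinates for both synthetic datasets (values are placeholders).
time = np.array(["2024-12-07T00:00", "2024-12-07T12:00"], dtype="datetime64[ns]")
lat = np.array([-90.0, 0.0, 90.0])
lon = np.array([0.0, 120.0, 240.0])
level = np.array([500, 850])

# Stand-in for "ERA5 hourly data on single levels": no level dimension.
single_levels = xr.Dataset(
    {"2m_temperature": (("time", "lat", "lon"), np.full((2, 3, 3), 280.0))},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Stand-in for "ERA5 hourly data on pressure levels": has a level dimension.
pressure_levels = xr.Dataset(
    {"temperature": (("time", "level", "lat", "lon"), np.full((2, 2, 3, 3), 250.0))},
    coords={"time": time, "level": level, "lat": lat, "lon": lon},
)

# join="exact" raises an error if the shared coordinates do not match,
# instead of silently padding the result with NaNs.
merged = xr.merge([single_levels, pressure_levels], join="exact")
print(sorted(merged.data_vars))
```

With real downloads, the two `xr.Dataset(...)` constructions would be replaced by opening the downloaded files, for example with `xr.open_dataset`.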

This would explain why I see e.g. 2m_temperature in the Google bucket's NetCDF file but not in the "ERA5 hourly data on pressure levels" dataset, which is the one I initially attempted to download.
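The split can be illustrated with a small check of which variables are still missing after each download. The names below are a partial, illustrative list in the example dataset's naming, not the model's complete input specification, and `missing_variables` is a hypothetical helper.

```python
# Illustrative subsets only; the full list of required inputs should be
# taken from the model's own configuration.
SINGLE_LEVEL_VARS = {
    "2m_temperature",
    "mean_sea_level_pressure",
    "10m_u_component_of_wind",
    "10m_v_component_of_wind",
}
PRESSURE_LEVEL_VARS = {
    "temperature",
    "geopotential",
    "u_component_of_wind",
    "v_component_of_wind",
    "specific_humidity",
    "vertical_velocity",
}


def missing_variables(required, downloaded):
    """Return the required variable names absent from the download, sorted."""
    return sorted(set(required) - set(downloaded))


required = SINGLE_LEVEL_VARS | PRESSURE_LEVEL_VARS

# Downloading only the pressure-level dataset leaves the surface fields missing.
only_pressure = missing_variables(required, PRESSURE_LEVEL_VARS)
# Downloading both datasets covers everything in this illustrative list.
both = missing_variables(required, SINGLE_LEVEL_VARS | PRESSURE_LEVEL_VARS)

print(only_pressure)
print(both)
```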