There is now a probably easier way to download this kind of data using xarray
. There are examples downloading and ploting variables in the folder notebook
. There is also two new example scripts, get_gfs_xarray.py
and get_gfs_hist_xarray.py
, that download data from the real-time and historical server (see this comment for more information) using xarray. Thanks to @heyerbobby for the first version of the xarray scripts.
If you are looking to download only from the real time server, the repository https://github.com/jagoosw/getgfs contains a more polished and user-friendly version and you should probably use that instead.
These scripts were tested with Python 3.9, but they should work with any Python 3 version. First install Anaconda and then create an enviroment with
conda env create -f environment.yml
Then activate the environment
conda activate get-gfs
Scripts to fetch meteorological data from the GFS model:
get_gfs.py
gets data from the real-time server, which is located at https://nomads.ncep.noaa.gov/dods/ and holds the last 15 days of data.get_gfs_hist.py
gets data from the historical server, which is located at https://www.ncei.noaa.gov/thredds/catalog/model-gfs-004-files-old/catalog.html and holds the last 2 years of data (more information: https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-forcast-system-gfs)
Example for the real time server:
./get_gfs.py -s 1 -r 0.25 -t 0 48 -x -10 10 -y -15 15 -p 0 2 -c example_conf.json 20210217 00
The previous line will download meteorology from the GFS run on 2021-02-17 at 00z:
- Temporal resolution of 1 hour
- Spatial resolution of 0.25º
- Time steps from 0 to 48 (since temporal resolution is 1h, 48 hours from 20210217 at 00)
- Longitudes from -10 to 10
- Latitudes from -15 to 15
- Pressure levels from 0 to 2 (only for variables that have pressure level data)
- Variables in
example_conf.json
Example for the historical server:
./get_gfs_hist.py -t 0 10 -x -10 10 -y -10 10 -c example_conf_hist.json 20191005 00
Note that the historical server:
- Only has 0.5º spatial resolution (the default)
- Only has 3h temporal resolution (the default)
- It downloads the first 10 time steps, which in turn it translates to hours 00-30 (due to temporal resolution of 3 hours)
- Pressure levels and heights are specified for each variable in the configuration file
To build the JSON configuration files for the historical server you can go directly to the server and check the following URL for any day:
The possible values for the height_above_ground
and isobaric
levels can be
obtained running a query directly in the browser, for instance:
Similarly, for the real time server you can get this information by adding the suffix .dds
, .info
and .das
In the URLs you can also see some information about the meteorological variables such us units, minimum, maximum, representation of missing values and so on.
The output of the script is an Pandas dataframe written to an ASCII file, with a
multi-index in the rows (lat, lon) and a multi-index in the columns
(variables-time). It can be read back into Python using pd.read_csv()
.
Apart from the name of the variables, which is different in both servers (even though they refer to the same meteorological variable), there are also other differences between them:
- The real time server stores all the time steps in one file, while in the historical server there is one file for each time step (0, 3, 6, 9, 12,...)
- The real time server has also 0.25º spatial resolution, while the historical server only has 0.5º
- The real time server has a temporal resolution of 1hr and 3hr for 0.25º, while for 0.5º and in the historical server only an step of 3hr is available
- In the real time server the different heights of the variables are stored
in different entries. For instance
tmp2m
,tmp80m
,tmp100m
refer to the temperature at 2, 80 and 100m above ground. In the historical server these variables are a stored in an new dimension of the variable, for exampleTemperature_height_above_ground
. Thus, in the historical server the z-axis (eitherheight_above_ground
orpressure
) has to be set for each variable in the configuration file. In the real time server the pressure levels are controlled using an optional parameter, but they have to be the same for every variable which has them. Variables at different heights are different entries, as mentioned above.