R interface to many NOAA data APIs http://ropensci.org/tutorials/rnoaa_tutorial.html (now with moar crunchy meteo goodness!)

rnoaa

rnoaa is an R interface to many NOAA data sources. We don't cover all of them, but we include many commonly used sources, and we are always adding new ones. We focus on easy-to-use interfaces for getting NOAA data, and on giving back data in formats that are easy to use downstream. We currently don't do much in the way of plots or analysis.

Data sources in rnoaa

Help

There is a tutorial on the rOpenSci website, and there are many tutorials in the package itself, available in your R session, or on CRAN. The tutorials:

  • NOAA Buoy vignette
  • NOAA National Climatic Data Center (NCDC) vignette (examples)
  • NOAA NCDC attributes vignette
  • NOAA NCDC workflow vignette
  • Sea ice vignette
  • Severe Weather Data Inventory (SWDI) vignette
  • Historical Observing Metadata Repository (HOMR) vignette
  • Storms (IBTrACS) vignette

netcdf data

Functions to work with buoy data use netcdf files. You'll need the ncdf package for those functions, and those only. ncdf is in Suggests in this package, meaning you only need ncdf if you are using the buoy functions. You'll get an informative error telling you to install ncdf if you don't have it and you try to use the buoy functions. Installation of ncdf should be straightforward on Mac and Windows, but on Linux you may have issues. See http://cran.r-project.org/web/packages/ncdf/INSTALL
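
The optional-dependency behavior described above can be sketched in a few lines of base R. This is a minimal illustration of the general pattern, not rnoaa's actual implementation; the helper name require_suggested is hypothetical.

```r
# Hypothetical helper: stop with an informative message when an optional
# (Suggests) package is not installed.
require_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Please install the '", pkg, "' package: install.packages('", pkg, "')",
         call. = FALSE)
  }
  invisible(TRUE)
}

# A buoy-style function would run this check before touching a netcdf file.
# tryCatch() is used here only so the example runs whether or not ncdf is installed.
res <- tryCatch(require_suggested("ncdf"),
                error = function(e) conditionMessage(e))
res
```

If ncdf is installed, res is TRUE; otherwise it holds the error message asking you to install it.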

NOAA NCDC Datasets

There are many NOAA NCDC datasets. All data sources work, except NEXRAD2 and NEXRAD3, for an unknown reason. This relates to ncdc_*() functions only.

Dataset      Description               Start date   End date
ANNUAL       Annual Summaries          1831-02-01   2013-11-01
GHCND        Daily Summaries           1763-01-01   2014-03-15
GHCNDMS      Monthly Summaries         1763-01-01   2014-01-01
NORMAL_ANN   Normals Annual/Seasonal   2010-01-01   2010-01-01
NORMAL_DLY   Normals Daily             2010-01-01   2010-12-31
NORMAL_HLY   Normals Hourly            2010-01-01   2010-12-31
NORMAL_MLY   Normals Monthly           2010-01-01   2010-12-01
PRECIP_15    Precipitation 15 Minute   1970-05-12   2013-03-01
PRECIP_HLY   Precipitation Hourly      1900-01-01   2013-03-01
NEXRAD2      Nexrad Level II           1991-06-05   2014-03-14
NEXRAD3      Nexrad Level III          1994-05-20   2014-03-11

NOAA NCDC Attributes

Each NOAA dataset has a different set of attributes that you can potentially get back in your search. See http://www.ncdc.noaa.gov/cdo-web/datasets for detailed info on each dataset. We provide some information on the attributes in this package; see the attributes vignette to find out more.

NCDC Authentication

You'll need an API key (essentially a password) to use the NOAA NCDC functions in this package (those starting with ncdc). Go to http://www.ncdc.noaa.gov/cdo-web/token to get one. You can't use the NCDC functions without an API key.

Once you obtain a key, there are a few ways to use it.

a) Pass it inline with each function call (somewhat cumbersome)

ncdc(datasetid = 'PRECIP_HLY', locationid = 'ZIP:28801', datatypeid = 'HPCP', limit = 5, token =  "YOUR_TOKEN")

b) Alternatively, you might find it easier to set the key as an option, either by adding this line to the top of a script or somewhere in your .Rprofile

options(noaakey = "KEY_EMAILED_TO_YOU")

c) You can also store it permanently in your .Rprofile file.
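
The lookup order implied by the options above (inline argument first, then the noaakey option) can be sketched with base R alone. This is a hedged illustration rather than the package's own code: the helper name resolve_token and the NOAA_KEY environment variable fallback are assumptions made for the example.

```r
# Hypothetical helper: resolve the NCDC token from, in order,
# an explicit argument, the `noaakey` option, or an environment variable.
# The environment variable name NOAA_KEY is an assumption for illustration.
resolve_token <- function(token = NULL) {
  if (!is.null(token) && nzchar(token)) return(token)
  key <- getOption("noaakey", default = "")
  if (!nzchar(key)) key <- Sys.getenv("NOAA_KEY", unset = "")
  if (!nzchar(key)) {
    stop("No API token found; set options(noaakey = ...) or pass token = ...",
         call. = FALSE)
  }
  key
}

options(noaakey = "KEY_EMAILED_TO_YOU")
resolve_token()          # falls back to the option
resolve_token("OTHER")   # an explicit argument wins
```

Failing early with a clear message is usually friendlier than letting the API reject an unauthenticated request.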

Installation

GDAL

You'll need GDAL installed first. You may want to use rgdal >= 0.9-1, since that version or later can also read TopoJSON format files, which aren't required here but may be useful. Install GDAL using the method appropriate for your platform.

Then, when you install the R package rgdal (rgeos also requires GDAL), you'll most likely need to specify where your gdal-config file is on your machine, as well as a few other things. I have an OS X Mavericks machine, and this works for me (there's no binary for Mavericks, so install the source version):

install.packages("http://cran.r-project.org/src/contrib/rgdal_0.9-1.tar.gz", repos = NULL, type="source", configure.args = "--with-gdal-config=/Library/Frameworks/GDAL.framework/Versions/1.10/unix/bin/gdal-config --with-proj-include=/Library/Frameworks/PROJ.framework/unix/include --with-proj-lib=/Library/Frameworks/PROJ.framework/unix/lib")

The rest of the installation should be easy. If not, let us know.

Stable version from CRAN

install.packages("rnoaa")

or development version from GitHub

devtools::install_github("ropensci/rnoaa")

Load rnoaa

library('rnoaa')

NCDC v2 API data

Fetch list of city locations in descending order

ncdc_locs(locationcategoryid='CITY', sortfield='name', sortorder='desc')
#> $meta
#> $meta$totalCount
#> [1] 1980
#> 
#> $meta$pageCount
#> [1] 25
#> 
#> $meta$offset
#> [1] 1
#> 
#> 
#> $data
#> Source: local data frame [25 x 5]
#> 
#>       mindate    maxdate             name datacoverage            id
#>         (chr)      (chr)            (chr)        (dbl)         (chr)
#> 1  1892-08-01 2015-11-30       Zwolle, NL       1.0000 CITY:NL000012
#> 2  1901-01-01 2016-01-07       Zurich, SZ       1.0000 CITY:SZ000007
#> 3  1957-07-01 2016-01-07    Zonguldak, TU       0.8632 CITY:TU000057
#> 4  1906-01-01 2016-01-07       Zinder, NG       0.9023 CITY:NG000004
#> 5  1973-01-01 2016-01-16   Ziguinchor, SG       1.0000 CITY:SG000004
#> 6  1938-01-01 2016-01-07    Zhytomyra, UP       0.9722 CITY:UP000025
#> 7  1948-03-01 2016-01-07   Zhezkazgan, KZ       0.9299 CITY:KZ000017
#> 8  1951-01-01 2016-01-06    Zhengzhou, CH       1.0000 CITY:CH000045
#> 9  1941-01-01 2015-11-12     Zaragoza, SP       1.0000 CITY:SP000021
#> 10 1936-01-01 2009-06-17 Zaporiyhzhya, UP       0.9739 CITY:UP000024
#> ..        ...        ...              ...          ...           ...
#> 
#> attr(,"class")
#> [1] "ncdc_locs"
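
The $meta element above reports 1980 matching cities, of which only 25 are returned per request. A minimal sketch of how the remaining pages could be addressed is shown here using base R only. Passing each value back through an offset argument of ncdc_locs() is an assumption about the function's interface, so check ?ncdc_locs before relying on it.

```r
total_count <- 1980   # $meta$totalCount from the output above
page_size   <- 25     # $meta$pageCount from the output above, treated as the page size

# First record of each page, using 1-based offsets as in $meta$offset
offsets <- seq(from = 1, to = total_count, by = page_size)

length(offsets)   # 80 requests are needed to cover every record
range(offsets)    # the first page starts at 1, the last at 1976

# Assumed usage (requires a token and network access, so it is left commented out):
# pages <- lapply(offsets, function(o)
#   ncdc_locs(locationcategoryid = 'CITY', offset = o))
```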

Get info on a station by specifying a dataset, location, and station

ncdc_stations(datasetid='GHCND', locationid='FIPS:12017', stationid='GHCND:USC00084289')
#> $meta
#> NULL
#> 
#> $data
#>   elevation    mindate    maxdate latitude                  name
#> 1      12.2 1899-02-01 2016-01-16  28.8029 INVERNESS 3 SE, FL US
#>   datacoverage                id elevationUnit longitude
#> 1            1 GHCND:USC00084289        METERS  -82.3126
#> 
#> attr(,"class")
#> [1] "ncdc_stations"

Search for data

out <- ncdc(datasetid='NORMAL_DLY', stationid='GHCND:USW00014895', datatypeid='dly-tmax-normal', startdate = '2010-05-01', enddate = '2010-05-10')

See a data.frame

head( out$data )
#> Source: local data frame [6 x 5]
#> 
#>                  date        datatype           station value  fl_c
#>                 (chr)           (chr)             (chr) (int) (chr)
#> 1 2010-05-01T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   652     S
#> 2 2010-05-02T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   655     S
#> 3 2010-05-03T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   658     S
#> 4 2010-05-04T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   661     S
#> 5 2010-05-05T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   663     S
#> 6 2010-05-06T00:00:00 DLY-TMAX-NORMAL GHCND:USW00014895   666     S
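
The date column above is returned as character with a time component, and value as an integer. The sketch below rebuilds the first three rows by hand, so it runs without a token, and converts both columns. Dividing by 10 assumes the normals are reported in tenths of a degree Fahrenheit; that is an assumption to verify against the NCDC documentation, not something this README states. The column name tmax_f is made up for the example.

```r
# Hand-built copy of the first three rows shown above
dat <- data.frame(
  date  = c("2010-05-01T00:00:00", "2010-05-02T00:00:00", "2010-05-03T00:00:00"),
  value = c(652L, 655L, 658L),
  stringsAsFactors = FALSE
)

# Keep only the YYYY-MM-DD part and convert it to a Date
dat$date <- as.Date(substr(dat$date, 1, 10))

# ASSUMPTION: values are tenths of a degree Fahrenheit
dat$tmax_f <- dat$value / 10

dat
```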

Plot data, super simple, but it's a start

out <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-05-01', enddate = '2010-10-31', limit=500)
ncdc_plot(out, breaks="1 month", dateformat="%d/%m")

More plotting

You can pass the output of several calls to the ncdc function into the ncdc_plot function.

out1 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-03-01', enddate = '2010-05-31', limit=500)
out2 <- ncdc(datasetid='GHCND', stationid='GHCND:USW00014895', datatypeid='PRCP', startdate = '2010-09-01', enddate = '2010-10-31', limit=500)
ncdc_plot(out1, out2, breaks="45 days")

Get table of all datasets

ncdc_datasets()
#> $meta
#> $meta$offset
#> [1] 1
#> 
#> $meta$count
#> [1] 11
#> 
#> $meta$limit
#> [1] 25
#> 
#> 
#> $data
#> Source: local data frame [11 x 6]
#> 
#>                     uid    mindate    maxdate                      name
#>                   (chr)      (chr)      (chr)                     (chr)
#> 1  gov.noaa.ncdc:C00040 1831-02-01 2015-06-01          Annual Summaries
#> 2  gov.noaa.ncdc:C00861 1763-01-01 2016-01-17           Daily Summaries
#> 3  gov.noaa.ncdc:C00841 1763-01-01 2015-12-01         Monthly Summaries
#> 4  gov.noaa.ncdc:C00345 1991-06-05 2016-01-20  Weather Radar (Level II)
#> 5  gov.noaa.ncdc:C00708 1994-05-20 2016-01-17 Weather Radar (Level III)
#> 6  gov.noaa.ncdc:C00821 2010-01-01 2010-01-01   Normals Annual/Seasonal
#> 7  gov.noaa.ncdc:C00823 2010-01-01 2010-12-31             Normals Daily
#> 8  gov.noaa.ncdc:C00824 2010-01-01 2010-12-31            Normals Hourly
#> 9  gov.noaa.ncdc:C00822 2010-01-01 2010-12-01           Normals Monthly
#> 10 gov.noaa.ncdc:C00505 1970-05-12 2014-01-01   Precipitation 15 Minute
#> 11 gov.noaa.ncdc:C00313 1900-01-01 2014-01-01      Precipitation Hourly
#> Variables not shown: datacoverage (dbl), id (chr)
#> 
#> attr(,"class")
#> [1] "ncdc_datasets"

Get data category data and metadata

ncdc_datacats(locationid = 'CITY:US390029')
#> $meta
#> $meta$totalCount
#> [1] 37
#> 
#> $meta$pageCount
#> [1] 25
#> 
#> $meta$offset
#> [1] 1
#> 
#> 
#> $data
#> Source: local data frame [25 x 2]
#> 
#>                     name      id
#>                    (chr)   (chr)
#> 1    Annual Agricultural  ANNAGR
#> 2     Annual Degree Days   ANNDD
#> 3   Annual Precipitation ANNPRCP
#> 4     Annual Temperature ANNTEMP
#> 5    Autumn Agricultural   AUAGR
#> 6     Autumn Degree Days    AUDD
#> 7   Autumn Precipitation  AUPRCP
#> 8     Autumn Temperature  AUTEMP
#> 9               Computed    COMP
#> 10 Computed Agricultural COMPAGR
#> ..                   ...     ...
#> 
#> attr(,"class")
#> [1] "ncdc_datacats"

Tornado data

The function tornadoes() simply downloads all the data, so the call takes a while, but once it's done the data is fun to play with.

shp <- tornadoes()
#> OGR data source with driver: ESRI Shapefile 
#> Source: "/Users/sacmac/.rnoaa/tornadoes/tornadoes", layer: "tornado"
#> with 57988 features and 21 fields
#> Feature type: wkbLineString with 2 dimensions
library('sp')
plot(shp)

HOMR metadata

In this example, search for metadata for a single station ID

homr(qid = 'COOP:046742')
#> $`20002078`
#> $`20002078`$id
#> [1] "20002078"
#> 
#> $`20002078`$head
#>                  preferredName latitude_dec longitude_dec precision
#> 1 PASO ROBLES MUNICIPAL AP, CA      35.6697     -120.6283    DDMMSS
#>             por.beginDate por.endDate
#> 1 1949-10-05T00:00:00.000     Present
#> 
#> $`20002078`$namez
#> Source: local data frame [3 x 2]
#> 
#>                         name  nameType
#>                        (chr)     (chr)
#> 1   PASO ROBLES MUNICIPAL AP      COOP
#> 2   PASO ROBLES MUNICIPAL AP PRINCIPAL
#> 3 PASO ROBLES MUNICIPAL ARPT       PUB
#> 
#> $`20002078`$identifiers
...

Storm data

Get storm data for the year 2010

storm_data(year = 2010)
#> <NOAA Storm Data>
#> Size: 2855 X 195
#> 
#>       serial_num season num basin sub_basin name            iso_time
#> 1  2009317S10073   2010   1    SI        MM ANJA 2009-11-13 06:00:00
#> 2  2009317S10073   2010   1    SI        MM ANJA 2009-11-13 12:00:00
#> 3  2009317S10073   2010   1    SI        MM ANJA 2009-11-13 18:00:00
#> 4  2009317S10073   2010   1    SI        MM ANJA 2009-11-14 00:00:00
#> 5  2009317S10073   2010   1    SI        MM ANJA 2009-11-14 06:00:00
#> 6  2009317S10073   2010   1    SI        MM ANJA 2009-11-14 12:00:00
#> 7  2009317S10073   2010   1    SI        MM ANJA 2009-11-14 18:00:00
#> 8  2009317S10073   2010   1    SI        MM ANJA 2009-11-15 00:00:00
#> 9  2009317S10073   2010   1    SI        MM ANJA 2009-11-15 06:00:00
#> 10 2009317S10073   2010   1    SI        MM ANJA 2009-11-15 12:00:00
#> ..           ...    ... ...   ...       ...  ...                 ...
#> Variables not shown: nature (chr), latitude (dbl), longitude (dbl),
#>      wind.wmo. (dbl), pres.wmo. (dbl), center (chr), wind.wmo..percentile
#>      (dbl), pres.wmo..percentile (dbl), track_type (chr),
#>      latitude_for_mapping (dbl), longitude_for_mapping (dbl),
#>      current.basin (chr), hurdat_atl_lat (dbl), hurdat_atl_lon (dbl),
...

GEFS data

Get forecast for a certain variable.

res <- gefs("Total_precipitation_surface_6_Hour_Accumulation_ens", lat = 46.28125, lon = -116.2188)
head(res$data)
#>   lon lat ens time2 Total_precipitation_surface_6_Hour_Accumulation_ens
#> 1 244  46   0     6                                                   0
#> 2 244  46   1    12                                                   0
#> 3 244  46   2    18                                                   0
#> 4 244  46   3    24                                                   0
#> 5 244  46   4    30                                                   0
#> 6 244  46   5    36                                                   0
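
The output above reports lon 244 and lat 46 even though the request used lon = -116.2188 and lat = 46.28125. One plausible reading is that the requested point is snapped to a one-degree grid whose longitudes run from 0 to 360. The sketch below reproduces the numbers under that assumption; the grid details are inferred from the output, not taken from the rnoaa documentation, and to_grid is a hypothetical helper.

```r
# Hypothetical helper: snap a point to a 1-degree grid with 0-360 longitudes
to_grid <- function(lat, lon) {
  c(lon = round(lon %% 360),   # wrap negative (western) longitudes into 0-360
    lat = round(lat))
}

to_grid(lat = 46.28125, lon = -116.2188)   # lon 244, lat 46
```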

Argo buoys data

There is a suite of functions for Argo data. A few examples:

# Spatial search - by bounding box
argo_search("coord", box = c(-40, 35, 3, 2))

# Time based search
argo_search("coord", yearmin = 2007, yearmax = 2009)

# Data quality based search
argo_search("coord", pres_qc = "A", temp_qc = "A")

# Search on partial float id number
argo_qwmo(qwmo = 49)

# Get data
argo(dac = "meds", id = 4900881, cycle = 127, dtype = "D")

CO-OPS data

Get daily mean water level data at Fairport, OH (9063053)

coops_search(station_name = 9063053, begin_date = 20150927, end_date = 20150928,
             product = "daily_mean", datum = "stnd", time_zone = "lst")
#> $metadata
#> $metadata$id
#> [1] "9063053"
#> 
#> $metadata$name
#> [1] "Fairport"
#> 
#> $metadata$lat
#> [1] "41.7598"
#> 
#> $metadata$lon
#> [1] "-81.2811"
#> 
#> 
#> $data
#>            t       v   f
#> 1 2015-09-27 174.480 0,0
#> 2 2015-09-28 174.472 0,0
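
The call above passes dates as YYYYMMDD numbers. If your dates live in Date objects, a small conversion step could look like the following sketch; to_yyyymmdd is a hypothetical helper, not part of rnoaa.

```r
# Hypothetical helper: turn a Date (or "YYYY-MM-DD" string) into a YYYYMMDD integer
to_yyyymmdd <- function(x) as.integer(format(as.Date(x), "%Y%m%d"))

begin <- to_yyyymmdd("2015-09-27")
end   <- to_yyyymmdd(as.Date("2015-09-27") + 1)   # one day later

c(begin, end)   # 20150927 20150928
```

The two values can then be supplied as begin_date and end_date, as in the coops_search() call above.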

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for rnoaa in R by running citation(package = 'rnoaa')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
