
sensor-web-harvester

New: custom networks/offerings are supported; read more in the "Create Custom Networks for Sources or Stations" section near the bottom of this page.

sensor-web-harvester is a Scala project that harvests sensor data from web sources. The data is then pushed to a Sensor Observation Service (SOS) using the sos-injection module (SosInjector), a project that wraps an SOS and provides Java classes to enter stations, sensors, and observations into it.

sensor-web-harvester is used to fill an SOS with observations from many well-known sensor sources (such as NOAA and NERRS). It pulls sensor observation values from the sources' stations, formats the data with the sos-injector, and places it into the user's SOS. The source stations are filtered by a chosen bounding box area.

Observations are currently pulled from the following sources:

  • GLOS
  • HADS
  • NDBC
  • NERRS
  • NOAA NOS CO-OPS
  • NOAA Weather
  • RAWS
  • SnoTel
  • STORET
  • USGS Water

This project uses an H2 metadata database to store station information from the sources. The metadata database information is used to retrieve observations from the stations' sources. The database is file-based and is generated automatically if it does not already exist.

This project can be executed by running the pre-built jar with the command line (see "Running the SOS Injector") or by writing custom Java code (see "Writing Custom Java Code").

Installation

This project can be used on either a Windows or Linux computer. An Apple computer is expected to work as well, but this has not been tested.

The following are the requirements to run this project:

Configuring sensor-web-harvester

The pre-built sensor-web-harvester.jar and example_sos.properties can be downloaded from the GitHub releases page.

The command line takes a properties file that contains all of the variables needed to perform an SOS update. The properties file requires the following variables:

# The URL where the H2 metadata database should be stored
# Note: if running in Docker, use jdbc:h2:/srv/swhdb/db
database_url = jdbc:h2:/usr/local/sensor_web_harvester

# The URL to the SOS being used.
sos_url = http://localhost:8080/i52n-sos

# The publisher's country
publisher_country = USA

# The publisher's email address
publisher_email = publisher@example.com

# The web address of the publisher
publisher_web_address = http://example.org

# The name of the publishing organization
publisher_name = RA

# The northernmost latitude of the bounding box
north_lat = 50.0

# The southernmost latitude of the bounding box
south_lat = 40.0

# The westernmost longitude of the bounding box
west_lon = -93.0

# The easternmost longitude of the bounding box
east_lon = -75.0

# The network root for the default network in the SOS that contains all the stations.
# This network is different for each SOS. For example, for AOOS the default network is
# urn:ioos:network:aoos:all, so the "network_root_id" is "all" and the "network_root_source_id" is "aoos".
network_root_id = all
network_root_source_id = aoos

# Semicolon-separated list of sources to be updated (optional: defaults to 'all' if not set in the properties file)
# sources = all - this will operate on all known sources
# sources = nerrs;storet;glos - this will operate on the nerrs, storet and glos sources
# Accepted values: all, glos, hads, ndbc, nerrs, noaa_nos_coops, noaaweather, raws, snotel, storet, usgswater
sources = all

An example properties file named example_sos.properties is also provided on the GitHub releases page.

Running with Docker


See below for explanations of the metadata, updatesos, and writeiso modes.

The following is an example of a Docker execution (Docker 1.9.0+ required). You must create your sos.properties config file first. You do not have to install any dependencies other than Docker.

docker run --rm -v swhdb:/srv/swhdb -v $(pwd)/sos.properties:/tmp/sos.properties \
  ioos/sensor-web-harvester -updatesos /tmp/sos.properties
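
The same pattern works for the other modes; for example, a metadata harvest would use the same image, volumes, and properties file, swapping only the mode flag:

docker run --rm -v swhdb:/srv/swhdb -v $(pwd)/sos.properties:/tmp/sos.properties \
  ioos/sensor-web-harvester -metadata /tmp/sos.properties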

Running sensor-web-harvester


Note: Running these processes can take a long time (hours) as information is downloaded and extracted from many sources.

The sensor-web-harvester has three modes:

metadata

This mode harvests from the source(s) defined in the properties file and updates the metadata database. This command should be run conservatively (approx. 3 times a week) since the sources’ stations do not change often and this command is taxing on the sources’ servers.

java -jar sensor-web-harvester.jar -metadata [path to properties file]

writeiso

This mode writes ISO 19115-2 metadata files based on the data in the metadata database.

java -jar sensor-web-harvester.jar -writeiso [path to properties file]

updatesos

This mode downloads data from the sources and injects the data into the 52°North (52N) SOS instance specified in the properties file. Do not call this command more than once per hour; most of the source data is hourly, and more frequent calls are taxing on the sources' servers.

Example:

java -jar sensor-web-harvester.jar -updatesos [path to properties file]

Writing Custom Java Code

This is example code demonstrating how to update the metadata database and the SOS from within custom Java code.

// Southern California Bounding Box
Location southWestCorner = new Location(32.0, -123.0);
Location northEastCorner = new Location(35.0, -113.0);
BoundingBox boundingBox = new BoundingBox(southWestCorner, northEastCorner);

String databaseUrl = "jdbc:postgresql://localhost:5432/sensor";
String databaseUsername = "sensoruser";
String databasePassword = "sensor";
String sosUrl = "http://localhost:8080/sos/sos";

MetadataDatabaseManager metadataManager = new MetadataDatabaseManager(databaseUrl, 
  databaseUsername, databasePassword, boundingBox);

// Updates the local metadata database with station information
// This call should be made conservatively (approx. 3 times a week) since the 
// sources’ stations do not change often and this call is taxing on the sources’ servers.
metadataManager.update();

// Information about the group publishing this data on the SOS. 
PublisherInfoImp publisherInfo = new PublisherInfoImp();
publisherInfo.setCountry("USA");
publisherInfo.setEmail("publisher@domain.com");
publisherInfo.setName("IOOS");
publisherInfo.setWebAddress("http://www.ioos.gov/");

SosNetworkImp rootNetwork = new SosNetworkImp();
rootNetwork.setId("all");
rootNetwork.setSourceId("aoos");

SosSourcesManager sosManager = new SosSourcesManager(databaseUrl, 
  databaseUsername, databasePassword, sosUrl, publisherInfo, rootNetwork);
  
// Updates the SOS with data pulled from the source sites. 
// This uses the metadata database
// Most of the data is hourly. The data should be pulled conservatively (approx. hourly) 
// since the observations do not change often and this action is taxing on the sources’ servers.
sosManager.updateSos();

Create Custom Networks for Sources or Stations

To create custom networks/offerings, one must adjust three tables (network, network_source, network_station) in the metadata database. All new networks need to be created and associated with sources or stations before those stations are submitted to the SOS; if a station has already been created in the SOS, it cannot later be associated with a network.

First, each custom network needs to be added to the network table. The tag and source_tag columns are the main columns that need to be filled in for a new network. The tag and source_tag are combined (urn:ioos:network:[source_tag]:[tag]) to create the id of the network in the SOS; for example, tag "all" and source_tag "aoos" produce urn:ioos:network:aoos:all.

These custom networks can be associated with all stations of a source by using the network_source table. To associate a network with a source, add a row containing the source's database id and the network's database id.

These custom networks can be associated with specific stations through the network_station table. Create a row for each station that a network is associated with, containing the network id and the station id.
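
As an illustration, these rows can be added with plain SQL over JDBC against the file-based H2 metadata database. The sketch below is only a sketch: the network, network_source, and network_station table names and the tag/source_tag columns come from the description above, but the id column names (network_id, source_id, station_id), the tag value, and the numeric ids are assumptions that should be checked against the actual schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AddCustomNetwork {
    public static void main(String[] args) throws Exception {
        // Connect to the file-based H2 metadata database (same URL as database_url in the properties file)
        try (Connection conn = DriverManager.getConnection("jdbc:h2:/usr/local/sensor_web_harvester");
             Statement stmt = conn.createStatement()) {

            // 1. Add the custom network; its SOS id becomes urn:ioos:network:[source_tag]:[tag]
            stmt.executeUpdate(
                "INSERT INTO network (tag, source_tag) VALUES ('water_quality', 'aoos')");

            // 2. Associate the network with every station of a source
            //    (the network_id/source_id column names and ids are assumptions)
            stmt.executeUpdate(
                "INSERT INTO network_source (network_id, source_id) VALUES (1, 2)");

            // 3. Or associate the network with individual stations
            //    (the network_id/station_id column names and ids are assumptions)
            stmt.executeUpdate(
                "INSERT INTO network_station (network_id, station_id) VALUES (1, 42)");
        }
    }
}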

List of Source URLs

HADS

Station Information Retrieval

Observation Retrieval

  1. state = nil
  2. hsa = nil
  3. of = 1
  4. nesdis_ids = [station id]
  5. sinceday = [number of days of observations requested]

NDBC

This source has an SOS service that is used to pull station information and observation data.

http://sdf.ndbc.noaa.gov/sos/server.php

NOAA NOS CO-OPS

This source has an SOS service that is used to pull station information and observation data.

http://opendap.co-ops.nos.noaa.gov/ioos-dif-sos/SOS

NOAA Weather

Station Information Retrieval

Observation Retrieval

RAWS

Station Information Retrieval

  1. http://www.raws.dri.edu/aklst.html
  2. http://www.raws.dri.edu/azlst.html
  3. http://www.raws.dri.edu/ncalst.html
  4. http://www.raws.dri.edu/ccalst.html
  5. http://www.raws.dri.edu/scalst.html
  6. http://www.raws.dri.edu/colst.html
  7. http://www.raws.dri.edu/hilst.html
  8. http://www.raws.dri.edu/nidwmtlst.html
  9. http://www.raws.dri.edu/sidlst.html
  10. http://www.raws.dri.edu/emtlst.html
  11. http://www.raws.dri.edu/nvlst.html
  12. http://www.raws.dri.edu/nmlst.html
  13. http://www.raws.dri.edu/orlst.html
  14. http://www.raws.dri.edu/utlst.html
  15. http://www.raws.dri.edu/walst.html
  16. http://www.raws.dri.edu/wylst.html
  17. http://www.raws.dri.edu/illst.html
  18. http://www.raws.dri.edu/inlst.html
  19. http://www.raws.dri.edu/ialst.html
  20. http://www.raws.dri.edu/kslst.html
  21. http://www.raws.dri.edu/ky_tnlst.html
  22. http://www.raws.dri.edu/mi_wilst.html
  23. http://www.raws.dri.edu/mnlst.html
  24. http://www.raws.dri.edu/molst.html
  25. http://www.raws.dri.edu/nelst.html
  26. http://www.raws.dri.edu/ndlst.html
  27. http://www.raws.dri.edu/ohlst.html
  28. http://www.raws.dri.edu/sdlst.html
  29. http://www.raws.dri.edu/al_mslst.html
  30. http://www.raws.dri.edu/arlst.html
  31. http://www.raws.dri.edu/fllst.html
  32. http://www.raws.dri.edu/ga_sclst.html
  33. http://www.raws.dri.edu/lalst.html
  34. http://www.raws.dri.edu/nclst.html
  35. http://www.raws.dri.edu/oklst.html
  36. http://www.raws.dri.edu/txlst.html
  37. http://www.raws.dri.edu/prlst.html
  38. http://www.raws.dri.edu/ct_ma_rilst.html
  39. http://www.raws.dri.edu/de_mdlst.html
  40. http://www.raws.dri.edu/me_nh_vtlst.html
  41. http://www.raws.dri.edu/nj_palst.html
  42. http://www.raws.dri.edu/nylst.html
  43. http://www.raws.dri.edu/va_wvlst.html

Observation Retrieval

  1. stn = [station id]
  2. smon = [start month - two digit Integer]
  3. sday = [start day of month - two digit Integer]
  4. syea = [start year - two digit Integer]
  5. emon = [end month - two digit Integer]
  6. eday = [end day of month - two digit Integer]
  7. eyea = [end year - two digit Integer]
  8. dfor = 02
  9. srce = W
  10. miss = 03
  11. flag = N
  12. Dfmt = 02
  13. Tfmt = 01
  14. Head = 02
  15. Deli = 01
  16. unit = M
  17. WsMon = 01
  18. WsDay = 01
  19. WeMon = 12
  20. WeDay = 31
  21. WsHou = 00
  22. WeHou = 24

SnoTel

Station Information Retrieval

  1. sitenum = [station id]

Observation Retrieval

  1. time_zone = PST
  2. sitenum = [station id]
  3. timeseries = Hourly
  4. interval = WEEK
  5. format = copy
  6. report = ALL

USGS Water

Station Information Retrieval

  • http://waterservices.usgs.gov/nwis/iv?stateCd=[state tag]&period=PT4H
  • state tags = "al", "ak", "aq", "az", "ar", "ca", "co", "ct", "de", "dc", "fl", "ga", "gu", "hi", "id", "il", "in", "ia", "ks", "ky", "la", "me", "md", "ma", "mi", "mn", "ms", "mo", "mt", "ne", "nv", "nh", "nj", "nm", "ny", "nc", "nd", "mp", "oh", "ok", "or", "pa", "pr", "ri", "sc", "sd", "tn", "tx", "ut", "vt", "vi", "va", "wa", "wv", "wi", "wy"
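
As a quick illustration (not part of the harvester itself), the per-state station-information URLs can be generated directly from the template above; the state-tag array below is abbreviated and should be extended with the full list:

public class UsgsStationUrls {
    public static void main(String[] args) {
        // Abbreviated subset of the state tags listed above; extend with the full set as needed.
        String[] stateTags = {"al", "ak", "az", "ca", "wa"};
        for (String stateTag : stateTags) {
            // stateCd selects the state; period=PT4H requests the most recent 4 hours of data.
            System.out.println("http://waterservices.usgs.gov/nwis/iv?stateCd=" + stateTag + "&period=PT4H");
        }
    }
}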

Observation Retrieval

NERRS

This source has a webservice end point at http://cdmo.baruch.sc.edu/webservices2/requests.cfc?wsdl that is used to pull the station information and the observations for each station.

A Java jar was created to work with this web service from Java. The Maven references needed to use this jar are shown below; alternatively, it can be downloaded at http://nexus.axiomalaska.com/nexus/content/repositories/public/com/axiomalaska/nerrs_webservice/1.0.0/nerrs_webservice-1.0.0.jar

Maven Dependency

<repository>
  <id>axiom_public_releases</id>
  <name>Axiom Releases</name>
  <url>http://nexus.axiomalaska.com/nexus/content/repositories/public/</url>
</repository>
<dependency>
  <groupId>com.axiomalaska</groupId>
  <artifactId>nerrs_webservice</artifactId>
  <version>1.0.0</version>
</dependency>