Exploring the Use of the Semantic Web for discovering and processing data from Sensor Observation Services

Master's thesis by Ivo de Liefde (MSc Geomatics student at Delft University of Technology)

Background

Sensor data can be retrieved online in a standardized way from so-called Sensor Observation Services (SOS). Requests can be sent to these services to receive data, as defined by the OGC Sensor Web Enablement standards.
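
As an illustration, a request in the key-value-pair (KVP) binding of SOS 2.0 is an ordinary URL with query parameters. The sketch below assumes a hypothetical endpoint at http://example.org/sos and made-up offering and observed property identifiers; a real service lists its own identifiers in its GetCapabilities response.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
SOS_ENDPOINT = "http://example.org/sos"


def get_capabilities_url(endpoint):
    """Build a KVP GetCapabilities request asking for SOS 2.0.0."""
    query = {
        "service": "SOS",
        "request": "GetCapabilities",
        "AcceptVersions": "2.0.0",
    }
    return endpoint + "?" + urlencode(query)


def get_observation_url(endpoint, offering, observed_property, start, end):
    """Build a KVP GetObservation request for one property and time period."""
    query = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "offering": offering,
        "observedProperty": observed_property,
        # Time period filter on the phenomenon time: start/end in ISO 8601.
        "temporalFilter": "om:phenomenonTime," + start + "/" + end,
    }
    return endpoint + "?" + urlencode(query)


capabilities_url = get_capabilities_url(SOS_ENDPOINT)
observation_url = get_observation_url(
    SOS_ENDPOINT,
    offering="http://example.org/offering/no2",           # assumed identifier
    observed_property="http://example.org/property/no2",  # assumed identifier
    start="2015-01-01T00:00:00Z",
    end="2015-01-02T00:00:00Z",
)

print(capabilities_url)
print(observation_url)
```

The URLs can then be fetched with any HTTP client; the response is an XML document that still has to be parsed.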

However, every SOS covers only a limited spatial, thematic and temporal extent. Sensor Observation Services could therefore complement each other, especially when different kinds of sensor data are required for data fusion.

Goal

This thesis aims to explore to what extent the semantic web can add useful functionality to Sensor Observation Services. The focus lies on three functionalities: discovery of sensor data sources, integration of sensor data and aggregation of sensor data. The goal is to create two automated processes:

  1. Generate linked data from metadata inside a SOS and publish it in a SPARQL endpoint (see folder 'WPS1')

  2. Retrieve, integrate and aggregate data from all relevant sources for a sensor data request, using the semantic web (see folder 'WPS2')
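
To give an idea of what the first process involves, the sketch below turns a small piece of SOS metadata into N-Triples and wraps them in a SPARQL 1.1 Update statement. It is not the code in 'WPS1': the vocabulary under http://example.org/vocab#, the sensor URIs and the function names are all assumptions made for this example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/vocab#"  # hypothetical vocabulary, for illustration


def uri(value):
    """Format a URI as an N-Triples term."""
    return "<" + value + ">"


def literal(value):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def sos_to_ntriples(sos_url, sensors):
    """Describe a SOS and its sensors as N-Triples (one triple per line)."""
    triples = [(uri(sos_url), uri(RDF_TYPE), uri(EX + "SensorObservationService"))]
    for sensor in sensors:
        subject = uri(sensor["id"])
        triples.append((subject, uri(RDF_TYPE), uri(EX + "Sensor")))
        triples.append((subject, uri(RDFS_LABEL), literal(sensor["label"])))
        triples.append((subject, uri(EX + "observes"), uri(sensor["observed_property"])))
        triples.append((subject, uri(EX + "offeredBy"), uri(sos_url)))
    return "\n".join(s + " " + p + " " + o + " ." for s, p, o in triples)


def to_insert_data(ntriples):
    """Wrap N-Triples in a SPARQL 1.1 Update INSERT DATA statement."""
    return "INSERT DATA {\n" + ntriples + "\n}"


sensors = [
    {
        "id": "http://example.org/sensor/1",  # assumed sensor URI
        "label": "Station 1",
        "observed_property": "http://example.org/property/no2",
    }
]
ntriples = sos_to_ntriples("http://example.org/sos", sensors)
update = to_insert_data(ntriples)
print(update)
```

Publishing would then amount to sending the update string to the endpoint's update URL, which is left out here because it depends on the triple store in use.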

Methods

A catalogue service can be used to discover a SOS, but the semantic web has a number of characteristics that could make it a good alternative. The metadata is explicitly defined on the web, with a multitude of links to related data. The semantic web can therefore be crawled to find the data sources that are relevant, instead of making a specific request to a catalogue service at a specific URL.
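
The idea of crawling links instead of querying a catalogue can be sketched as a breadth-first traversal. In the example below the web of linked data is replaced by a small in-memory dictionary so that it runs offline; all URIs, the function name and the depth limit are assumptions for illustration. In practice the link lookup would dereference each URI or query a SPARQL endpoint.

```python
from collections import deque

# Toy stand-in for the web of linked data: each URI maps to the URIs it
# links to. All URIs are hypothetical.
LINKS = {
    "http://example.org/id/netherlands": [
        "http://example.org/id/utrecht",
        "http://example.org/id/no2",
    ],
    "http://example.org/id/utrecht": ["http://example.org/sos/a"],
    "http://example.org/id/no2": [
        "http://example.org/sos/a",
        "http://example.org/sos/b",
    ],
}
SOS_URIS = {"http://example.org/sos/a", "http://example.org/sos/b"}


def crawl_for_sources(start, get_links, is_source, max_depth=3):
    """Follow links breadth-first from start and collect data sources."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if is_source(node):
            found.append(node)
            continue  # a data source is an end point of the crawl
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in get_links(node):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return found


sources = crawl_for_sources(
    "http://example.org/id/netherlands",
    get_links=lambda u: LINKS.get(u, []),
    is_source=lambda u: u in SOS_URIS,
)
print(sources)
```

Keeping track of visited URIs prevents the crawl from looping, and the depth limit keeps it from wandering too far from the starting resource.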

When different data sources are used, the data needs to be integrated: a single data set has to be created from the multiple responses and returned to the user. The semantics make sure that data about the same observed property, or data created by the same procedure, are grouped together.
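
A minimal sketch of this grouping step is given below. It assumes that each response has already been parsed into a list of dictionaries and that both services refer to an observed property by the same URI; the field names and values are invented for the example.

```python
from collections import defaultdict


def integrate(responses):
    """Merge observations from several sources, grouped by observed property."""
    merged = defaultdict(list)
    for response in responses:
        for observation in response:
            merged[observation["observed_property"]].append(observation)
    # Order each group by time; ISO 8601 strings in one format sort correctly.
    for observations in merged.values():
        observations.sort(key=lambda o: o["time"])
    return dict(merged)


NO2 = "http://example.org/property/no2"    # assumed shared URI
PM10 = "http://example.org/property/pm10"  # assumed shared URI

response_a = [
    {"observed_property": NO2, "time": "2015-01-01T02:00:00Z", "value": 21.0, "source": "A"},
    {"observed_property": PM10, "time": "2015-01-01T01:00:00Z", "value": 14.5, "source": "A"},
]
response_b = [
    {"observed_property": NO2, "time": "2015-01-01T01:00:00Z", "value": 19.5, "source": "B"},
]

dataset = integrate([response_a, response_b])
for prop, observations in dataset.items():
    print(prop, [(o["time"], o["value"], o["source"]) for o in observations])
```

Grouping by the procedure that created the data would work the same way, with a different key.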

The aggregation of sensor data can be simplified using semantics. For spatial aggregation, users don't have to provide a geometry in the query; a name or other descriptive term suffices. For example, a request for aggregation per 10 x 10 km EEA reference grid cell covering the Netherlands can be translated to a SPARQL query that retrieves the required geometries.
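
The translation from a name to a query could look roughly like the sketch below, which builds a SPARQL query that looks up the geometry of a named feature. It assumes the geometries are modelled with the GeoSPARQL vocabulary (geo:hasGeometry, geo:asWKT) and that features carry an rdfs:label; the data in 'LinkedData' may be modelled differently.

```python
QUERY_TEMPLATE = """\
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?feature ?wkt
WHERE {{
  ?feature rdfs:label ?label ;
           geo:hasGeometry ?geometry .
  ?geometry geo:asWKT ?wkt .
  FILTER (STR(?label) = "{name}")
}}
"""


def geometry_query(name):
    """Return a SPARQL query for the WKT geometry of the feature called name."""
    # Escape backslashes and quotes so the name cannot break out of the literal.
    safe = name.replace("\\", "\\\\").replace('"', '\\"')
    return QUERY_TEMPLATE.format(name=safe)


print(geometry_query("Utrecht"))
```

The returned WKT geometries can then be used to decide which observations fall inside each area before computing the aggregate.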

Data

A number of data sets are converted to linked data (see folder 'LinkedData'). They will be used for the proof of concept implementation.

  1. Dataset of municipalities in the Netherlands and Belgium

  2. Dataset of provinces in the Netherlands and Belgium

  3. Dataset of land cover in the Netherlands and Belgium (from CORINE 2012)

  4. Dataset of EEA reference grid cells covering the Netherlands and Belgium with resolutions of 100 km and 10 km

Two sensor observation services are being used for the proof of concept implementation: the air quality SOS by the RIVM and the air quality SOS by ircel-celine.