clarity-h2020/table-components

Implement HC Table

Closed this issue · 31 comments

Status:

  • GUI implementation ("html wireframe") as React AJAX web Application following the respective Product Mock-Ups
  • Integration into the respective GL-Step as External ReactMount Application
  • Design of a simplified internal state data model in JSON (the "table model") and definition of static JSON constants (example) as initial reference / example data based on the example table content from the mock-up screens
  • The table contains the content for the DC1 Reference Study (the content will be retrieved from the table-state-rest-api)
  • Mapping of the actual Data Package Data Model to the internal JSON state data model with help of the JSON:API and/or REST Views and additional REST Micro Services for simple data aggregation and transformation.

Implement GUI following the Mock-Up and the approach described here.

HC Table GUI

The table compares current conditions against three future scenarios (or nine scenarios, if we take time periods into account).

  • Column 2 is the hazard related to the baseline scenario (current climate conditions).
  • Columns 3-5 are the hazard for three different RCPs for one selected time period (e.g. 2020-2050).

What is missing in the mock-up are the different time periods (three in total). A drop-down select box could be added. Selecting a different time period changes columns 3-5 only!

Initial Mock-Up
[image: grafik]

HC Table state model

  • Column 1: Hazard, possibly grouped! So maybe add an empty row for the group name.
  • Column 2: It can be low, medium, high or an absolute numerical value, depending on the hazard! In case of a numerical value, the unit should be shown (e.g. in header row). See clarity-h2020/data-package#8 (comment).
  • Columns 3-5: low, medium, high
  • Legend: the example in the mock-up is invalid! Low, medium and high have a different meaning in column 2 than in columns 3-5. Ignore the colors in column 2, to be discussed later. The legend for columns 3-5 is: low / medium / high increase with respect to the baseline climate.

Correct Mock-Up (explanation here)
[image: hazard_table_new]

Example for grouping:
[image: hazard_table_new]

Mapping Data Package to HC Table state model

tbd @p-a-s-c-a-l
Table content is not part of the Data Package Meta-Data stored in CSIS! Data has to be obtained from WCS and has to be aggregated and normalised! Find possibilities for delegating this task, e.g. METEOGRID could implement a simple REST API that returns JSON compliant with the state model.

@ghilbrae @luis-meteogrid: An example for the table state model:

{
    "data": [{
            "hazard": "Heat Wave Duration",
            "baseline": "Medium",
            "earlyResponseScenario": "Low",
            "effectiveMeasuresScenario": "Medium",
            "businessAsUsualScenario": "High",
            "period": "2030-2040"
        },
        {
            "hazard": "River Flooding",
            "baseline": "Medium",
            "earlyResponseScenario": "High",
            "effectiveMeasuresScenario": "Medium",
            "businessAsUsualScenario": "Medium",
            "period": "2030-2040"
        }, {
            "hazard": "Heat Wave Duration",
            "baseline": "Medium",
            "earlyResponseScenario": "Low",
            "effectiveMeasuresScenario": "Medium",
            "businessAsUsualScenario": "High",
            "period": "2041-2070"
        },
        {
            "hazard": "River Flooding",
            "baseline": "Medium",
            "earlyResponseScenario": "High",
            "effectiveMeasuresScenario": "Medium",
            "businessAsUsualScenario": "Medium",
            "period": "2041-2070"
        }
    ]
}
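To illustrate how the table component might consume this state model, here is a minimal client-side sketch (the helper is hypothetical; field names follow the example above). It shows why selecting a different time period changes only the scenario columns: the baseline column is period-independent, so switching the drop-down just swaps the visible rows.

```python
# Sketch: derive the visible table rows for one selected time period.
# The field names ("hazard", "baseline", "period", ...) follow the
# example state model above; the helper itself is hypothetical.

EXAMPLE_STATE = {
    "data": [
        {"hazard": "Heat Wave Duration", "baseline": "Medium",
         "earlyResponseScenario": "Low", "effectiveMeasuresScenario": "Medium",
         "businessAsUsualScenario": "High", "period": "2030-2040"},
        {"hazard": "River Flooding", "baseline": "Medium",
         "earlyResponseScenario": "High", "effectiveMeasuresScenario": "Medium",
         "businessAsUsualScenario": "Medium", "period": "2041-2070"},
    ]
}

def rows_for_period(state, period):
    """Return only the rows for the selected time period.

    Column 2 (baseline) does not depend on the period, so switching
    the drop-down only changes the scenario columns 3-5.
    """
    return [row for row in state["data"] if row["period"] == period]
```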

We've read the issue and we need to analyse it further before committing to it.

When you say

Data has to be obtained from WCS and has to be aggregated and normalised!

do you mean that all this data will be available from the layers that are going to be uploaded to geoserver? Will there be other sources?

do you mean that all this data will be available from the layers that are going to be uploaded to geoserver?

I hope so. :) The data is on your server; did you check whether the properties needed to construct the table are available in the hazard datasets? @therter was not able to request any format other than GeoTIFF from the WCS (timeout). For the normalisation (low, medium, high) you'll need information on the thresholds, which will probably be part of the Data Package Meta-Data.

current status:

  • GUI implementation ("html wireframe") as React AJAX web Application following the respective Product Mock-Ups
  • Integration into the respective GL-Step as External ReactMount Application
  • Design of a simplified internal state data model in JSON (the "table model") and definition of static JSON constants (example) as initial reference / example data based on the example table content from the mock-up screens
  • The table contains the content for the DC1 Reference Study
  • Mapping of the actual Data Package Data Model to the internal JSON state data model with help of the JSON:API and/or REST Views and additional REST Micro Services for simple data aggregation and transformation.

Sorry for the radio silence.

We are still a bit confused about what is expected from our side. So I will try to explain what we have understood and what questions we have, so you can clarify anything:

  • The idea is to have some kind of tool (API?) that queries the layers available in geoserver to return a JSON response with the data needed.
  • The tool will be called by some module of the CSIS with the set of parameters/data/whatever that is needed according to the example above.
  • The tool will return the information needed back.
  • We will probably implement this using python or some python-related framework.
  • Where is this tool supposed to be installed?
  • How will authentication work?
  • The idea is to have some kind of tool (API?) that queries the layers available in geoserver to return a JSON response with the data needed.

Table Component RIA is the client that queries the API.

  • The tool will be called by some module of the CSIS with the set of parameters/data/whatever that is needed according to the example above.

Yes, e.g. a request similar to the map component's, with bbox, layers and 'styling' (~threshold calculation function) to be aggregated.

  • The tool will return the information needed back.

Yes, in JSON, ideally according to the format in this issue.

  • We will probably implement this using python or some python-related framework.

O.K. Example implementation: the tool/service calls the METEOGRID GeoServer (WCS), fetches the coverages, performs the aggregation (for the whole study area identified by the bbox sent by the client) and calculates the thresholds (a simple calculation function should be sent as a parameter by the client).
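A minimal runnable sketch of the aggregation step described above. The WCS fetch is stubbed out with dummy values (the real service would issue a GetCoverage request against the METEOGRID GeoServer), and the mean is just one possible aggregation function; both the function names and the choice of mean are assumptions.

```python
from statistics import mean

def fetch_coverage_values(layer_id, bbox):
    """Stub for the WCS step: the real service would issue a WCS
    GetCoverage request for `layer_id`, clipped to `bbox`, and return
    the pixel values. Dummy values keep this sketch runnable."""
    return [12.0, 14.5, 16.2, 13.1]

def aggregate_over_study_area(layer_id, bbox):
    """Aggregate the coverage for the whole study area identified by
    the bbox sent by the client. The mean is one simple choice; the
    client could send a different aggregation/calculation function."""
    return mean(fetch_coverage_values(layer_id, bbox))
```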

  • Where is this tool supposed to be installed?

Ideally as a Docker container on the AIT server, but during the development phase it can be deployed on the METEOGRID server. Layers from the ATOS and METEOGRID GeoServers should eventually be consolidated in a containerised CSIS GeoServer, so we'll move all services (except EMIKAT) into the container infrastructure.

  • How will authentication work?

IMHO not needed at the moment. There's currently also no authentication when accessing EE/HC/HC-LE layers on the METEOGRID GeoServer. Since the data is public and there are no paid/restricted expert data packages available, it's OK. Later, all APIs (including AIT's EMIKAT REST API) should support SSO.

My suggestion to them was to implement it as a WPS process.

Since we already have GeoServer deployed, it is straightforward to deploy the WPS plugin. Then you only need to implement the functionality in either Java or Python and you get a "standardized API" (i.e., the WPS specification) with our custom parameters.

@p-a-s-c-a-l
I just wanted to inform you that after discussing the issue with @luis-meteogrid , I'll be in charge of this task and I will be fully dedicated to it in the following weeks.

@therter @p-a-s-c-a-l
Could you please send me an example of the query I should expect to get from table component?

@ghilbrae @maesbri, can you define the query together? Miguel knows the format of the thresholds and Angela knows what she needs to calculate the response.
I will just use the defined query within the table component.

Thresholds in the data package are defined like this (for each of the indexes you can define an arbitrary number of named thresholds):

    "threshold": [
      {
        "name": "low",
        "lower": "to-be-defined"
      },
      {
        "name": "medium",
        "lower": "to-be-defined",
        "upper": "to-be-defined"
      },
      {
        "name": "high",
        "upper": "to-be-defined"
      }
    ]

I think the service should receive a list of index layers and, for each of them, such a list of thresholds.
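As a sketch of how a service might apply such a threshold list, here is one plausible reading of the definition above: a band with both bounds matches `lower <= value <= upper`, a band with only a "lower" boundary lies below it, and a band with only an "upper" boundary lies above it. This interpretation is an assumption (the thread later notes that lower/upper alone are ambiguous, hence the "expression" property):

```python
def classify(value, thresholds):
    """Map a numeric value to a named band (low/medium/high).

    Assumed reading: a band with both bounds matches
    lower <= value <= upper; a band with only a "lower" boundary
    matches value < lower; a band with only an "upper" boundary
    matches value > upper. First match wins."""
    for t in thresholds:
        has_lower, has_upper = "lower" in t, "upper" in t
        if has_lower and has_upper:
            if float(t["lower"]) <= value <= float(t["upper"]):
                return t["name"]
        elif has_lower:
            if value < float(t["lower"]):
                return t["name"]
        elif has_upper:
            if value > float(t["upper"]):
                return t["name"]
    return None

# Concrete boundaries taken from the example request later in this thread.
THRESHOLDS = [
    {"name": "low", "lower": "13.47"},
    {"name": "medium", "lower": "13.47", "upper": "15.90"},
    {"name": "high", "upper": "15.90"},
]
```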

This other example is related to calculating the thresholds based on a baseline/reference:

    "threshold": [
      {
        "name": "low",
        "lower": "to-be-defined",
        "relative_to": "baseline"
      },
      {
        "name": "medium",
        "lower": "to-be-defined",
        "upper": "to-be-defined",
        "relative_to": "baseline"
      },
      {
        "name": "high",
        "upper": "to-be-defined",
        "relative_to": "baseline"
      }
    ]
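For the `relative_to: baseline` case, the value to classify would be the percentage change with respect to the baseline rather than the raw aggregate. A minimal sketch under that assumption (the band boundaries below are hypothetical examples, not values from the data package):

```python
def relative_change(future, baseline):
    """Percentage change with respect to the baseline:
    100 * (future - baseline) / baseline."""
    return 100.0 * (future - baseline) / baseline

def classify_relative(future, baseline, bands):
    """Classify the relative change against (name, lower, upper) bands.
    The band boundaries are supplied by the caller; in the real
    service they would come from the data-package thresholds."""
    change = relative_change(future, baseline)
    for name, lower, upper in bands:
        if lower <= change <= upper:
            return name
    return None

# Hypothetical bands for "increase with respect to the baseline":
EXAMPLE_BANDS = [("low", float("-inf"), 10.0),
                 ("medium", 10.0, 25.0),
                 ("high", 25.0, float("inf"))]
```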

Thanks @maesbri
So I should expect a layer (or a list of layers) and the threshold definitions. What about the simple calculation function that @p-a-s-c-a-l mentioned in #1 (comment)? I suppose the thresholds should be extracted from the layers by applying this function, shouldn't they?

For information on how thresholds are calculated see clarity-h2020/data-package#8 (comment)
This should already be expressed in the json definition: #1 (comment)

IMHO the REST service needs the bbox of the study area, the unique IDs of the HC, HC-LE or EE layers and, in the case of HC, the thresholds definition JSON. HC thresholds are calculated relative to the baseline, therefore it would make sense to have everything in a database (see my mail on DEV).

OK, let's see if I'm on the right track.

The REST service will receive a request similar to this:

hazards = [{
    "type": "eu-gl:hazard-characterization",
    "layers": [{
            "layer_id": "clarity:Heat_wave_temperature_historical_hight_hazard_Naples",
            "bbox": [4647500.0, 1947000.0, 4720500.0, 2008000.0],
            "thresholds": [
                {
                    "name": "low",
                    "lower": "to-be-defined"
                },
                {
                    "name": "medium",
                    "lower": "to-be-defined",
                    "upper": "to-be-defined"
                },
                {
                    "name": "high",
                    "upper": "to-be-defined"
                }
            ]
        }
    ]
}]

Once it gets this, it will query the GeoServer layers and determine what values to send. On this I have two questions:

  1. The calculation, which according to clarity-h2020/data-package#8 (comment) should be something like:

100 x [(future layer) - (baseline layer)] / (baseline layer)

Will it be performed by the API (in which case the baseline_layer id should also be sent with the request), or will the value be included in the request, as seems to be the case with the structure from #1 (comment)?

  2. As per #1 (comment), the REST service will return final "values", so the data retrieved from the GeoServer has to be aggregated over the bbox of the study area. Is this correct?

The calculation, which according to clarity-h2020/data-package#8 (comment) should be something like:

100 x [(future layer) - (baseline layer)] / (baseline layer)

Will it be performed by the API (in which case the baseline_layer id should also be sent with the request), or will the value be included in the request, as seems to be the case with the structure from #1 (comment)?

The thresholds will be extracted from the data package and sent to the REST service.

As per #1 (comment), the REST service will return final "values", so the data retrieved from the GeoServer has to be aggregated over the bbox of the study area. Is this correct?

This is correct

@therter I need to know a bit about the use case of this API.

My main concern is that, if the bbox is big and/or there are many layers, requesting the layers will take some time. So I'd like to know whether this is something that will be used sparingly, with the information stored by the CSIS so the user doesn't have to wait for an answer, or whether it will be requested and displayed on the fly.

@ghilbrae It was planned that this will be requested and displayed on the fly, but the study area is limited to 500 km² at the moment. Does a request with a bbox of this size also take much time?

@therter as an example I'm using this request:

{
    "type": "eu-gl:hazard-characterization",
    "bbox": [4647500.0, 1947000.0, 4720500.0, 2008000.0],
    "hazards": [{
        "hazard": "Heat Wave Duration",
        "baseline_thresholds": [
            {
                "name": "low",
                "lower": "13.47"
            },
            {
                "name": "medium",
                "lower": "13.47",
                "upper": "15.90"
            },
            {
                "name": "high",
                "upper": "15.90"
            }
        ],
        "future_thresholds": [
            {
                "name": "low",
                "lower": "9.13"
            },
            {
                "name": "medium",
                "lower": "9.13",
                "upper": "16.6"
            },
            {
                "name": "high",
                "upper": "16.6"
            }
        ],
        "layers": [{
            "time-period": "2011-2040",
            "layer_ids": {
                "baseline_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_historical_r12i1p1_SMHI-RCA4_v1_day_19710101-20001231_netcdf3",
                "rcp26_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_rcp26_r12i1p1_SMHI-RCA4_v1_day_20110101-20401231_netcdf3",
                "rcp45_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_rcp45_r12i1p1_SMHI-RCA4_v1_day_20110101-20401231_netcdf3",
                "rcp85_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_rcp85_r12i1p1_SMHI-RCA4_v1_day_20110101-20401231_netcdf3"
                }
            },
            {
            "time-period": "2041-2070",
            "layer_ids": {
                "baseline_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_historical_r12i1p1_SMHI-RCA4_v1_day_19710101-20001231_netcdf3",
                "rcp26_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_rcp26_r12i1p1_SMHI-RCA4_v1_day_20410101-20701231_netcdf3",
                "rcp45_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_rcp45_r12i1p1_SMHI-RCA4_v1_day_20410101-20701231_netcdf3",
                "rcp85_layer_id": "clarity:Tx75p_consecutive_max_EUR-11_ICHEC-EC-EARTH_rcp85_r12i1p1_SMHI-RCA4_v1_day_20410101-20701231_netcdf3"
                }
            }
        ]
    }]
}

This takes 309316.40 ms to complete. If I change the bbox to "bbox": [4466000.0, 2130000.0, 4966000.0, 2630000.0], the time is 267094.20 ms, and for "bbox": [4466000.0, 2130000.0, 4468500.0, 2132500.0], the smallest one and roughly 500 km², it is 265122.50 ms.
I've also been testing some layers with less information, such as clarity:Heat_wave_temperature_historical_hight_hazard_Naples, and they take less time. I suppose the fact that these only contain data for Napoli and not for the whole of Europe (as the ones from the example) has something to do with these times.

Note that I'm mainly using WCS requests to extract data from geoserver.
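For reference, the kind of WCS request meant here can be sketched as follows. This builds a WCS 2.0 KVP GetCoverage URL with the study-area bbox as subsets; the base URL is a placeholder and the axis labels are assumptions (they depend on the coverage's CRS):

```python
from urllib.parse import urlencode

def getcoverage_url(base_url, coverage_id, bbox, axes=("E", "N")):
    """Build a WCS 2.0 KVP GetCoverage request for one layer, subset
    to the study-area bbox. Axis labels depend on the coverage's CRS
    (E/N is assumed here); the base URL is a placeholder."""
    minx, miny, maxx, maxy = bbox
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("subset", f"{axes[0]}({minx},{maxx})"),
        ("subset", f"{axes[1]}({miny},{maxy})"),
        ("format", "image/tiff"),
    ]
    return base_url + "?" + urlencode(params)
```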

I'll try to test a bit more and see if there's some way to reduce these times in some other way.

Hi,
Some comments from my side:

  • "hazard": "Heat Wave Duration" is not a hazard type but a hazard index, and it is not following the identifier convention (i.e., hazard:temperature:heat:extreme-heat:index:hot-days for the hazard index, or hazard:temperature:heat:extreme-heat for the hazard type).
  • "rcp26_layer_id", and so forth: this naming seems a little bit static to me. Wouldn't it be better to have additional properties in the request where you indicate the emissions_scenario id and the layer_id? That way it would be more flexible if, in the future, we want to add, for instance, additional RCPs.

btw, my proposal to implement the service WITHIN GeoServer was precisely to avoid having to request the data through the WCS protocol, since there you already have access to the data itself, thus skipping the time required to download it (I guess most of the time you indicated above is due to the cutting and serving of the data through the WCS request).

btw2, the thresholds also have an "expression" property, which better defines how the threshold is applied, since just having the "lower" or "upper" values is not enough to interpret how it should be applied. In addition, using an expression allows more flexibility: we can state not only "lower <= value <= upper" but also things like "2*lower <= value/2.0 <= value + upper", which is not possible with just the lower and upper values in the threshold.
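A minimal sketch of how such an "expression" could be evaluated against concrete numbers. This is for illustration only: a production service should use a proper expression parser rather than `eval`, even with builtins stripped as done here.

```python
def matches(expression, value, lower=None, upper=None):
    """Evaluate a threshold expression such as
    "lower <= value <= upper" with concrete numbers bound to the
    names value/lower/upper.

    Illustration only: stripping __builtins__ is NOT a real sandbox;
    a production service should parse the expression properly."""
    names = {"value": value, "lower": lower, "upper": upper}
    return bool(eval(expression, {"__builtins__": {}}, names))
```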

Some comments from my side:

  • "hazard": "Heat Wave Duration" is not a hazard type but a hazard index, and it is not following the identifier convention (i.e., hazard:temperature:heat:extreme-heat:index:hot-days for the hazard index, or hazard:temperature:heat:extreme-heat for the hazard type).

This was taken from #1 (comment). I'm more than happy to use any convention or whatever the CSIS is expecting.

  • "rcp26_layer_id", and so forth: this naming seems a little bit static to me. Wouldn't it be better to have additional properties in the request where you indicate the emissions_scenario id and the layer_id? That way it would be more flexible if, in the future, we want to add, for instance, additional RCPs.

btw2, the thresholds also have an "expression" property, which better defines how the threshold is applied, since just having the "lower" or "upper" values is not enough to interpret how it should be applied. In addition, using an expression allows more flexibility: we can state not only "lower <= value <= upper" but also things like "2*lower <= value/2.0 <= value + upper", which is not possible with just the lower and upper values in the threshold.

TBH, as I stated in a previous comment, I needed some sort of example request to start working, and since no one seemed to have thought about a request format, I created my own based on what is expected (#1 (comment)). It would be great to have more info on use cases and things that may go into the request, because I cannot account for or know about every possibility.

btw, my proposal of implementing directly the service WITHIN geoserver was precisely to avoid having to request the data through the WCS protocol since you already have access to the data itself, skipping thus the time required to download it (I guess most of the time consumed you indicated above is due to the cutting and serving of the data through the wcs request)

You are right, most of the time is related to the data download. I've been going through the GeoServer documentation to find a better way to implement this, but I've found nothing so far. Also, I don't think I'd be able to implement an additional GeoServer module in a reasonable time, mainly because that would require me to understand how GeoServer is built and to use Java, which I don't know. If you have any suggestions or know of extra documentation I could read, I'm more than happy to take a look.

For the last point, I think you can use Python to implement your own WPS process, but I haven't done it that way myself (many years ago I did it, but using Java).
Maybe it is also possible to implement a WCS function (similarly to the WFS functions listed in the GetCapabilities), also using Python.

@therter you can check the request example for HC in here: https://github.com/clarity-h2020/table-state-rest-api/blob/master/examples/request.json

We can continue any discussion regarding that request on the issues for that repo and leave this issue for more general stuff if that's OK for everyone.

I've not yet installed that API on a server, but I hope to have it up by tomorrow so you can run some tests.

The example request of the table-state-rest-api is now used.

Currently, the URL of the table-state-rest-api is hard-coded. How can this work for different hazard resources? Why is it not loaded from the Data Package? The URL of the REST API should be stored in the HC resource in the corresponding Data Package. We can use the new references field and a URL of type @tableview:meteogrid:rest, analogous to the @mapview:ogc:wms type used by the Map Component. See also #16
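A hypothetical shape for such a references entry, analogous to the @mapview:ogc:wms entries (the field names and the URL below are assumptions, for illustration only):

```json
{
  "references": [
    {
      "type": "@tableview:meteogrid:rest",
      "url": "https://example.org/table-state-rest-api/"
    }
  ]
}
```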

To be able to support different hazards, the table-rest-api has to support parameters, e.g. URL

I need to know how this is going to be handled, to update the API accordingly and to know what sort of request to expect from the CSIS. I considered some options in the issue already referenced: clarity-h2020/table-state-rest-api#3

@therter Is the bbox hardcoded, too? Will this table work for the cities of Alba Iulia, Agios Dimitrios and Bottrop ?

Is the bbox hardcoded, too? Will this table work for the cities of Alba Iulia, Agios Dimitrios and Bottrop ?

The table reads the bbox from the study. The layers are hardcoded, must be hosted by the METEOGRID WMS and must contain the data for the cities of Alba Iulia, Agios Dimitrios and Bottrop.
The currently used layers do not exist anymore, so I have to adjust the layers so that the table works again.

The currently used layers do not exist anymore

The layers are back and the table is working again.