Write a script which generates one scenario variant for each replicate in the ensemble dataset

Question

Closed this issue 5 years ago · 3 comments

Child of #353

The first step is to convert the data into the correct format.

Using the current file store, one file should be generated per variant (here, per ensemble replicate), with one column per output and a column for the year.
The files should be named with the replicate identifier and then added to the scenario configuration with a meaningful variant name.

When generating a batch model run (see issue #363), each model run in the ensemble will include data from the relevant variant in the scenario.
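As a rough illustration of the file layout described above, the following sketch writes one CSV per replicate with a `year` column plus one column per output. The function name `write_variant_file`, the `replicate_NN.csv` naming, and the sample values are all hypothetical, chosen only to make the example runnable; they are not taken from the smif code base.

```python
import csv
import os
import tempfile


def write_variant_file(directory, replicate_id, rows, outputs):
    """Write one CSV for one replicate: a 'year' column plus one column per output."""
    # Hypothetical naming scheme: the replicate identifier is part of the file name.
    path = os.path.join(directory, "replicate_{:02d}.csv".format(replicate_id))
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["year"] + list(outputs))
        writer.writeheader()
        writer.writerows(rows)
    return path


# Made-up data for two replicates, two outputs and two years each.
ensemble = {
    1: [{"year": 2015, "t_min": 1.5, "t_max": 13.2},
        {"year": 2020, "t_min": 2.0, "t_max": 14.1}],
    2: [{"year": 2015, "t_min": 0.9, "t_max": 12.7},
        {"year": 2020, "t_min": 1.8, "t_max": 13.9}],
}

out_dir = tempfile.mkdtemp()
paths = [
    write_variant_file(out_dir, replicate_id, rows, ["t_min", "t_max"])
    for replicate_id, rows in sorted(ensemble.items())
]
```

Each returned path corresponds to one variant, which can then be referenced from the scenario configuration.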

Answer 1 · 2019-04-26T17:25:07.000Z

I'd like to summarize my understanding of the issue, because it is not totally clear to me yet.

Let the ensemble dataset be composed of 99 .csv files weather_at_home/temperature_energy_demand/t_max__NF[1-99].csv

$ head -10 t_max__NF1.csv
region,t_max,timestep,yearday
E06000001,13.2,2015,0
E06000001,11.8,2015,1
E06000001,5.7,2015,2
E06000001,5.0,2015,3
E06000001,8.5,2015,4
E06000001,7.6,2015,5
E06000001,10.1,2015,6
E06000001,9.0,2015,7
E06000001,12.8,2015,8

each file describing a specific scenario variant for the output temperature_max of scenario weather_at_home.

Prior to the execution of a model run with timesteps [2015,2020,2025], we aim to generate a new ensemble of 99 .csv files of the type

$ cat data/scenarios/t_max_01.csv
timestep,region,t_max
2015,E06000001,x
2015,E06000002,x
...
2020,E06000001,x
2020,E06000002,x
...

that feed into a scenario of the type

$ cat weather_at_home.yml
name: weather_at_home
description: The weather over the UK
provides:
  - name: temp_min
    ...
  - name: temp_max
    ...
  - name: solar_radiation
    ...
  - name: wind_speed
    ...
variants:
  - name: replicate_01
    description: 
    data:
      temp_min: t_min_01.csv
      temp_max: t_max_01.csv
      solar_radiation: rsds_01.csv
      wind_speed: wss_01.csv

  - name: replicate_02
    ...
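Writing 99 such variant entries by hand would be tedious, so they could be generated. The sketch below builds the list in Python, following the names used in the YAML above; the helper `build_variants`, the prefix mapping `OUTPUTS`, and the generated description text are assumptions for illustration only. The resulting list could then be dumped into the scenario file with a YAML library.

```python
def build_variants(n_replicates, outputs):
    """Return one variant entry per replicate, mirroring the YAML layout above.

    `outputs` maps each scenario output name to the file-name prefix used for it.
    """
    variants = []
    for i in range(1, n_replicates + 1):
        variants.append({
            "name": "replicate_{:02d}".format(i),
            # The description text is an assumption; the original leaves it empty.
            "description": "Replicate {} of {}".format(i, n_replicates),
            "data": {
                output: "{}_{:02d}.csv".format(prefix, i)
                for output, prefix in outputs.items()
            },
        })
    return variants


# Output names and file prefixes as they appear in the variant shown above.
OUTPUTS = {
    "temp_min": "t_min",
    "temp_max": "t_max",
    "solar_radiation": "rsds",
    "wind_speed": "wss",
}

variants = build_variants(99, OUTPUTS)
print(variants[0]["data"]["temp_max"])  # t_max_01.csv
```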

Each of these variants is then involved in a specific model run file of the type

$ cat my_model_run_01.yml
name: my_model_run_01
description: Energy demand under weather_at_home scenario, variant 01 of 99
stamp: "2017-09-18T12:53:23+00:00"
timesteps:
- 2015
- 2020
- 2025
sos_model: my_sos_model
scenarios:
  ...
  weather_at_home: replicate_01
  ...
narratives: {}
<snip>
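The one-run-per-variant pattern above could be produced programmatically. This is a minimal sketch, assuming a hypothetical helper `build_model_runs`; it reproduces only the fields visible in the example (the elided scenarios and the `stamp` field are left out), and the result would still need to be written to one .yml file per run.

```python
def build_model_runs(n_variants, timesteps):
    """Return one model run configuration per scenario variant."""
    runs = []
    for i in range(1, n_variants + 1):
        runs.append({
            "name": "my_model_run_{:02d}".format(i),
            "description": (
                "Energy demand under weather_at_home scenario, "
                "variant {:02d} of {}".format(i, n_variants)
            ),
            "timesteps": list(timesteps),
            "sos_model": "my_sos_model",
            # Only the weather_at_home scenario is set here; other scenarios
            # are elided in the example and therefore omitted.
            "scenarios": {"weather_at_home": "replicate_{:02d}".format(i)},
            "narratives": {},
        })
    return runs


model_runs = build_model_runs(99, [2015, 2020, 2025])
print(model_runs[0]["scenarios"])  # {'weather_at_home': 'replicate_01'}
```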

We would then have to reduce the data in the ensemble dataset (which comes at a one-day resolution) to one value per year. For instance, should the maximum temperature be the maximum of temp_max over the year?
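If such a reduction were wanted, it might look like the sketch below, which takes the maximum of `t_max` per timestep and region. The function `yearly_max` is hypothetical, and the sample rows are the first three data rows of t_max__NF1.csv shown earlier.

```python
import csv
import io

# First three data rows of t_max__NF1.csv, as listed above.
SAMPLE = """region,t_max,timestep,yearday
E06000001,13.2,2015,0
E06000001,11.8,2015,1
E06000001,5.7,2015,2
"""


def yearly_max(csv_text, value_column="t_max"):
    """Reduce daily rows to one maximum per (timestep, region) pair."""
    maxima = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["timestep"]), row["region"])
        value = float(row[value_column])
        if key not in maxima or value > maxima[key]:
            maxima[key] = value
    return maxima


result = yearly_max(SAMPLE)
print(result)  # {(2015, 'E06000001'): 13.2}
```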

Answer 2 · 2019-04-29T08:29:50.000Z

Hi Thibault. The dataset in weather_at_home/temperature_energy_demand/t_max__NF[1-99].csv is in the correct format, so no need to reshape the files. We (@eggimasv @tomalrussell) have developed a script to get the weather@home data into the csv format that smif requires. Otherwise, your terminology is spot on - each numbered file contains one scenario variant. Each column in the file represents a scenario output. And the scenario is weather@home for RCP 8.5.

The rest of your description is correct, except for the need for a reduction. The one-day resolution data is passed into the system of systems as is; no extra processing is required.

Answer 3 · 2019-06-13T10:42:04.000Z

Closed by #381