March-June 2022, merlijn
All of the code used to import, analyse, and visualise the data in the report below can be found in this repository. There are two versions of the code. v0.1.0
consists of some initial scripts used to create simple load duration graphs. It has limited functionality beyond producing load duration graphs based on the original season and daynite timeslices. This version was produced in April 2022.
v0.2.0
(May 2022) and v0.3.0
(June 2022) brought a more fully-fledged suite of scripts with additional features and generalised functions. v0.2.0
was used in the report below and was released in mid-May 2022. v0.3.0
added UTC support and included consumer load profiles. The basic structure is as follows:
load.py
: these functions are used to import and clean the data,assign.py
: this script contains functions that assign the various timeslices to the time series,analyse.py
: these functions generate visualisations and statistics along with other analysis functions,
All these functions can be run from the main.py
file or individually in their respective scripts. The idea behind organising the functions and scripts like this is to make it easier to maintain and extend the functions. To this end, most functions are written in a very generalised manner, which should make it easy to add support for additional timeslices, time series, and other improvements. An important factor here is that the functions are (mostly) agnostic to the timeslice or time series used, which means that it is straightforward to add new timeslices (such as the ones mentioned in the discussion) or to use the current functions to analyse new time series.
A fourth script, report_visualisations.py
, listed in the graphs
folder, contains some rough-and-ready code to generate some of the visualisations used in this report.
I am working on 'translating' some key functions from python
to R
(see r-translation
branch), and on adding more comments to the code to explain its functionality.
This documentation was written for version v0.2.0
of the code and updated for v0.3.0
with an appendix on consumer load profiles.
TIMES uses timeslices to represent periodic variations in commodities and process input/outputs. The current version of the TIMES-CZ model uses twelve timeslices: four seasonal timeslices (spring, summer, autumn, winter) and three daynite timeslices (night, day, peak), see tables 1, 2 and 3. Increasing the number of timeslices increases the accuracy of the model, as there are annual variations in the data that have a different frequency than the daily and seasonal variations that are currently captured by the timeslices.
This report builds on Josef's bachelor thesis, where he worked on finding the best method for determining which timeslice load values to use. This work is different in that it purely looks at different types of timeslices and combinations thereof. Milan and Lukáš suggested some alternative timeslices (see Method). The basic idea is to test different types of timeslices on load, export, import and price data to see which combination of timeslices we can use to enrich the TIMES-CZ model. In order to do this, I wrote a suite of functions that can test different timeslices, in order to determine which combination of timeslices would best enrich the model.
When picking a combination of timeslices, the primary concerns are whether they capture intra/inter-annual variability in the data and thus accurately convey major trends, and the computational and data-gathering cost and of this increased accuracy. Generally speaking, as time resolution increases, the accuracy of the model increases too but so does its computational complexity. Furthermore, it becomes more difficult to gather economic statistics at the same resolution, which reduced the efficacy of the increase timeslice resolution. The data considered in this report (load, consumption, import, export, prices) can be known to a high temporal precision, whereas others (such as production of specific factories, or solar output) can either not be known at such a high temporal resolution or have large year-on-year variability. The goal of this report is to contribute to finding the right trade off between temporal resolution and computational and statistical complexity.
Code | Season | Length (days) | Description |
---|---|---|---|
R | Spring | 78 | 15 March-31 May |
S | Summer | 91 | 1 June-30 August |
F | Autumn | 77 | 31 August-15 November |
W | Winter | 119 | 16 November-14 March |
Table 1: Definition of the seasonal timeslices.
Code | Daynite | Length (hours) | Description |
---|---|---|---|
N | Night | 12 | all hours between 20:00-08:00. |
D | Day | 11 | all hours between 08:00-20:00, except the peak hour. |
P | Peak | 1 | the hour with the highest load during the day. |
Table 2: Definition of the daynite timeslices.
Season | Daynite | TS | Annual share (%) |
---|---|---|---|
Spring (R) | Night (N) | RN | 10.7 |
Day (D) | RD | 9.80 | |
Peak (P) | RP | 0.980 | |
Summer (S) | Night (N) | SN | 12.5 |
Day (D) | SD | 11.4 | |
Peak (P) | PD | 1.04 | |
Fall (F) | Night (N) | FN | 10.5 |
Day (D) | FD | 9.67 | |
Peak (P) | FP | 0.879 | |
Winter (W) | Night (N) | WN | 16.3 |
Day (D) | WD | 14.9 | |
Peak (P) | WP | 1.36 |
Table 3: Annual share of the current timeslices.
The following data sources were used:
Data | Source | Scope |
---|---|---|
Load | https://www.ceps.cz/en/all-data#Load | 2015-2021, hourly average values. |
Generation | https://www.ceps.cz/en/all-data#Generation | 2015-2021, hourly average values. |
Imports and exports | https://www.ceps.cz/en/all-data#CrossborderPowerFlows | 2015-2021, hourly average values. |
Prices | https://www.ote-cr.cz/en/statistics/yearly-market-report | 2015-2021, hourly day-ahead price. |
Consumer load profiles | https://www.ote-cr.cz/en/statistics/electricity-load-profiles/ | 2015-2021, hourly normalised load. |
Table 4: Data sources used in this report.
ČEPS is the Czech transmission system operator and OTE-ČR is the Czech electricity and gas market operator. 2015 is the base year of the TIMES-CZ model and 2020 is a milestone year; 2021 is the last year for which full data is available. Hourly (average) values were the highest available temporal resolution available. All of the original data files can be found in the data folder on the GitHub repository.
The data providers use different time and date definitions, see table 5. The ČEPS data gives the hour and date in CET/CETS (local time), whereas OTE-ČR gives the local date and counts the hour of the day. For both data files, the data was converted to UTC which was then converted to the Czech timezone (CET/CETS).
Data provider | Data files | Date time format | Hours: normal day | Hours: winter time to summer time | Hours: summer time to winter time |
---|---|---|---|---|---|
ČEPS | load, generation, cross border | 31.12.2015 00:00 | 0-23 | 0-1, 3-23 | 0-2, 2-23 |
OTE-ČR | prices, consumer load profiles | 31-12-2015, 1 | 1-24 | 1-23 | 1-25 |
Table 5: Date time conventions of both data providers.
Some minor cleaning and reshaping actions were performed on the data (see load.py
). This included calculating total generation, total import, total export, and net export. The consumer load profile data needed more extensive editing, please see the appendix for more information. The data sources were combined into a single data file, combined 2015-2021.csv
, which can also be found in the data folder on the GitHub repository.
The load, import, export and price data were then categorised into the two original time slices (season daynite) and six new timeslices:
- Month: the month of the date (January, February, ..., December),
- Weekday 1: whether the date was a working day (generally, Monday-Friday) or a weekend or public holiday (generally, Saturday-Sunday),
- Weekday 2: the day of the week (Monday, Tuesday, etc),
- Extended daynite 1: one night time slices, three day time slices. See table 6 for the chosen cut-off times.
- Extended daynite 2: two night time slices, six day time slices. See table 7 for the chosen cut-off times.
- Hour: hourly timeslices (00:00, 01:00, ..., 23:00).
As discussed in the preface, the functions that perform these are agnostic as to what time series they are categorising which means that other variables can be categorised in the same way, and that functions with new timeslices can easily be added.
Extended daynite 1 | Description |
---|---|
Night | 20:00-05:00 |
Morning | 06:00-09:00 |
Afternoon | 10:00-15:00 |
Evening | 16:00-19:00 |
Table 6: Definition of the extended daynite 1 timeslices.
Extended daynite 2 | Description |
---|---|
Night-1 | 20:00-01:00 |
Night-2 | 02:00-07:00 |
Day-1 | 08:00-09:00 |
Day-2 | 10:00-11:00 |
Day-3 | 12:00-13:00 |
Day-4 | 14:00-15:00 |
Day-5 | 16:00-17:00 |
Day-5 | 18:00-19:00 |
Table 7: Definition of the extended daynite 2 timeslices.
Code | Weekday | Length | Description |
---|---|---|---|
L | Working day | 68.7% | Any working day (most weeks: Monday to Friday). |
H | Weekend day | 31.3% | Weekend days (Saturday and Sunday) and public holidays. |
Table 8: Definition of the weekday timeslices. The length of the timeslices, expressed as an annual percentage, is calculated over 2015-2021 but can vary year-to-year due to differences of the date of public holidays. Note that the length of the working day/weekend day slices is not exactly 5/7 and 2/7 (71.4% and 28.6%), respectively, due to the inclusion of public holidays as weekend days.
These timeslices were then combined to into groups, based on my judgement of which combinations would produce interesting and usable results. This was broadly done by including at least one long-term timeslice (season, month) and at least one short-term timeslice (daynite, hour, etc). Not all groups were analysed extensively due to time constraints.
Name | Group | Size |
---|---|---|
Null | Season - daynite | 12 (4 × 3) |
A | Season - extended daynite 1 | 16 (4 × 4) |
B | Season - extended daynite 2 | 24 (4 × 8) |
C | Season - hour | 96 (4 × 24) |
D | Season - weekday 1 - daynite | 24 (4 × 2 × 3) |
E | Season - weekday 1 - extended daynite 1 | 32 (4 × 2 × 4) |
F | Season - weekday 2 | 28 (4 × 7) |
G | Season - weekday 2 - daynite | 84 (4 × 7 × 3) |
H | Month - daynite | 36 (12 × 3) |
I | Month - extended daynite 1 | 48 (12 × 4) |
J | Month - extended daynite 2 | 96 (12 × 8) |
K | Month - weekday 1 | 24 (12 × 2) |
L | Month - weekday 1 - daynite | 72 (12 × 2 × 3) |
M | Month - hour | 288 (12 × 24) |
Table 9: The different timeslice groupings considered in this report. The Null group is the current timeslices group used in the TIMES-CZ model. 'Size' gives the total number of timeslices in the group.
Load, import, export, and price duration graphs were created for each group of timeslices, and the key statistics of their distribution were recorded. When comparing the suitability of different groups of timeslices, the main focus was to determine whether these timeslices accurately captured the major periodic trends in the data, and whether the individual timeslices created a unique view of the data: in other words whether there were any other timeslices within the same group that provided a similar view of the data, therefore leading to redundencies.
To determine whether different timeslice groupings yielded similar results, I also looked at the distribution of data points within timeslices. I did this by plotting the major statistics (mean, standard deviation, major percentiles) and seeing to what extent different timeslices presented a 'unique' view of the data and to what extent the mean value was representative of that timeslice (i.e. whether there was excessively large variance about the mean).
To aid the graphical analysis of the data, I performed a Fourier transforms of the load, import, export, and price data (using the fast Fourier transform algorithm). This decomposes the time series into a frequency spectrum, allowing you to see which frequencies (and thus processes) dominate the time series.
Figure 1: This figure shows the hourly annual load in 2021 (shaded grey) as well as the 20-day rolling average (blue line). This shows how one of the main frequencies driving the load is the annual (as well as daily) variance.
Figure 2: This data shows the average daily load from 2015-2021 (blue line) as well the 25%-75% percentile range (shaded grey)
Figure 3: Same data as in figure 2 but now split by season (see table 1), to give an idea how the different timeslices affect the distribution of the data.
Figure 4: This figure shows the results of the fast Fourier transform analysis for each variable. On the left, it shows the twenty dominant frequencies lower than 30 days, on the right it shows the twenty dominant frequencies higher than 30 days. Note that the spectrums are normalised. This allows for easier comparison of dominant frequencies across the different time series.
The plots for the consumer load profile FFTs are not given here, but can be found on Github. Compared to the other variables, the consumer load profiles feature much stronger dominant frequencies and therefore comparatively much weaker frequency spread. The dominant frequencies are daily, annually, and weekly; with intra-year frequencies being almost absent.
Figure 5: This figure shows an example of the distribution graphs, in this case for group season - weekday 1 - daynite (group D). The blue dot is the mean value, the smaller blue dots give the mean plus/minus the standard deviation. The red dot gives the median value, the red crosses give the minimum and maximum values, and the 10% and 90% percentiles.
My recommendation is that we add a weekday timeslice to the model. This timeslice enriches the data as it allows us to capture weekly variance in the data, without adding much computational complexity or complicating data gathering. Figure 4 shows that the load, export, and import values have a strong weekly and intraweekly (7 days and 3.5 days) frequencies, but these are currently not captured in the daynite and season timeslices. This new group does, partially, capture them. Furthermore, other economic statistics on working day/weekend day consumption and production patterns should be readily available. Using a more granular seven-day weekday timeslice (weekday 2) increases computational costs (4 × 7 × 3 = 84 vs 4 × 2 × 3=24) without increasing accuracy and is therefore not recommended.
I suggest using the abbreviation 'L' for working days (labour) and 'H' for weekend days (holiday). Using these conventions, the new timeslices and their annual share are:
Season | Weekday | Daynite | TS | Annual share (%) |
---|---|---|---|---|
Spring (R) | Working day (L) | Night (N) | RLN | 7.28 |
Day (D) | RLD | 6.67 | ||
Peak (P) | RLP | 0.607 | ||
Weekend day (H) | Night (N) | RHN | 3.50 | |
Day (D) | RHD | 3.23 | ||
Peak (P) | RHP | 0.293 | ||
Summer (S) | Working day (L) | Night (N) | SLN | 8.78 |
Day (D) | SLD | 8.05 | ||
Peak (P) | SLP | 0.732 | ||
Weekend day (H) | Night (N) | SHN | 3.28 | |
Day (D) | SHD | 3.50 | ||
Peak (P) | SHP | 0.318 | ||
Fall (F) | Working day (L) | Night (N) | FLN | 7.40 |
Day (D) | FLD | 6.78 | ||
Peak (P) | FLP | 0.617 | ||
Weekend day (H) | Night (N) | FHN | 3.34 | |
Day (D) | FHD | 2.99 | ||
Peak (P) | FHP | 0.272 | ||
Winter (W) | Working day (L) | Night (N) | WLN | 10.9 |
Day (D) | WLD | 9.97 | ||
Peak (P) | WLP | 0.907 | ||
Weekend day (H) | Night (N) | WHN | 5.02 | |
Day (D) | WHD | 4.61 | ||
Peak (P) | WHP | 0.419 |
Table 10: Annual share of the new proposed timeslices.
The figure below shows the average hourly load for each season-weekday pair, with the mean load per daynite timeslice for the given season and weekday superimposed. It shows how the model simplifies the load data and to what extent it accurately summarises the data, given that the model does not currently capture variance (only mean values).
Figure 6: Daily load, by season and weekday, with average timeslice values superimposed.
In general, any group including monthly, weekday 2 (seven weekdays) or hourly timeslices greatly increased computational complexity without adding much detail to the model, as many months, weekdays and hours had similar loads, cross border exchanges, and prices. This is largely because these timeslices do not attempt to categorise the data (as the current daynite timeslice does) but rather focus on increasing accuracy by increasing the number of timeslices. These approaches also suffer from the need to know other statistics at a much higher temporal resolution in order to be useful, which limits their usefulness in increasing accuracy. Monthly timeslices showed a lot of repetition-it did not really capture more of the yearly variability than the seasonal timeslices, whilst many months had nearly identical average imports, exports and loads (e.g. July and August, June and September, etc). To some extent, this is also true for the Spring and Autumn seasonal timeslices: perhaps these could be merged and omitted.
The data used covers seven years, which should give a good understanding of annual trends in load, export, and import values. However, capturing price developments is more difficult as these are not necessarily periodic, and can see periods of high volatility that are not periodic. There can be such deviations in the other variables too, but due to the long timespan and periodicity these changes are smoothed out. For instance, imports and exports in 2015 were up by 17.2% and 24.5%, respectively, compared to 2016-2021, but by using all seven years these annual deviations are smoothed out.
The best way to test the effect of new timeslices is to add them to the TIMES-CZ model. This is quite a labour intensive process, as other statistics also need to be updated. Some of this could be automated (using the data available, new scenario files for processes like EXPELCHIGGA and IMPELCHIGA could be generated) but it would still require careful setting up of the relevant script, running the model, and analysing the model outputs.
This has not been fully done yet for the proposed season - weekday 1 - daynite group (D) of timeslices, and thus further work is needed to explore the exact impacts on the model of using this new group. Currently, on the EUA-revision branch, I have added a new scenario file CZ_V02-/SuppXLS/Scen_AltTS.xlsx
which incorporates the seaon - weekday 1 - daynite group (D) with a corresponding CZ_V02-/SuppXLS/Scen_CZ_elcprice_SEK_TSprices_sumbnd_AltTS.xlsx
scenario file which replaces CZ_V02-/SuppXLS/Scen_CZ_elcprice_SEK_TSprices_sumbnd.xlsx
(and ultimately, CZ_V02-/SuppXLS/Scen_CZ_elcprice_SEK.xlsx
). However, the temporal resolution of other key economic and energy statistics should also be improved to be able to accurately determine the effect of the new timeslices (see the appendix).
As mentioned previously, many other timeslices remain unexplored. The data gathered suggests that perhaps the spring and autumn seasonal timeslices could be merged, as for some data points they show very similar means and distributions. Alternatively, they could be split and merged; creating an early-Spring/late-Autumn timeslice (capturing the onset/end of Winter) and an late-Spring/early-Autumn timeslice (capturing the onset/end of Summer). Furthermore, the current daynite timeslices could be expanded. The current definition of 'Night' lasting 12-hours appears to be too broad as the reduced demand does not last for so long, perhaps an alternative to the extended daynite timeslices discussed in this report. An alternative definition, extended daynite 3, is given in table 11.
Extended daynite 3 | Description | Length (h) |
---|---|---|
Night | 23:00-05:00 | 6 |
Morning | 06:00-08:00 | 3 |
Day | 08:00-20:00, except: | 11 |
Peak | The hour with the highest load during Day | 1 |
Evening | 20:00-22:00 | 3 |
Table 11: Potential definition for extended daynite 3.
As seen in figure 4, there are a plethora of frequencies between 14 days and 190 days, which are currently not well-understood nor captured by the proposed new combination of timeslices. It is unclear what is driving these cycles. Some may be related to seasonal changes, but the majority cannot easily be transcribed to either daily, weekly, or annual variations in demand or prices. More work is needed to determine what is behind these frequencies and whether they are important to the model.
Lastly, this report only considered electricity production and consumption. Other commodities (most notably natural gas) and consumer behaviour patterns are also of interest. Due to the general nature of the code (see preface), it should be straightforward to expand the analysis scope to include new time series.
The first application of the new timeslices was to add increased temporary resolution for commercial and residential COM_FR
processes. This was done using consumer load profiles from OTE-ČR. The consumer load profile data is divided into the following categories:
Category | Type | Description (translated) |
---|---|---|
1 | Commercial | No electricity for heat or hot water |
2 | Commercial | Heat accumulation (hot water & heating, includes another source of heat) |
3 | Commercial | Electricity for heating including heat pumps |
4 | Residential | No electricity for heat or hot water |
5 | Residential | Heat accumulation (usually, hot water) |
6 | Residential | Electricity for heating (includes another source of heat) |
7 | Residential | Electricity for heating including heat pumps |
8 | Commercial | Public lighting |
Table 12: Consumer load profile categories (Czech source).
The TDD5 category is split in 8 regions. To get a value for this category, I took a population-weighted average, using the population counts in table 13. I combined Czech regions into the regional TTD5 profiles in the following way:
Name (CZ) | Name (EN) | Regions | Pop (,000) | Pop (relative) |
---|---|---|---|---|
Jižní Čechy | South Bohemia | South Bohemian, Vysocina | 1,141 | 0.11 |
Jižní Morava | South Moravia | South Moravian, Zlin | 1,757 | 0.17 |
Praha | Prague | Praha | 1,275 | 0.12 |
Severní Čechy | North Bohemia | Liberec, Usti nad Labem | 1,237 | 0.12 |
Severní Morava | North Moravia | Moravian-Silesian, Olomouc | 1,801 | 0.17 |
Střední Čechy | Central Bohemia | Central Bohemia | 1,387 | 0.13 |
Východní Čechy | East Bohemia | Hradec Kralove, Pardubice | 1,058 | 0.10 |
Západní Čechy | West Bohemia | Karlovy Vary, Plzen | 862 | 0.08 |
Table 13: Division of Czech regions into the TDD5 regions (Population data from ČSÚ (2022)).
Furthermore, based on Lukáš' recommendation, I added a 'zero heating in summer' option (labeled 'a'), which reduces the heating load profiles by 50% in autumn and spring and by 100% in summer. I also added a 'zero lighting in summer' option (labeled 'b'), which reduces the residential lighting load profile (RLIG
) by 50% in autumn and spring, and by 100% in summer but only during day/peak hours. I also awarded some processes a combination of two load profiles. In these cases, the load profiles were averaged as they generally showed similar annual/seasonal variability. TDD4 and TDD5 do differ slightly (TDD5 has consitently higher values during day and peak timeslices) but they exhibit a similar pattern overall.
Next I categorised all timeslice-dependent COM_FR
processes listed in the CZ_V02-/VT_CZ_RCA_V2.2.xlsx
file, see table 14 below. The full name of the processes can be found in table 16.
Category | Processes |
---|---|
1 | CCLE, CCSE, CLIG, CCOK, CCREF |
1, 2 | COEL, COEN |
2 | CWLE, CWSE |
3 | None |
3a | CHLE, CHSE |
4 | RCDE, RCRE, RCRN, RCME, RCMN, RCOK, RCWA, RCDR, RDWA, RREF |
4b | RLIG |
4, 5 | ROEL, ROEN |
5 | RWUN, RWDE, RWRN, RWRE, RWME, RWMN |
6 | None |
6a, 7a | RHDE, RHRE, RHRN, RHUN, RHME, RHMN |
7 | None |
8 | CPLI |
Table 14 The division of the processes into the various categories.
Using the above combinations, I calculated the ratio of the categories per timeslice (the original data is normalised). I then inserted the various processes into the CZ_V02-/SuppXLS/Scen_AltTS.xlsx
scenario file with their category and annual total demand. Using this scenario file does not lead to infeasibilities.
Abbreviation | Description |
---|---|
TAV | Aviation generic |
TBI | Road bus intercity |
TBU | Road bus urban |
TC | Road car short distance |
TFR | Road freight |
TMO | Road motors |
TNA | Navigation generic |
TTF | Rail freight |
TTL | Rail passengers light |
TTP | Rail passengers heavy |
IFT | Food and tobacco demand |
IOI | Other industries demand |
Table 15 These are the remaining COM_FR
processes. I believve that we cannot use consumer load profiles to determine their annual variability (and they do not occur in the CZ_V02-/VT_CZ_RCA_V2.2.xlsx
file).
Abbreviation | Description | Category |
---|---|---|
CHLE | Commercial space heating large | 3a |
CHSE | Commercial space heating small | 3a |
CCLE | Commercial space cooling large existing | 1 |
CCSE | Commercial space cooling small existing | 1 |
CWLE | Commercial water heating large existing | 2 |
CWSE | Commercial water heating small existing | 2 |
CLIG | Commercial lighting | 1 |
CCOK | Commercial cooking | 1 |
CREF | Commercial refrigeration | 1 |
CPLI | Commercial public lighting | 8 |
COEL | Commercial other electricity | 1+2 |
COEN | Commercial other energy generic | 1+2 |
RHDE | Residential space heating single semi-detached existing | 6a+7a |
RHRE | Residential space heating single rural existing | 6a+7a |
RHRN | Residential space heating single rural new | 6a+7a |
RHUN | Residential space heating single urban new | 6a+7a |
RHME | Residential space heating multiple all existing | 6a+7a |
RHMN | Residential space heating multiple all existing new | 6a+7a |
RCDE | Residential space cooling single detached existing | 4 |
RCRE | Residential space cooling single rural existing | 4 |
RCRN | Residential space cooling single rural new | 4 |
RCME | Residential space cooling multiple all existing | 4 |
RCMN | Residential space cooling multiple all new | 4 |
RWUN | Residential water heating single urban new | 5 |
RWDE | Residential water heating single urban existing | 5 |
RWRN | Residential water heating single rural new | 5 |
RWRE | Residential water heating single rural existing | 5 |
RWME | Residential water heating multiple all existing | 5 |
RWMN | Residential water heating multiple all new | 5 |
RCOK | Residential cooking | 4 |
RCWA | Residential clothes washing | 4 |
RCDR | Residential cloth drying | 4 |
RDWA | Residential dish washing | 4 |
RLIG | Residential lighting existing | 4b |
RREF | Residential refrigeration | 4 |
ROEL | Residential other electricity | 4+5 |
ROEN | Residential other energy generic | 4+5 |
Table 16 The full name of the timeslice-dependent COM_FR
processes and their respective category.