Repository of R scripts developed for the paper "The impact of temporal resolution on public transport accessibility measurement: review and case study in Poland" (2019), accepted for publication in the Journal of Transport Geography on 11th January 2019 (submitted: 18th July 2018).
Stepniak, M., Pritchard, J.P., Geurs, K.T., Goliszek, S., 2019, The impact of temporal resolution on public transport accessibility measurement: review and case study in Poland, Journal of Transport Geography, doi: https://doi.org/10.1016/j.jtrangeo.2019.01.007.
All data used in the study can be downloaded from the Open Data Repository RepOD. Direct link and reference for the dataset:
Stepniak, M., Goliszek, S., Pritchard, J., Geurs, K., 2019. The Impact of Temporal Resolution on Public Transport Accessibility Measurement. [Dataset] RepOD. https://doi.org/10.18150/repod.7727991.
Authors:
- Marcin Stępniak (tGIS, Department of Geography, Complutense University of Madrid, Spain)
- Sławomir Goliszek (Institute of Geography and Spatial Organization, Polish Academy of Sciences)
- John P. Pritchard (Centre for Transport Studies, University of Twente)
- Karst T. Geurs (Centre for Transport Studies, University of Twente)
- R01 Compare precision of accessibility measurement
- R02 Travel time calculations
- R03 Compare travel times
- R04 Compare accessibility measures
- R05 Combine Gini coefficients
- R06 Frequency graph
The inputs for the repo are origin-destination (OD) travel time matrices which use census tract centroids as origins. All ODs are stored in two subfolders in `Data.zip`, which can be downloaded from here.
Destinations in ODs are:
- Subfolder `f03_Aggregates`
  - For proximity measure:
    - `Adm`: Local administration office (1 point)
    - `Zlob`: Nurseries (30 points)
  - For cumulative opportunities measure:
    - `Teatr`: Theatres (21 points)
    - `SpecHC`: Specialized health centres (169 points)
- Subfolder `f03_Aggregates_Ai`
  - For potential accessibility measure:
    - `HOS`: Hospitals with attached number of beds (9 points)
    - `Edu_Lo`: Secondary schools with attached number of classes (68 points)
    - `OBWOD`: census tract centroids with number of inhabitants (1745 points)
For details, please consult the file `Data_description.pdf`, which can be found here.
The following sampling procedures were tested for the study:
a) Systematic Sampling: departure times are selected using a regular interval.
b) Simple Random Sampling: a specified number of sample times are selected at random (without replacement).
c) Hybrid Sampling: departure times are randomly selected from given time intervals (resulting from the applied temporal resolution).
d) Constrained Random Walk Sampling: the first departure time is randomly selected within the first time interval, and the next ones from subsequent time intervals defined by the temporal resolution (+1 resolution ± 0.5 temporal resolution).
For details please consult Owen & Murphy (2018).
A detailed description of the script used to generate departure times can be found in this repo.
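As a rough illustration of the four procedures listed above, the sketch below draws departure minutes for a single 60-minute window. The function name `sample_departures`, its arguments, the use of whole minutes, and the clamping of the random walk to the window are assumptions made for this example; they are not taken from the repo's scripts.

```r
# Hypothetical helper (not part of the repo): returns departure times as
# minutes after the start of a time window, for one temporal resolution.
sample_departures <- function(method, resolution, window = 60) {
  n <- window %/% resolution                 # number of departure times
  starts <- (seq_len(n) - 1) * resolution    # first minute of each interval
  switch(method,
    # a) regular interval
    systematic = starts,
    # b) n distinct minutes drawn from the whole window
    random = sort(sample.int(window, n) - 1),
    # c) one random minute inside each interval
    hybrid = starts + sample.int(resolution, n, replace = TRUE) - 1,
    # d) random start, then steps of 1 resolution +/- 0.5 resolution
    walk = {
      t <- numeric(n)
      t[1] <- sample.int(resolution, 1) - 1
      for (i in seq_len(n)[-1]) {
        t[i] <- t[i - 1] + resolution * runif(1, 0.5, 1.5)
      }
      # Assumption: keep whole minutes and stay inside the window
      pmin(floor(t), window - 1)
    },
    stop("unknown sampling method: ", method)
  )
}

set.seed(1)
sample_departures("systematic", 15)
sample_departures("hybrid", 15)
```

Repeating the random variants many times (the `mc_max` iterations mentioned later) would give a distribution of errors for each temporal resolution.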
The table below shows the applied temporal resolutions (in minutes) and the number of iterations required for a 1-hour-long time window:
resolution | iterations |
---|---|
2 | 30 |
3 | 20 |
4 | 15 |
5 | 12 |
6 | 10 |
10 | 6 |
12 | 5 |
15 | 4 |
20 | 3 |
30 | 2 |
60 | 1 |
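The relation in the table is simply the window length divided by the resolution. A minimal sketch that reproduces it (the variable names are illustrative only):

```r
# Temporal resolutions (minutes) listed in the table above
resolution <- c(2, 3, 4, 5, 6, 10, 12, 15, 20, 30, 60)
window <- 60  # length of the time window in minutes

# Number of departure times needed to cover one window
iterations <- window / resolution
data.frame(resolution = resolution, iterations = iterations)
```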
The repo consists of several consecutive scripts:
The `R01_Ai_Calculations.R` script applies different functions, depending on which accessibility measure is in use. The code selects departure times according to a given sampling method for all considered temporal resolutions. Then it calculates accessibility measures and calculates (aggregated) errors:
- MAPE (Mean Absolute Percentage Error), expressed in %, calculated according to the formula: `mean(abs((y - x)/y))*100`;
- MAE (Mean Absolute Error), expressed in absolute values (e.g. minutes), calculated according to the formula: `mean(abs(y-x))`;
- maxdif (maximum difference), expressed in absolute values (e.g. minutes), calculated according to the formula: `max(abs(y-x))`;

where `x` is an evaluated value, while `y` is a benchmark one.
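The three formulas above can be wrapped into small functions. This is only a sketch: the function names and the example vectors are made up for illustration and are not the names used in the scripts.

```r
# x: evaluated values, y: benchmark values (same length)
mape   <- function(x, y) mean(abs((y - x) / y)) * 100  # in %
mae    <- function(x, y) mean(abs(y - x))              # in units of y
maxdif <- function(x, y) max(abs(y - x))               # in units of y

# Toy travel times in minutes (made-up numbers)
y <- c(10, 20, 40)   # benchmark
x <- c(11, 18, 40)   # estimate at a coarser temporal resolution

mape(x, y)
mae(x, y)
maxdif(x, y)
```

Note that MAPE is undefined when a benchmark value is zero, so it only suits measures whose benchmark is strictly positive.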
Additionally, each of the scripts calculates Gini coefficients for all tested temporal resolutions as well as for the benchmark values.
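The text does not say which Gini implementation the scripts rely on. As an assumption, the sketch below uses the common rank-based formula for a sample of non-negative values; the function name `gini` is illustrative.

```r
# Gini coefficient of a non-negative numeric vector (rank-based formula).
# Assumption: this may differ from the implementation used in the scripts.
gini <- function(v) {
  v <- sort(v)
  n <- length(v)
  sum((2 * seq_len(n) - n - 1) * v) / (n * sum(v))
}

gini(c(5, 5, 5, 5))   # perfectly equal accessibility
gini(c(0, 0, 0, 1))   # all accessibility concentrated in one zone
```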
Particular functions (separate for each of the applied accessibility measures) are stored in separate R scripts:
- proximity (or travel-time-to-nearest-provider) for public administration `Adm` and nurseries `Zlob`. Function stored in `R011_Ai_proximity.R`.
  - Function syntax: `R011_Ai_proximity(file_all, mc_max)`
  - Application of temporal resolution: this script aggregates selected travel times using an arithmetic mean.
- cumulative opportunities (or isochrones) for accessibility to specialized health care `SpecHC` and theatres `Teatr`.
  - Function syntax: `Ai_cumulative(file_all, threshold, mc_max)`
  - Application of temporal resolution: this script aggregates calculated accessibility measures using an arithmetic mean.
- potential accessibility for accessibility to education (secondary schools, `Edu`), hospitals `HOS` and population `OBWOD`. The function uses a negative exponential function: `mass * exp(-beta * TravelTime)`.
  - Application of temporal resolution: this script aggregates selected travel times using harmonic-based means (for details please consult Stępniak & Jacobs-Crisioni (2017)).
  - Function syntax: `Ai_potential(file_mass, file_all, beta, mc_max)`

where:
- `mc_max`: number of iterations in case of simple random, hybrid, and constrained random walk sampling methods;
- `file_all`: defines a list of OD matrices of a given type (for different types of destinations, e.g. HOS);
- `file_mass`: (relative or absolute) path to the file where the quantitative value of attractiveness of destinations is stored. The file should have two columns: the first with the ID (used when merging with destination IDs) and the second with the attractiveness value of a given destination, e.g.:
  - number of beds in a hospital ("Results/t00_data/HOS.csv")
  - number of classes in a school ("Results/t00_data/EduLO.csv")
  - number of population in a census tract ("Results/t00_data/POP.csv")
- `beta`: the value of the beta parameter (applied value: 0.023105)
Each of the sampling procedures is repeated a user-defined number of times (100 in the case of the paper).
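To make the three measures described above concrete, the sketch below computes each of them for a single departure time on a toy OD table. The data frames, column names (`origin`, `dest`, `tt`, `mass`) and the 30-minute threshold are assumptions for illustration; the sketch does not reproduce the repo's functions or their aggregation over departure times.

```r
# Toy OD table: travel time (minutes) from 2 origins to 3 destinations
od <- data.frame(
  origin = c(1, 1, 1, 2, 2, 2),
  dest   = c("A", "B", "C", "A", "B", "C"),
  tt     = c(10, 25, 40, 35, 20, 50),
  stringsAsFactors = FALSE
)
# Attractiveness of each destination (made-up values, e.g. hospital beds)
mass <- data.frame(dest = c("A", "B", "C"), mass = c(100, 50, 200),
                   stringsAsFactors = FALSE)

# Proximity: travel time to the nearest destination
proximity <- aggregate(tt ~ origin, data = od, FUN = min)

# Cumulative opportunities: destinations reachable within a threshold
threshold <- 30  # hypothetical threshold in minutes
cumulative <- aggregate(tt ~ origin, data = od,
                        FUN = function(t) sum(t <= threshold))
names(cumulative)[2] <- "n_reachable"

# Potential accessibility: sum of mass * exp(-beta * TravelTime)
beta <- 0.023105
od_m <- merge(od, mass, by = "dest")
od_m$weighted <- od_m$mass * exp(-beta * od_m$tt)
potential <- aggregate(weighted ~ origin, data = od_m, FUN = sum)
```

In this toy case origin 1 reaches its nearest destination in 10 minutes and has two destinations within the threshold, while origin 2 needs 20 minutes and reaches only one.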
Input: a set of origin-destination matrices stored in two folders:
- `f03_Aggregates` for measures without distance decay (proximity and cumulative opportunities measures)
- `f03_Aggregates_Ai` for measures with distance decay (potential accessibility measure)
Output: a set of `csv` files stored in two folders (file names depend on the type of destination):
- `t06_Results`: files with aggregated values of:
  - MAPE, MAE, maximum difference
  - values of Gini (stored in subfolder `Gini`)
- `t04_Temporary`: files with disaggregated values (calculated separately for each of the randomly selected scenarios):
  - MAPE (data used for Table 4 in the paper)
  - MAE
  - maxdiff
  - Gini

Additionally, in `t04_Temporary` the script saves values of accessibility measures calculated for the systematic approach (one file for each destination and time-window period; for details please consult the `Data_description.pdf` file).
`R02_TravelTime.R` compares the precision of travel time estimation using MAPE, MAE and maximum difference (maxdiff). The script uses all 4 sampling methods.
Input: a set of OD matrices stored in the `f03_Aggregates_Ai` folder (census tract centroids as origin and destination points).
Output: the `TravTime.csv` file stored in the `Results/t05_TTResults` folder, which contains the MAPE, MAE and maxdiff (maximum difference) indicators (MAPE values used for Table 3 in the paper).
`R03_Comparison_TravelTime.R` prepares graphs which present the loss of precision of travel time estimation due to reduced temporal resolution.
Input: the `TravTime.csv` file (stored in the `t05_TTResults` folder), which contains MAPE, MAE and maxdiff of travel times aggregated for different temporal resolutions and obtained using different sampling methods.
Output:
- `TravelTime_sampling.png`: figure which compares the quality of different sampling methods (Figure 3 in the paper).
- `TravelTime_estimation.png`: figure which presents the loss of precision (MAPE and MAE) of the hybrid model in different 1-hour-long scenarios and their average (Figure 5 in the paper).
The R04 script (compare accessibility measures) prepares graphs which present the loss of precision of accessibility measurement due to reduced temporal resolution.
Input: a set of files (one per destination) with MAPE values stored in the `t06_Results` folder.
Output:
- `Ai_Sampling.png`: compares the quality of different sampling methods (Figure 4 in the paper).
- `Ai_Hybrid.png`: compares all measures calculated for particular destinations, using MAPE of the hybrid model in different 1-hour-long scenarios (not used in the paper).
- `Ai_Hybrid_complex.png`: the same as above, with an added zoom-in that excludes the curves for cumulative opportunities (Figure 6 in the paper).
- `Ai_H_scenarios.png`: presents MAPE of the hybrid model in different 1-hour-long scenarios, for different types of measures and destinations (Figure 7 in the annex).
`R05_Gini_Ai.R` combines Gini coefficients stored in separate files (one for each destination and time window) and saves the output to an xlsx file.
Input: a set of files (one per destination and time window) with Gini coefficients stored in the `t06_Results` folder.
Output: `Gini_table.xlsx` stored in the `t07_Graphs` folder (data used for Table 5 in the paper).
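Combining several per-destination files into one table can be sketched in base R as follows. The folder, file names, column names and numbers are all made up for this example, and writing to xlsx (which needs an add-on package) is left out.

```r
# Create two small example files in a temporary folder (hypothetical names
# and made-up values, only so that the sketch is self-contained)
dir_gini <- file.path(tempdir(), "gini_example")
dir.create(dir_gini, showWarnings = FALSE)
write.csv(data.frame(resolution = c(5, 15), gini = c(0.31, 0.33)),
          file.path(dir_gini, "Gini_HOS_07.csv"), row.names = FALSE)
write.csv(data.frame(resolution = c(5, 15), gini = c(0.28, 0.30)),
          file.path(dir_gini, "Gini_HOS_08.csv"), row.names = FALSE)

# Read every csv file and stack them, remembering where each row came from
files <- list.files(dir_gini, pattern = "\\.csv$", full.names = TRUE)
parts <- lapply(files, function(f) {
  d <- read.csv(f)
  d$source <- basename(f)
  d
})
gini_table <- do.call(rbind, parts)
```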
A simple script used to draw a graph which presents the total number of vehicles in 1-hour-long periods of time (Figure 2 in the paper).
Input: selected GTFS files: `stop_times.txt`, `trips.txt`, `calendar.txt` and `routes.txt`, stored in the `Results/p00_data/GTFS_Szczecin` folder.
Output: the graph `Plot_Freq.png` stored in the `Results/p07_Graphs/` folder (Figure 2 in the paper).
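How the script counts vehicles is not described here. One plausible approach, shown as an assumption, is to count trips by the hour of their first departure in `stop_times.txt`. The sketch uses a small inline data frame with the standard GTFS fields `trip_id`, `stop_sequence` and `departure_time` instead of reading the real feed.

```r
# Toy extract of stop_times.txt (made-up trips)
stop_times <- data.frame(
  trip_id        = c("t1", "t1", "t2", "t2", "t3", "t3"),
  stop_sequence  = c(1, 2, 1, 2, 1, 2),
  departure_time = c("06:55:00", "07:05:00", "07:10:00", "07:20:00",
                     "07:40:00", "07:50:00"),
  stringsAsFactors = FALSE
)

# Keep the first stop of every trip
ord <- stop_times[order(stop_times$trip_id, stop_times$stop_sequence), ]
first_stop <- ord[!duplicated(ord$trip_id), ]

# Hour of departure (text before the first colon) and trips per hour
first_stop$hour <- as.integer(sub(":.*$", "", first_stop$departure_time))
freq <- table(first_stop$hour)
```

The resulting `freq` table could then be passed to a plotting function such as `barplot()` to obtain a frequency graph.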
This document is created within the MSCA CAlCULUS project.
This project has received funding from the European Union's Horizon 2020 research and innovation Programme under the Marie Sklodowska-Curie Grant Agreement no. 749761.
The views and opinions expressed herein do not necessarily reflect those of the European Commission.
License for scripts: CC-BY-4.0