This open source project is a macroscopic passenger transport model for the region of Germany. It supports research aimed at designing an integrated, renewable energy system by providing insights into mobility behaviour. The reference publication can be found here: https://doi.org/10.1186/s12544-022-00568-9
It uses the quetzal transport modelling framework: https://github.com/systragroup/quetzal
The method is oriented towards classical four-step transport modelling with emphasis on the demand side.
The directory structure is as follows:
quetzal_germany/
-- input/
-- input_static/
-- model/
---- base/
---- scenarioX/
-- notebooks/
---- log/
-- output/
---- base/
---- scenarioX/
While input and output data as well as (temporary) model files are stored in separate folders, the Jupyter notebooks contain all data management and modelling. Briefly, they are structured as follows (X as wildcard):
- `prep1X`: Generation of transport demand zones and all transport networks in high resolution
- `prep2X`: Aggregation of the PT network graph and connection to transport demand sources and sinks
- `prep3X`: Calculation of shortest paths and enrichment with performance attributes for PT and cars, respectively
- `prep4X`: Data preparation for generation and destination choice models
- `calX`: Generation of the calibration dataset and estimation of demand model parameters (only applicable with access to calibration data; see below)
- `model_generation_X`: Trip generation choice for non-compulsory trips, either from exogenous data (MiD2017) or endogenously with a logit model
- `model_destination`: Destination choice for non-compulsory trips
- `model_volumes_X`: Generation of the OD matrix, either from exogenous data (VP2030) or endogenously from previous steps
- `model_mode`: Mode choice and calculation of the composite cost for generation and destination choice
- `model_assignment`: Route assignment and results validation
- `model_inner-zonal`: Calculation of transport system indicators for inner-zonal traffic
- `post_X`: Calculation of emissions or energy demand for the entire passenger transport sector
- `00_launcher`: Automatically runs all preparation and modelling steps in order (see the sketch after this list)
- `00_test_environment`: Run it to see whether your virtual environment is properly set up
- `val_X`: Various validation notebooks for model results
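The `00_launcher` notebook is the intended way to run these steps automatically. Purely as an illustration of what "running the notebooks in order" means, such a sequence could also be scripted, e.g. with `jupyter nbconvert`; note that the launcher additionally sets the scenario name and writes log files, and that the file names below are placeholders, not the real notebook file names:

```python
# Illustration only; the 00_launcher notebook is the intended entry point.
# File names are placeholders; check the notebooks/ folder for the real ones.
import subprocess

notebooks_in_order = ["prep10.ipynb", "prep30.ipynb", "model_mode.ipynb"]  # placeholders
for nb in notebooks_in_order:
    subprocess.run(
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb],
        check=True,
    )
```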
All scenario parameters are saved in the `input/parameters.xls` file. Other malleable input files are located in the same folder, while unchanged input data sits in `input_static/`.
- Create a virtual environment for quetzal models. Choose one of these methods:
  - Either clone the quetzal package into a local folder and create a virtual environment as described here[^1]: https://github.com/systragroup/quetzal
  - Or create a virtual environment manually with all dependencies using conda:
    - Create an environment with the desired Python version, e.g. `conda create -n quetzal python=3.10`
    - Activate this environment (`conda activate quetzal`)
    - Install necessary dependencies: `conda install -c conda-forge "pandas<2.0" geopandas contextily osmnx geopy rtree notebook matplotlib xlsxwriter cython numba scikit-learn scipy xlrd tqdm ray-default pytables sqlalchemy openpyxl==3.0.7`
    - Install quetzal dependencies that are not available on conda for Python > 3.6: `pip install simpledbf biogeme`
    - Clone the quetzal repository to a desired location (`git clone https://github.com/systragroup/quetzal`) and install it as a development version: `pip install -e quetzal/`
- Activate your quetzal environment, if not done yet
- Clone this repository into a local folder: In your terminal, navigate to the position where you want to store the code and type `git clone <this repo's URL>`. Navigate into the folder `quetzal_germany`.
- Download the static input files from Zenodo[^2] into a folder named `input_static/` within the `quetzal_germany` repository (see directory structure)
- Open the local project in Jupyter Notebook (in your terminal, type `jupyter notebook`) and start running the notebooks
- OPTIONAL: Install the car ownership module by downloading it from Zenodo into a folder named `car_ownership`, just like done with `input_static/`
You can test your virtual environment by running the `00_test_environment` notebook. Read potential error messages and install the packages in question using conda, or check the quetzal issues page to see whether the error has occurred before.
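As a quick complement (not a replacement for `00_test_environment`), a minimal import check of the dependencies listed in the installation section could look like this:

```python
# Quick sanity check of the environment; 00_test_environment remains the
# authoritative test. Package list taken from the install commands above.
import importlib

for pkg in ["pandas", "geopandas", "osmnx", "contextily", "numba",
            "sklearn", "biogeme", "quetzal"]:
    try:
        importlib.import_module(pkg)
        print(f"{pkg}: ok")
    except ImportError as err:
        print(f"{pkg}: missing ({err})")
```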
You can execute your first model run from the `00_launcher` notebook by running only the `prep10`, `prep3X` and `model_X` notebooks in the order provided there. Note: If you want to run the model on a laptop (with at least 8 GB of RAM), you should sparsify the OD set or use the NUTS3-level zoning system; please read the corresponding section in `00_launcher`. Otherwise, you'll need 80 GB of RAM.
This repository (together with the static input data) contains road and public transport (incl. air) networks ready for simulation, as well as estimation results for the demand models. It also includes OSM data from the `prep4X` notebooks. Thus, you can simply generate the level-of-service (LoS) shortest-paths stack and then run the classic transport modelling steps. However, the order shifts, as you need the composite cost from mode choice in the generation and destination choice steps (see the order in `00_launcher`).
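For orientation only: in classical logit-based demand models, the composite cost handed from mode choice to generation and destination choice is a logsum over the mode utilities, typically of the form

$$
CC_{od} = -\frac{1}{\mu} \ln \sum_{m} \exp\left(\mu \, V_{od,m}\right),
$$

where $V_{od,m}$ is the utility of mode $m$ between origin $o$ and destination $d$, and $\mu$ is the mode choice scale parameter. The exact specification used in this model is documented in the reference publication.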
Running notebooks with the `00_launcher` generates log files for printed output and error messages, respectively. These can be found under `notebooks/log/`.
You can adjust all assumptions in the `parameters.xls` file if you want to simulate an alternative transport system (see Scenarios).
Detailed descriptions of what the notebooks do can be found as comments within them. Briefly: your StepModel object (always abbreviated as `sm`) is where the magic happens. It stores all tables as attributes (pandas `DataFrame`s) and provides all transport-modelling-specific functions from the quetzal library. Quetzal provides wrapper functions for the classic steps of aggregated transport modelling (trip generation, assignment, etc.), each of which executes a set of more specific functions. Because this model requires a higher degree of customisation, it uses quetzal's more specific functions in many places.
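For orientation, a small notebook cell like the following illustrates the "tables as attributes" pattern; it assumes `sm` has already been built or loaded by a previous step and only uses attribute names mentioned in this Readme:

```python
# Sketch: inspect the StepModel's tables. Assumes `sm` exists in the notebook;
# the attribute names below are the ones referenced in this Readme.
import pandas as pd

for name in ["zones", "links", "nodes", "road_links", "road_nodes"]:
    table = getattr(sm, name, None)
    if isinstance(table, pd.DataFrame):
        print(f"sm.{name}: {len(table)} rows, columns {list(table.columns)[:5]} ...")

print("Coordinate reference system (EPSG):", sm.epsg)
```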
Results of the transport model are computed in `model_assignment`. If you have access to validation data, this notebook will also visualise validation plots. Files and figures are saved under the respective scenario name in `output/`.
Some relevant results are aggregated numbers that are printed in the corresponding model step (e.g. average yearly mileage per car in `model_assignment`). These statements can be found in the output log file under `notebooks/log/out_<scenario name>_<notebook name>.txt`.
You can define your own scenarios "the quetzal way": Open the `parameters.xls` file and add a new column with your scenario name. Name it under "general/description" and refer to "base" as its "general/parent" scenario. All values that you don't change in your new column are taken from the parent column.
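As an illustration of this inheritance (not the code the model actually uses; it assumes the first column of `parameters.xls` holds parameter names such as "general/parent", and `myscenario` is a hypothetical scenario name):

```python
# Sketch only: resolve a scenario column against its parent column.
# Assumes parameter names in the first column and one column per scenario.
import pandas as pd

params = pd.read_excel("input/parameters.xls", index_col=0)

def resolve(scenario):
    """Return a scenario column with missing values filled from its parent."""
    col = params[scenario]
    parent = col.get("general/parent")
    if isinstance(parent, str) and parent in params.columns:
        col = col.fillna(resolve(parent))
    return col

print(resolve("myscenario").head())  # "myscenario" is a hypothetical name
```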
You can now adjust parameters and run the model with the new values. To do so, either use the `00_launcher` and add your scenario name (the column name in `parameters.xls`) to the list of scenarios (fourth cell); all scenario names in this list will be executed in parallel. The other option is to run the notebooks manually and set the variable `scenario` to your scenario name (very first cell).
If you installed the car ownership module, you can compute changed car availabilities by running the correspondingly named notebook from the launcher (or manually). Further details are given in the notebook and in the Readme of the car ownership module.
Notebook `prep10` creates the four-step model (`sm`) with a zones table that you specify. By default, it contains all NUTS3 zones of Germany, but you can limit it to the desired region or refine it with higher-resolution data. The disaggregated notebook uses an aggregation of "Gemeindeverband" zones, which constitute the default model.
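If you only need a sub-region, the restriction could conceptually look like the following sketch; the attribute name `sm.zones` and the column name `NUTS_ID` are assumptions, so check `prep10` for the actual names:

```python
# Sketch: keep only zones of a sub-region, here Berlin and Brandenburg
# (NUTS-1 codes DE3 and DE4). Attribute and column names are assumptions.
mask = sm.zones["NUTS_ID"].str[:3].isin(["DE3", "DE4"])
sm.zones = sm.zones[mask]
```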
Notebooks `prep11` to `prep14` create road and PT networks from OpenStreetMap and Germany-wide GTFS feeds, respectively. They will be saved in `sm.road_links`/`sm.road_nodes` and `sm.links`/`sm.nodes`, respectively. Additionally, a list `sm.pt_route_types` is created. Make sure you uncomment the cell where you spatially restrict the network graph if you want a smaller region. Notebook `prep15` creates distances from all population points in the latest census to your PT stops (make sure to spatially restrict this one too). This data is used to parametrise PT access and egress links between zones' demand centroids and the transport networks.
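As a generic illustration of such a point-to-stop distance computation (not necessarily the method `prep15` uses), a KD-tree query on projected coordinates does the job; the arrays below are random dummy data:

```python
# Generic sketch with dummy data: distance from population points to the
# nearest PT stop, using projected coordinates in metres.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
pop_xy = rng.uniform(0, 50_000, size=(10_000, 2))  # census population points (dummy)
stop_xy = rng.uniform(0, 50_000, size=(500, 2))    # PT stops (dummy)

dist_m, nearest_stop = cKDTree(stop_xy).query(pop_xy, k=1)
print("median access distance [m]:", round(float(np.median(dist_m))))
```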
Notebooks `prep2X` aggregate your network and create access/egress links between zones' demand centroids and the PT stops or road nodes, respectively. Two methods are used for PT network aggregation, which is necessary to reduce computation time for the path finders and all subsequent methods (a sketch of the clustering idea follows this list):
- Clustering short-distance stops
- Aggregation of PT network to relevant trips and stops with simultaneous connection to zone centroids (size and quality of the network depend on your definition of 'relevant')
- Subsequently, the road network gets connected
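For the first method in this list (clustering short-distance stops), the idea can be sketched generically with DBSCAN on projected stop coordinates; the notebooks may well use quetzal's own clustering helpers, and the 300 m radius below is an arbitrary example value:

```python
# Conceptual sketch with dummy data: stops within ~300 m of each other get
# the same cluster label and would be merged into one aggregated stop.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
stops_xy = rng.uniform(0, 20_000, size=(2_000, 2))  # dummy projected coords (m)
labels = DBSCAN(eps=300, min_samples=1).fit_predict(stops_xy)
print("stops:", len(stops_xy), "-> clusters:", labels.max() + 1)
```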
The rest is straightforward to follow with the notebooks' comments and should work for every self-defined region with minor adjustments. At the end of each notebook, in the 'save' cell(s), you find all DataFrames (as attributes of `sm`) that will be relevant in later steps. One additional attribute is always present: `sm.epsg`, which defines the coordinate reference system.
This repository, together with the externally hosted data packages, contains all openly licensed data sources that are necessary for transport modelling in Germany.
However, for estimating the calibration parameters anew (beyond those given in `input/`), you need a Germany-wide mobility survey with trips on "Gemeindeverband" level: "Mobilität in Deutschland 2017 B2" (MiD2017).
Optionally, you can generate the origin-destination matrix exogenously, using origin-destination matrices from the underlying model of the German federal government's transport study "Bundesverkehrswegeplan 2030" (VP2030).
You can apply for access to both data sets using the national Clearing House Transport order form. All `csv` and `xlsx` data tables go into the folder `input/transport_demand`, which is added to the `.gitignore`.
Footnotes

[^1]: If you face problems importing geopandas, consider uninstalling the package `rtree` and reinstalling a version up to 0.9.3 (`conda install -c conda-forge rtree=0.9.3`), or browse through the quetzal issues for discussions on installation.

[^2]: Large input data files are not hosted in this very repository, as they require different handling in git and licensing.