Prism is a tool to extract public transport data from OpenStreetMap. It allows you to generate a GTFS feed or GeoCSV extracts from an OSM file.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Prism

Use

Once installed, prism can be used as follows:

poetry run python prism/cli.py osm_file -gtfs

osm_file is an OSM extract in osm.xml or osm.pbf format. You can get one from Geofabrik or OSM France, for instance.

If the area covered by your data is large, it is recommended, for performance reasons, to filter it so that only the relevant objects are kept, for instance with osmium:

osmium tags-filter data.osm.pbf type=route_master type=route -o pt_data.osm.pbf

An example OSM extract can be found in this repo (tests/data/osm/abidjan_test_data.osm.pbf).

NB: Please respect the OSM licence. If you use or distribute the data generated by prism, you need to visibly credit OpenStreetMap contributors. Other rights and responsibilities may apply depending on your usage; read more on the official OpenStreetMap website.

An optional config file can be passed with the -c param to tune the OSM extraction and conversion behaviour. It must be a valid JSON file.

You also need to specify one or more output formats:

  • -gtfs : to create a GTFS file
  • -csv : to extract transport objects in GeoCSV files (useful for debugging purposes and dataviz applications)

Here is a more detailed example with more parameters:

poetry run python prism/cli.py tests/data/osm/abidjan_test_data.osm.pbf --outdir out/ --loglevel=DEBUG --config example_config.json -csv -gtfs

Install

TODO: for now, only a development install is available. Clone this repo and run poetry install.

How does it work?

Extraction

First, OSM relations with type=route_master and type=route tags are extracted. The config file can be used to keep only a specific network, operator or public transport mode.
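The filtering step can be pictured with a minimal sketch. This is not prism's actual code: the `keep_relation` helper, the shape of the `filters` mapping and the sample tag values are all assumptions made for illustration, and relations are represented as plain tag dictionaries rather than parsed OSM objects.

```python
ROUTE_TYPES = {"route_master", "route"}


def keep_relation(tags, filters):
    """Return True if the relation is a route or route_master and passes every filter.

    `filters` is a hypothetical structure mapping a tag key
    (for instance "network" or "operator") to the set of accepted values.
    """
    if tags.get("type") not in ROUTE_TYPES:
        return False
    return all(tags.get(key) in accepted for key, accepted in filters.items())


# Made-up relations, described only by their tags.
relations = [
    {"type": "route", "route": "bus", "network": "NetA", "ref": "12"},
    {"type": "route", "route": "ferry", "network": "NetA", "ref": "B1"},
    {"type": "multipolygon", "name": "Some lake"},
    {"type": "route_master", "route_master": "bus", "network": "NetB", "ref": "7"},
]

filters = {"network": {"NetA"}}
kept = [rel for rel in relations if keep_relation(rel, filters)]
print([rel["ref"] for rel in kept])
```

With an empty `filters` mapping, every route and route_master relation is kept, which mirrors running the extraction without any restriction.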

Then the OSM objects with platform role and all the ways are extracted. The coordinates constituting the tracks are then processed to build a continuous path for each route.
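To give an idea of what building a continuous path involves, here is a simplified sketch. It assumes the ways are already listed in travel order and that each way is a list of (lon, lat) tuples; the `stitch_ways` function is hypothetical, and the real processing in prism may handle more cases (unordered ways, roundabouts, gaps).

```python
def stitch_ways(ways):
    """Join ordered ways into a single path, reversing a way when it is drawn backwards."""
    path = []
    for index, way in enumerate(ways):
        way = list(way)
        if not path:
            # Orient the first way so that its last node touches the next way.
            if index + 1 < len(ways):
                next_ends = (ways[index + 1][0], ways[index + 1][-1])
                if way[-1] not in next_ends and way[0] in next_ends:
                    way.reverse()
            path.extend(way)
            continue
        # Flip the way if only its last node matches the end of the current path.
        if way[0] != path[-1] and way[-1] == path[-1]:
            way.reverse()
        if way[0] == path[-1]:
            path.extend(way[1:])  # skip the shared node
        else:
            path.extend(way)  # gap in the data: append as-is
    return path


# Three made-up ways; the second one is drawn in the opposite direction.
ways = [
    [(0, 0), (1, 0)],
    [(2, 0), (1, 0)],
    [(2, 0), (3, 0)],
]
print(stitch_ways(ways))
```

Ways that do not connect are appended unchanged in this sketch, so a gap in the OSM data shows up as a jump in the resulting track.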

Shit in, shit out: creating good and usable transport data is not easy. If you contribute to OSM, we recommend using quality assurance tools such as the Jungle Bus validation ruleset for JOSM to check and improve the quality of the data you create. The difference will be obvious in your prism extracts ;)

Conversion

Each output format (csv / gtfs) comes with its own specific conversion rules.

CSV

The csv output format creates a zip with several csv files.

  • OSM route_master objects are extracted in the lines.csv file
  • OSM route objects are extracted in the routes.csv file
  • OSM stops (route members with platform role) are extracted in the stop_points.csv file
  • Some additional files come along, with the links between objects (which stops belong to which routes, etc.)

A few relevant tags are chosen for each object and added as dedicated columns in the csv files.

The stops file has latitude and longitude columns. The routes and lines files have a shape column that contains the computed track of the route, in WKT format. Either spreadsheet software or GIS software can be used to open these files.
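The shape column can also be read programmatically. The sketch below uses only the Python standard library and makes several assumptions: the shape is a simple WKT LINESTRING, and the `route_id` and `name` column names as well as the sample row are invented for the example (only the `shape` column name comes from the description above).

```python
import csv
import io


def parse_linestring(wkt):
    """Parse 'LINESTRING (x y, x y, ...)' into a list of (x, y) float tuples."""
    wkt = wkt.strip()
    if not wkt.upper().startswith("LINESTRING"):
        raise ValueError(f"not a LINESTRING: {wkt[:20]!r}")
    body = wkt[wkt.index("(") + 1 : wkt.rindex(")")]
    return [tuple(float(v) for v in pair.split()) for pair in body.split(",")]


# Made-up content standing in for a routes.csv file.
sample = io.StringIO(
    "route_id,name,shape\n"
    'r1,Example route,"LINESTRING (-4.02 5.32, -4.01 5.33, -4.00 5.35)"\n'
)

for row in csv.DictReader(sample):
    coords = parse_linestring(row["shape"])
    print(row["route_id"], len(coords), coords[0])
```

For real files, replace the `io.StringIO` object with an opened csv file. Very long shapes may exceed the default field size limit of the `csv` module, in which case `csv.field_size_limit` can be raised. A dedicated geometry library is a better fit if the shapes turn out to use other WKT types.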

GTFS

prism will create a GTFS feed with the following content:

  • OSM route_master objects are extracted in the routes.txt file
  • The network or operator tags of the route_master objects are used to create the agency.txt file, depending on your config
  • OSM stops are extracted in the stops.txt file
  • OSM route objects are used to create the trips of each GTFS route
  • The hours tags (interval, opening_hours and interval:conditional) are used to define the GTFS services (calendar.txt), the stop_times and the frequencies.
  • The track of each OSM route is used to create the shapes.txt file
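As an illustration of how an interval tag can end up in frequencies.txt, here is a minimal sketch. It assumes the interval value is written as minutes, HH:MM or HH:MM:SS, and that the service window has already been derived from opening_hours; the `interval_to_seconds` helper, the trip identifiers and the time windows are hypothetical, and prism's own logic (notably for interval:conditional) is certainly richer.

```python
import csv
import io


def interval_to_seconds(value):
    """Convert an interval value ('15', '00:15' or '00:15:00') to seconds."""
    parts = [int(p) for p in value.strip().split(":")]
    if len(parts) == 1:  # minutes only
        return parts[0] * 60
    if len(parts) == 2:  # HH:MM
        hours, minutes = parts
        return hours * 3600 + minutes * 60
    if len(parts) == 3:  # HH:MM:SS
        hours, minutes, seconds = parts
        return hours * 3600 + minutes * 60 + seconds
    raise ValueError(f"unsupported interval value: {value!r}")


# Made-up trips with their interval tag and an already-known service window.
trips = [
    {"trip_id": "trip_1", "interval": "15", "start": "05:00:00", "end": "22:00:00"},
    {"trip_id": "trip_2", "interval": "00:07:30", "start": "06:00:00", "end": "20:00:00"},
]

out = io.StringIO()
writer = csv.DictWriter(
    out,
    fieldnames=["trip_id", "start_time", "end_time", "headway_secs"],
    lineterminator="\n",
)
writer.writeheader()
for trip in trips:
    writer.writerow(
        {
            "trip_id": trip["trip_id"],
            "start_time": trip["start"],
            "end_time": trip["end"],
            "headway_secs": interval_to_seconds(trip["interval"]),
        }
    )
print(out.getvalue())
```

The four column names are the standard frequencies.txt fields from the GTFS specification.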

Using the prism config file, you can set some default values, choose how to interpolate stop_times within a trip and define GTFS special values (such as the feed_info.txt content).
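To make the idea of stop_times interpolation concrete, here is one possible strategy, sketched under strong assumptions: stops are spread evenly in time over a known trip duration. The function names and the sample values are hypothetical, and this is not necessarily one of the strategies prism offers.

```python
def interpolate_stop_times(start_secs, duration_secs, stop_count):
    """Spread stop_count stops evenly between the trip start and the trip end."""
    if stop_count < 2:
        return [start_secs] * stop_count
    step = duration_secs / (stop_count - 1)
    return [round(start_secs + i * step) for i in range(stop_count)]


def format_gtfs_time(secs):
    """Format seconds since midnight as HH:MM:SS (GTFS allows hours above 23)."""
    hours, remainder = divmod(secs, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# A made-up trip: departs at 06:00:00, lasts 30 minutes, serves 4 stops.
times = interpolate_stop_times(6 * 3600, 30 * 60, 4)
print([format_gtfs_time(t) for t in times])
```

An alternative would be to weight each step by the distance between consecutive stops along the shape, which tends to give more realistic times when stops are unevenly spaced.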

Credits

This project has been developed by the Jungle Bus team. The official source code repository is at https://github.com/Jungle-Bus/prism

The code in this repository is under the GPL-3.0 license.

This project uses OpenStreetMap data, licensed under the ODbL by the OpenStreetMap Foundation. You need to visibly credit OpenStreetMap and its contributors if you use or distribute the data generated from this project. Read more on the official OpenStreetMap website.

Want to know more about OSM and GTFS? Check out this infographic.

This project is heavily inspired by the following previous projects:

Big thanks to their open source contributors 💖