/osm_can_bics

A national bicycle infrastructure dataset from OpenStreetMap with consistent labels.


OSM Can-BICS

OSM Can-BICS classifies OpenStreetMap (OSM) data to create a network dataset of bicycle facilities, labeled according to the Canadian Bikeway Comfort and Safety Classification System (Can-BICS).

Bicycle facilities across Canada


Introduction

To create bicycling facility datasets, many studies assemble open data provided by cities. However, compiling open data sources leads to inconsistent labeling of facilities across cities, and many studies do not provide a statement of accuracy. OpenStreetMap offers a more collaborative approach through a central repository and benefits from community contributors with on-the-ground knowledge. OSM Can-BICS provides a research-ready dataset, complete with measures of bias and accuracy in 15 test cities, and a well-defined path for future change detection studies.

We developed OSM Can-BICS using the following steps:

  1. Process test cities to develop the classification and perform an accuracy assessment.
  2. Create the national network dataset.

The code and dataset are under development. Our preliminary national dataset is available from ArcGIS Online.

Technology

Test cities

We selected 15 cities across Canada, stratified by population (small < 50,000; medium 50,000 to 500,000; and large > 500,000), to collect reference data and perform an accuracy assessment. Both the reference data and OSM data were acquired in the summer of 2020.
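The population strata can be expressed as a small lookup. The following is a minimal sketch in Python, not part of the project's R code; the function name and the example populations are hypothetical, and a population of exactly 500,000 is assumed to fall in the medium stratum.

```python
def population_stratum(population: int) -> str:
    """Return the size stratum for a city, using the thresholds stated above."""
    if population < 0:
        raise ValueError("population must be non-negative")
    if population < 50_000:
        return "small"
    if population <= 500_000:  # assumption: 500,000 itself counts as medium
        return "medium"
    return "large"


# Hypothetical populations for illustration only (not the study's test cities).
examples = {"Town A": 12_000, "City B": 140_000, "City C": 2_700_000}
strata = {name: population_stratum(pop) for name, pop in examples.items()}
print(strata)
```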

Reference data

We collected more than 2000 reference points using street-level imagery, aerial imagery, and other reference data where available (e.g. newspaper articles about new projects, PDF maps provided by cities, and local knowledge).

We collected a stratified random sample at the following rates:

  • 1 sample per 10 km for low and medium comfort infrastructure (painted lanes and multi-use paths).
  • 1 sample per km for high comfort infrastructure (bike paths, cycle tracks, and local street bikeways). These facilities are sampled more densely because they are relatively uncommon and are important for cycling safety and preference.
  • 1 sample per km for facilities that are on OSM, but not open data (to provide greater scrutiny).
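To make the sampling rates concrete, the sketch below converts a stratum's network length into a target number of reference points. It is an illustration in Python, not the project's code: the stratum keys, the function name, the example lengths, and the choice to round up are all assumptions.

```python
import math

# Kilometres of network per sample, taken from the rates listed above.
KM_PER_SAMPLE = {
    "low_medium_comfort": 10,  # painted lanes and multi-use paths
    "high_comfort": 1,         # bike paths, cycle tracks, local street bikeways
    "osm_only": 1,             # on OSM but not in open data
}


def target_sample_count(stratum: str, length_km: float) -> int:
    """Number of reference points to draw for a stratum of the given length."""
    if length_km < 0:
        raise ValueError("length_km must be non-negative")
    # Assumption: partial intervals are rounded up to a whole sample.
    return math.ceil(length_km / KM_PER_SAMPLE[stratum])


# Hypothetical lengths for one city.
print(target_sample_count("low_medium_comfort", 250))  # 250 km at 1 per 10 km
print(target_sample_count("high_comfort", 12.4))       # 12.4 km at 1 per km
```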

The reference data were randomly split into training (70%) and testing (30%) sets. Four interpreters used the interpretation guidelines to assign Can-BICS labels and, where necessary, marked locations for later review.
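A 70/30 random split can be sketched as follows. This is an illustrative Python example rather than the project's implementation; the seed, the function name, and the use of 2000 integer IDs as stand-ins for reference points are assumptions, and the split here is a simple shuffle that is not stratified by city or facility type.

```python
import random


def split_train_test(items, train_fraction=0.7, seed=2020):
    """Shuffle a copy of items and split it into (training, testing) lists."""
    if not 0 <= train_fraction <= 1:
        raise ValueError("train_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical reference point IDs.
point_ids = list(range(2000))
train, test = split_train_test(point_ids)
print(len(train), len(test))
```

Fixing the seed means the same points land in the testing set on every run, which keeps the accuracy assessment comparable between runs.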

OSM Data

OSM data (OpenStreetMap Contributors, 2021) were downloaded for the 15 test cities in the summer of 2020 (to match the reference data) using the query highway = * in the R package osmdata.
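The osmdata package sends requests to the Overpass API. As a rough illustration of what a highway = * request looks like, the Python sketch below builds an Overpass QL query string for a bounding box. It is not the query generated by osmdata: it selects ways only, and the function name, bounding box, and timeout are assumptions.

```python
def highway_query(south: float, west: float, north: float, east: float,
                  timeout: int = 25) -> str:
    """Build an Overpass QL query for all ways carrying any highway tag."""
    if south >= north or west >= east:
        raise ValueError("expected south < north and west < east")
    bbox = f"{south},{west},{north},{east}"  # Overpass order: S, W, N, E
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'way["highway"]({bbox});\n'
        "out geom;"
    )


# Hypothetical bounding box for illustration.
query = highway_query(49.20, -123.22, 49.32, -123.02)
print(query)
```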

Data acquisition for processing

Two additional steps are needed to download large files:

  1. Download the reference and OSM data and extract to data/test_cities/ (65 MB).
  2. Download landcover data for Canada (2 GB), store and unpack where convenient, and update the path in /code/paths_and_variables.R.

Processing

  1. /code/test_cities/classify.R classifies OSM data for the 15 test cities.
     • Functions to classify OSM data are located in /code/Can_BICS_OSM_classify.R. This file can be modified for different research goals.
     • Supporting functions are located in /code/Can_BICS_OSM_functions.R.
  2. /code/test_cities/overall_accuracy.RMD generates an accuracy assessment. Requires that classify.R has been run.
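An accuracy assessment of this kind compares the label assigned by the classifier with the reference label at each test point. The Python sketch below shows the basic calculation of a confusion table and overall accuracy. It is not the content of overall_accuracy.RMD; the function names and the five example points are hypothetical.

```python
from collections import Counter


def confusion_counts(reference, predicted):
    """Count (reference label, predicted label) pairs."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    return Counter(zip(reference, predicted))


def overall_accuracy(reference, predicted):
    """Share of points where the predicted label equals the reference label."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


# Hypothetical labels for five test points.
reference = ["cycle track", "painted lane", "multi-use path", "cycle track", "bike path"]
predicted = ["cycle track", "painted lane", "painted lane", "multi-use path", "bike path"]

print(overall_accuracy(reference, predicted))
for (ref, pred), n in sorted(confusion_counts(reference, predicted).items()):
    print(f"{ref} -> {pred}: {n}")
```

Off-diagonal pairs in the confusion table (here, a cycle track predicted as a multi-use path) show which categories are confused with each other, which an overall accuracy figure alone does not reveal.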

Figure 1. Classification algorithm.

National dataset

The national dataset uses OSM data from Geofabrik and, because of the large extent, stores it in PostGIS. Processing is done province by province, and the processed data are combined into the national dataset.

Database setup

Due to the volume of data, we use PostGIS to store and manage the data. These steps were tested on Windows 10, but should be adaptable to other platforms.

  1. Install PostGIS.
  2. Install osm2pgsql.
  3. Create a default account called "postgres" with no password. This setup is intended for scripts executed from a local environment only (take care in network-accessible environments!).

Data acquisition

Three additional steps are needed to download large files:

  1. Download the OSM data by running /code/national/download_data.R.
  2. Download landcover data for Canada (2 GB), store and unpack where convenient, and update the path in /code/paths_and_variables.R.
  3. Download Canada Census Subdivision Boundaries, store and unpack where convenient, and update the path in /code/paths_and_variables.R.

Processing

  1. /code/national/classify.R classifies OSM data for the provinces.
     • Functions to classify OSM data are located in /code/Can_BICS_OSM_classify.R.
     • Supporting functions are located in /code/Can_BICS_OSM_functions.R.
     • You can subset the data and run multiple classifications in parallel for efficiency.
     • In testing, processing the national dataset took approximately one weekend (start on Friday afternoon, ready for Monday morning).
  2. /code/national/export_data.R exports data into shapefile and JSON formats.
  3. /code/national/reporting.RMD generates summary statistics. Requires that /code/national/classify.R and /code/national/export_data.R have been run.
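The province-by-province pattern, with several classifications running at once and the results combined afterwards, can be sketched as below. This is a generic Python illustration, not how classify.R is organized: classify_province is a hypothetical stand-in that does no real work, and the province list is a subset chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Subset of province codes, for illustration only.
PROVINCES = ["BC", "AB", "ON"]


def classify_province(code: str) -> dict:
    """Hypothetical stand-in for one province's classification run."""
    return {"province": code, "status": "classified"}


def classify_all(provinces, max_workers=3):
    """Run one classification per province concurrently and collect results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(classify_province, provinces))


results = classify_all(PROVINCES)
print(results)
```

Threads suit work that mostly waits on a database or external process. For CPU-bound classification, separate processes (or separate R sessions, one per subset) would likely be the more practical choice.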