otp-routing: A Shell repository from dfsnow

OTP Routing

This is a Docker container designed to calculate large-scale distance matrices for groups of Census tracts or blocks. It uses counties as a unit of work, taking a county FIPS code and the files generated by otp-resources as inputs and saving outputs to /resources/outputs/$GEOID/ inside the container (where $GEOID is the FIPS code of the county).

Inputs

The container takes the following inputs as Docker environment variables. All of them are required:

GEOID (the five-digit, 2010 FIPS code for a U.S. county)
TRAVEL_MODE (the travel mode within OTP: 'CAR', 'TRANSIT,WALK', or 'WALK'; this also determines which .pbf file to use)
TYPE (the type of matrix to create, can be 'TRACT' or 'BLOCK')
OVERWRITE_GRAPH (boolean for whether or not to overwrite the OTP-created Graph.obj; set to 'TRUE' when switching between travel modes)
MAX_TRAVEL_TIME (the maximum travel time before cutoff, in seconds)
MAX_WALK_DIST (the maximum walking distance before cutoff, in meters)
CHUNKS (the number of chunks to divide the input file into, keep high for blocks and low for tracts)
MAX_THREADS (the maximum number of threads to process jobs with)
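
Because every variable is required, a wrapper script could check them before starting the container. The following is a minimal sketch, assuming a hypothetical helper named check_inputs that is not part of the repository. It only verifies that each variable is non-empty and that TRAVEL_MODE and TYPE hold one of the values listed above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of otp-routing): confirm that every
# required variable is set and that the enumerated ones hold a listed value.
check_inputs() {
    for name in GEOID TRAVEL_MODE TYPE OVERWRITE_GRAPH \
                MAX_TRAVEL_TIME MAX_WALK_DIST CHUNKS MAX_THREADS; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done

    case "$TRAVEL_MODE" in
        CAR|TRANSIT,WALK|WALK) ;;
        *) echo "invalid TRAVEL_MODE: $TRAVEL_MODE" >&2; return 1 ;;
    esac

    case "$TYPE" in
        TRACT|BLOCK) ;;
        *) echo "invalid TYPE: $TYPE" >&2; return 1 ;;
    esac
}
```

Calling check_inputs returns a non-zero status and prints the offending variable name to stderr when something is missing or unrecognized.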

Each OTP matrix calculation requires 4 input files: a PBF of the relevant area, a CSV of origin locations, a CSV of destination locations, and a zip file of any GTFS feeds in the buffered county.

The container will look for these input files in /resources/graphs/$GEOID. It ingests the files exactly as otp-resources outputs them, so you only need to mount the same /resources/graphs/ directory in both containers. An example of this setup is provided in submit_jobs_simple.sh.
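
As a rough illustration of how the environment variables and the shared mount fit together, the sketch below prints a docker run command for one county instead of executing it. It is not taken from submit_jobs_simple.sh. The image name, the host path, the extra mount for /resources/outputs, and all numeric settings are assumptions chosen for the example.

```shell
#!/bin/sh
# Dry run: print a docker invocation for a single county.
# IMAGE and HOST_RESOURCES are hypothetical placeholders; the numeric values
# are illustrative only, not recommendations from the project.
IMAGE="${IMAGE:-otp-routing}"
HOST_RESOURCES="${HOST_RESOURCES:-$HOME/resources}"

build_run_command() {
    geoid="$1"
    echo docker run --rm \
        -v "$HOST_RESOURCES/graphs:/resources/graphs" \
        -v "$HOST_RESOURCES/outputs:/resources/outputs" \
        -e "GEOID=$geoid" \
        -e "TRAVEL_MODE=CAR" \
        -e "TYPE=TRACT" \
        -e "OVERWRITE_GRAPH=TRUE" \
        -e "MAX_TRAVEL_TIME=3600" \
        -e "MAX_WALK_DIST=5000" \
        -e "CHUNKS=5" \
        -e "MAX_THREADS=4" \
        "$IMAGE"
}

build_run_command 17031
```

Printing the command first makes it easy to review the mounts and settings before submitting a large batch of counties.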

Outputs

This container outputs a CSV distance matrix with n * m rows and 3 columns, where n is the number of origins and m is the number of destinations. Sample output:

origin,destination,minutes
17031010100,17031010201,19.37
17031010100,17031010202,10.35
17031010100,17031010300,10.77
17031010100,17031010400,13.33
17031010100,17031010501,11.03
17031010100,17031010502,13.95
17031010100,17031010503,16.35
17031010100,17031010600,18.35

Each CSV is compressed with bzip2 to save space (some block matrices can be very large). Output files are saved to /resources/outputs/$GEOID/, and each output file is named according to its $GEOID, $TYPE, and $TRAVEL_MODE.
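
Because the outputs are compressed, it is usually easier to stream them through a filter than to decompress them to disk. The sketch below assumes the three-column layout and header shown in the sample output. The function name pairs_within is hypothetical, and the file path in the usage comment is only an example.

```shell
#!/bin/sh
# Keep the header plus every origin-destination pair whose travel time
# (third column, in minutes) is at or below the given limit.
# Reads CSV on stdin so it can sit behind any decompressor.
pairs_within() {
    awk -F, -v limit="$1" 'NR == 1 || $3 + 0 <= limit + 0'
}

# Usage (path is an assumption):
#   bzcat /resources/outputs/36061/36061-output-TRACT-CAR.csv.bz2 | pairs_within 15
```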

Running Without Root

If you need to run this container but don't have the root privileges necessary to install Docker, try using udocker. There are only two changes required when using udocker:

Alias docker to the udocker executable
Manually create the directories that you plan to store resources in (udocker seems to have trouble creating new directories on the host)
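
The two changes might look like the sketch below. It uses a shell function rather than an alias, because bash does not expand aliases in non-interactive scripts by default. In an interactive shell a plain alias works as well. The helper name prepare_dirs and the base path passed to it are assumptions.

```shell
#!/bin/sh
# Route every docker call to udocker.
docker() {
    udocker "$@"
}

# Create the host directories up front so that udocker does not have to.
# Mirrors the /resources/graphs/$GEOID and /resources/outputs/$GEOID layout.
prepare_dirs() {
    base="$1"
    geoid="$2"
    mkdir -p "$base/graphs/$geoid" "$base/outputs/$geoid"
}

# Usage (base path is hypothetical):
#   prepare_dirs "$HOME/resources" 17031
```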

Deleting Bad GTFS Feeds

blacklist.csv is a list of GTFS feeds that are improperly formatted and thus break OTP. You can delete them from every folder in the graphs directory using xargs and find, for example:

xargs -a blacklist.csv -I filename find ~/resources/graphs/ -name filename -delete
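
Since -delete is irreversible, it can help to preview the matches first. The sketch below follows the same pattern as the command above but uses -print instead of -delete. It assumes blacklist.csv holds one file name per line, reads the list from stdin rather than with the GNU-specific -a option, and uses {} as the placeholder so that the directory path is never rewritten by accident. The function name list_blacklisted is hypothetical.

```shell
#!/bin/sh
# Print every file under the graphs directory whose name appears in the
# blacklist, without deleting anything.
list_blacklisted() {
    blacklist="$1"
    graphs_dir="$2"
    xargs -I {} find "$graphs_dir" -name {} -print < "$blacklist"
}

# Usage:
#   list_blacklisted blacklist.csv ~/resources/graphs/
```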

Extracting a Subset of Tracts or Blocks

You can use the example extract_from_bzip.sh script to extract a smaller segment of a larger file. The script is a quick one-liner that uses awk to match the GEOIDs specified in a filename of your choice. For example, the following code would extract the tracts specified in list_of_geoids.csv from the file 36061-output-TRACT-CAR.csv.bz2.
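
A minimal sketch of this kind of awk match follows. It is written independently of extract_from_bzip.sh, whose exact contents and arguments may differ. It assumes that list_of_geoids.csv holds one GEOID per line with no header, that the list is not empty, and that rows are matched on the origin column. The function name extract_geoids and the output file name are hypothetical.

```shell
#!/bin/sh
# Keep the header plus every row whose origin (first column) appears in the
# GEOID list. The matrix itself is read from stdin.
extract_geoids() {
    # While reading the first file (NR == FNR), store each GEOID as a key.
    # For stdin ("-"), print the header line and any row with a stored origin.
    awk -F, 'NR == FNR { keep[$1]; next } FNR == 1 || ($1 in keep)' "$1" -
}

# Usage:
#   bzcat 36061-output-TRACT-CAR.csv.bz2 | extract_geoids list_of_geoids.csv > subset.csv
```

Streaming through bzcat avoids writing the full decompressed matrix to disk, which matters for the larger block-level files.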