OldTO was a site that showcased historic photographs of Toronto by placing them on a map.
You can read more about it on the Sidewalk Labs Blog.
Here's a screen recording of what OldTO looked like (YouTube):
While OldTO is no longer hosted by Sidewalk Labs, all of the source code is available in this repo and you can run it yourself. The instructions below describe how to do this.
OldTO begins with data from the Toronto Archives, which you can find in data/images.ndjson.
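As a minimal sketch of how a newline-delimited JSON file like this can be read, the snippet below parses one JSON object per line. The helper name `read_ndjson` and the sample field names (`id`, `title`) are assumptions for illustration; the actual records in data/images.ndjson may use different fields.

```python
import json
from typing import Iterable, Iterator


def read_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample records; the real field names may differ.
sample = [
    '{"id": "1", "title": "King St. at Yonge St."}',
    "",
    '{"id": "2", "title": "City Hall"}',
]
records = list(read_ndjson(sample))
print(len(records), records[0]["title"])
```

To read the real file, pass an open file handle instead of the sample list, for example `with open("data/images.ndjson") as f: records = list(read_ndjson(f))`.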
To place the images on a map ("geocode" them), we use a list of Toronto street names and a collection of regular expressions which look for addresses and cross-streets. We send these through the Google Maps Geocoding API to get latitudes and longitudes for the images. We also incorporate a set of points of interest for popular locations like the CN Tower or City Hall.
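The idea of matching addresses and cross-streets against a street list can be sketched roughly as follows. This is an illustration only, not the project's actual patterns: the street list, the suffix set, both regular expressions, and the `find_location` helper are all hypothetical.

```python
import re

# Hypothetical street list; the project keeps its own list of Toronto street names.
STREETS = {"King St", "Yonge St", "Queen St", "Bay St"}

# A street is a capitalized word followed by a common suffix (assumed set).
STREET_SUFFIX = r"(?:St|Ave|Rd|Blvd|Dr)"
STREET = rf"[A-Z][a-z]+ {STREET_SUFFIX}"

# "123 Queen St" style addresses.
ADDRESS_RE = re.compile(rf"\b(\d+) ({STREET})\b")
# "King St. at Yonge St" style cross-streets.
CROSS_RE = re.compile(rf"\b({STREET})\.?,? (?:and|at|&) ({STREET})\b")


def find_location(title):
    """Return (kind, query) for the first address or cross-street found, else None."""
    m = ADDRESS_RE.search(title)
    if m and m.group(2) in STREETS:
        return ("address", f"{m.group(1)} {m.group(2)}")
    m = CROSS_RE.search(title)
    if m and m.group(1) in STREETS and m.group(2) in STREETS:
        return ("cross_street", f"{m.group(1)} & {m.group(2)}")
    return None


print(find_location("Store front, 123 Queen St, looking east"))
print(find_location("King St. at Yonge St., looking north"))
```

In a pipeline like the one described above, the returned query string would be what gets sent on to a geocoder; checking candidates against a known street list helps avoid geocoding phrases that only look like street names.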
Set up dependencies (on a Mac):
brew install coreutils csvkit
OldTO requires Python 3. Once you have this set up, you can install the Python dependencies in a virtual environment via:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
The data for the OldTO site is served via a Python API server. Start by running this:
source venv/bin/activate
oldtoronto/devserver.py data/images.geojson
If you've generated geocodes in a different location, pass that path instead of data/images.geojson.
The OldTO site lives in oldto-site. To build it, you'll need the yarn package manager. Instructions for setting it up are at https://yarnpkg.com/.
You'll also need to get a Google Maps API key. Once you've done this, set the environment variable GMAPS_API_KEY to your own API key:
export GMAPS_API_KEY=...
Webpack needs this to build the site when you run yarn webpack. You can then serve the built site locally using http-server (install with npm install -g http-server).
cd oldto-site
yarn # install dependencies
yarn webpack # bundle JavaScript and build site
cd dist
http-server --proxy=http://localhost:8081
Then visit http://localhost:8080/ to browse the site.
To iterate on the site, use yarn watch:
cd oldto-site
yarn watch &
cd dist
http-server --proxy=http://localhost:8081
First, add your Google Maps API key to the file oldtoronto/settings.py.
Next, download the cached geocodes from here.
Unzip this file into cache/maps.googleapis.com. This will make the geocoding pipeline run faster and more consistently than geocoding from scratch.
With this in place, you can update images.geojson by running:
make
Note: to run the Makefile on macOS, you will probably need to install md5sum, which you can do by running:
brew update && brew install md5sha1sum
Before sending out a PR with geocoding changes, you'll want to run a diff to evaluate the change.
For a quick check, you can operate on a 5% sample and diff that against master:
oldtoronto/geocode.py --sample 0.05 --output /tmp/geocode_results.new.5pct.json
oldtoronto/diff_geocodes.py --sample 0.05 /tmp/geocode_results.new.5pct.json
To calculate metrics using truth data (must have jq installed):
grep -E "$(jq '.features[] | .id' data/truth.gtjson | sed s/\"//g | paste -s -d '|' )" data/images.ndjson > data/test.images.ndjson
oldtoronto/geocode.py --input data/test.images.ndjson
oldtoronto/generate_geojson.py --geocode_results data/test.images.ndjson --output data/test.images.geojson
oldtoronto/calculate_metrics.py --truth_data data/truth.gtjson --computed_data data/test.images.geojson
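The first command above selects the image records whose IDs appear in the truth file. A rough Python equivalent of that selection step is sketched below. It assumes each NDJSON record carries a top-level `id` field that matches the feature `id` in the truth data; the shell version matches the ID anywhere in the line, so the two are not strictly identical. The helper names and sample data are hypothetical.

```python
import json


def truth_ids(truth_geojson):
    """Collect feature IDs from a GeoJSON-style FeatureCollection."""
    return {str(feature["id"]) for feature in truth_geojson["features"]}


def filter_records(ndjson_lines, ids):
    """Keep only the NDJSON lines whose record 'id' is in ids (assumed field name)."""
    kept = []
    for line in ndjson_lines:
        line = line.strip()
        if line and str(json.loads(line).get("id")) in ids:
            kept.append(line)
    return kept


# Hypothetical in-memory stand-ins for data/truth.gtjson and data/images.ndjson.
truth = {"features": [{"id": "101"}, {"id": "103"}]}
images = [
    '{"id": "101", "title": "A"}',
    '{"id": "102", "title": "B"}',
    '{"id": "103", "title": "C"}',
]
test_images = filter_records(images, truth_ids(truth))
print(len(test_images))
```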
To debug a specific image ID, run something like:
oldtoronto/geocode.py --ids 520805 --output /tmp/geocode.json && \
cat oldtoronto/geocode.py.log | grep -v regex
If you want to understand the differences between two images.geojson files, you can use the diff_geojson.py script. This script creates a series of .geojson files showing the differences between an A and a B GeoJSON file. This is useful when working with the data collected from the corrections Google Forms. Use those along with the check_changes_using_* scripts.
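To illustrate the general idea of comparing an A and a B GeoJSON file, here is a minimal sketch that groups feature IDs into added, removed, and changed. It is not the logic of diff_geojson.py itself: the `diff_features` helper, the assumption that every feature has a top-level `id`, and the choice to compare only `geometry` are all illustrative.

```python
def index_by_id(collection):
    """Map feature id -> feature for a GeoJSON FeatureCollection."""
    return {feature["id"]: feature for feature in collection["features"]}


def diff_features(a, b):
    """Return IDs added in B, removed from A, and present in both with different geometry."""
    a_by_id, b_by_id = index_by_id(a), index_by_id(b)
    added = sorted(b_by_id.keys() - a_by_id.keys())
    removed = sorted(a_by_id.keys() - b_by_id.keys())
    changed = sorted(
        fid
        for fid in a_by_id.keys() & b_by_id.keys()
        if a_by_id[fid].get("geometry") != b_by_id[fid].get("geometry")
    )
    return {"added": added, "removed": removed, "changed": changed}


def point(fid, lng, lat):
    """Build a minimal Point feature (hypothetical sample data)."""
    return {"type": "Feature", "id": fid,
            "geometry": {"type": "Point", "coordinates": [lng, lat]}}


a = {"type": "FeatureCollection",
     "features": [point("1", -79.38, 43.65), point("2", -79.39, 43.66)]}
b = {"type": "FeatureCollection",
     "features": [point("2", -79.40, 43.66), point("3", -79.41, 43.67)]}
result = diff_features(a, b)
print(result)
```

Each of the three ID lists could then be written out as its own GeoJSON file for viewing on a map.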
Once you're ready to send the PR, run a diff on the full geocodes.
To update the list of street names, run:
oldtoronto/extract_noun_phrases.py streets 1 > /tmp/streets+examples.txt && \
cut -f2 /tmp/streets+examples.txt | sed 1d | sort > data/streets.txt