
We're working with a largeish dataset. It consists of all the Toronto bike share trips from 2017 and 2018, from here. This repo contains the data dump itself, some scripts that load it into a Postgres database (using Docker Compose), and a set of exercises built around it.

Getting Started


You will need Docker and Docker Compose installed for this to work.

Initializing the Database

The first thing you're going to want to do is initialize the database by running the ./bin/init script.


This takes about 10 minutes to run, so do it first!

Accessing the Database

We can access the database in one of two ways: through pgweb or via a console.

For pgweb try


Note: if that doesn't work, you can just run

docker-compose up -d

and open http://localhost:8081 in your browser!

For console try



  1. Take good notes on the queries you're running and what you're seeing!
  2. Don't forget to add LIMIT 10 to your exploratory queries so you don't overload the database. Take it off when you're ready to go!
  3. If you want to make a backup, try
     CREATE TABLE trips_backup AS TABLE trips;
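
To make tips 2 and 3 concrete, here is a rough sketch of an exploratory session. It only assumes the trips table and the trips_backup copy named above. The restore step is one possible approach, not something the repo's scripts do for you. Keep in mind that CREATE TABLE ... AS copies the rows only, not indexes or constraints, which is why the sketch refills the original table instead of swapping in the backup.

```sql
-- Peek at a handful of rows before writing anything heavier.
SELECT * FROM trips LIMIT 10;

-- Check how many rows you are dealing with.
SELECT count(*) FROM trips;

-- Assumption: trips_backup was created as in tip 3 and still has
-- the same columns as trips. This puts the original rows back.
BEGIN;
TRUNCATE trips;
INSERT INTO trips SELECT * FROM trips_backup;
COMMIT;
```

Because the restore runs inside a transaction, a failure partway through leaves trips as it was.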

Starting Over

If you want to start over at any point, you can re-run the ./bin/init script (remember that it will take a while).

If you really mess up and want to completely get rid of everything and start over, you can run

docker-compose down
docker-compose rm
docker volume rm toronto-bikeshare-data_db-data