transportation-data-deploy
This repository houses a deployment framework for Austin Transportation's ETL scripting tasks. It uses Python, Docker and cron to schedule and monitor each task. The framework's core components are:
Builder (build.sh)
The builder generates a docker run command for each task. It handles the syntax for passing tasks to the launcher and packages each command as a shell script that can be installed as a cron job.
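As a rough illustration of what the builder produces, the sketch below assembles a docker run command for one task and wraps it in a shell script. It is not the code in build.sh: the function names, the image name, the volume paths, the task name, and the way arguments are passed to launch.py are all assumptions made for the example.

```python
import shlex


def build_docker_command(image, script, args=(), volumes=()):
    """Assemble a `docker run` command line for one task (hypothetical layout)."""
    parts = ["docker", "run", "--rm"]
    for host_path, container_path in volumes:
        # -v mounts a host directory into the container
        parts += ["-v", "{}:{}".format(host_path, container_path)]
    # Assumption: the container runs the launcher, which runs the ETL script.
    parts += [image, "python", "launch.py", script] + list(args)
    # Quote every token so the command survives being written to a shell script.
    return " ".join(shlex.quote(p) for p in parts)


def render_shell_script(command):
    """Wrap a single command in a minimal bash script body."""
    return "#!/usr/bin/env bash\n" + command + "\n"


# Hypothetical task definition, for illustration only.
cmd = build_docker_command(
    "example/etl-image",
    "example_task.py",
    args=["--replace"],
    volumes=[("/home/etl/logs", "/app/logs")],
)
print(render_shell_script(cmd))
```

Quoting each token with shlex.quote keeps paths or arguments containing spaces from being split when cron later runs the script.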
Deployer (deploy.sh)
The deployer installs each task as a cron job on a Linux host.
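To make the cron step concrete, here is a minimal sketch of how a crontab line for one task might be rendered. The function name, the schedule, and the script and log paths are hypothetical; the entries that deploy.sh actually installs may look different.

```python
def cron_entry(schedule, script_path, log_path=None):
    """Render one crontab line: five schedule fields followed by the command."""
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron schedule fields, got {}".format(len(fields)))
    line = "{} bash {}".format(" ".join(fields), script_path)
    if log_path:
        # Append stdout and stderr to a log file so cron output is not lost.
        line += " >> {} 2>&1".format(log_path)
    return line


# Hypothetical task that runs every day at 03:00.
entry = cron_entry(
    "0 3 * * *",
    "/home/etl/scripts/example_task.sh",
    "/home/etl/logs/example_task.log",
)
print(entry)
```

A file of such lines can be installed for the current user with crontab followed by the file name, which replaces that user's existing crontab.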
Launcher (launch.py)
The launcher acts as a wrapper for each ETL script and manages logging, email notifications, and job registration for incremental data loading.
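The wrapper pattern described here can be sketched as follows. This is an assumption-laden illustration rather than the contents of launch.py: the JSON registry file, the record fields, and the function names are invented for the example, and email notification is reduced to a log message.

```python
import json
import logging
import os
import subprocess
import sys
import tempfile
import time
from pathlib import Path

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("launcher-sketch")


def last_success(registry_path, task):
    """Return the start time of the latest successful run of a task, or None."""
    path = Path(registry_path)
    if not path.exists():
        return None
    records = json.loads(path.read_text())
    times = [r["start"] for r in records
             if r["task"] == task and r["status"] == "success"]
    return max(times) if times else None


def launch(task, command, registry_path):
    """Run a command, log the outcome, and register the job in a JSON file."""
    start = time.time()
    log.info("starting %s", task)
    result = subprocess.run(command, capture_output=True, text=True)
    status = "success" if result.returncode == 0 else "error"
    if status == "error":
        # A real launcher would send an email notification at this point.
        log.error("%s failed with exit code %s", task, result.returncode)
    path = Path(registry_path)
    records = json.loads(path.read_text()) if path.exists() else []
    records.append({"task": task, "start": start, "status": status})
    path.write_text(json.dumps(records))
    return status


# Demonstration with two throwaway commands and a temporary registry file.
registry = os.path.join(tempfile.mkdtemp(), "jobs.json")
ok = launch("demo_ok", [sys.executable, "-c", "print('loaded')"], registry)
bad = launch("demo_fail", [sys.executable, "-c", "raise SystemExit(1)"], registry)
```

Recording the start time of each successful run lets the next run ask for only the records modified since then, which is the idea behind incremental loading.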
Requirements
- A Linux host with Python v2.7+ installed
Installation
- Clone this repo on a Linux host: git clone https://github.com/cityofaustin/transportation-data-deploy
- Define script and docker run parameters in config/scripts.yml and config/docker.yml, respectively.
- Run bash build.sh to generate shell scripts and cron entries.
- Run bash deploy.sh to install the crontab on the host.
License
As a work of the City of Austin, this project is in the public domain within the United States.
Additionally, we waive copyright and related rights of the work worldwide through the CC0 1.0 Universal public domain dedication.