deanslist

ETL job to bring DeansList data into the data warehouse

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0)

Dependencies:

  • Python
  • Pipenv
  • Docker

Getting Started

Setup Environment

  1. Clone this repo
$ git clone https://github.com/kippnorcal/deanslist.git
  2. Install Pipenv
$ pip install pipenv
$ pipenv install
  3. Install Docker
  4. Create .env file with project secrets
DB_SERVER=
DB=
DB_USER=
DB_PWD=
DB_SCHEMA=

# Mailgun & email notification variables
MG_API_KEY=
MG_API_URL=
MG_DOMAIN=
SENDER_EMAIL=
RECIPIENT_EMAIL=

DOMAIN=deanslist domain
  5. Create DeansList_APIConnection database table. Refer to sql/DeansList_APIConnection.sql.

  6. Build Docker Image

$ docker build -t deanslist .
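
Before running the job, it can help to confirm that the .env file is complete. The following is a minimal sketch, not part of this repo: the helper name `missing_env_vars` is hypothetical, and it assumes every variable listed in the .env template is required, which may not hold for the Mailgun settings.

```python
import os

# Variable names copied from the .env template in the setup steps.
# Assumption: all of them are required for the job to run.
REQUIRED_VARS = [
    "DB_SERVER", "DB", "DB_USER", "DB_PWD", "DB_SCHEMA",
    "MG_API_KEY", "MG_API_URL", "MG_DOMAIN",
    "SENDER_EMAIL", "RECIPIENT_EMAIL", "DOMAIN",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank.

    `env` is any mapping of names to values; it defaults to os.environ.
    (Hypothetical helper, for illustration only.)
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not str(env.get(name, "")).strip()]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; an empty list means every variable in the template has a value.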

Running the Job

$ docker run --rm -it deanslist

Runtime parameters

Run the job for only certain schools (one or many). School names must exactly match the names in the APIKeys table.

$ docker run --rm -it deanslist --schools "KIPP Bayview Academy" "KIPP Bridge Academy (Upper)"
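
One way a `--schools` option like this could be parsed and matched against the configured school names is sketched below. This is an illustration only: the repo's actual argument handling is not shown here, and `parse_args`, `select_schools`, and the exact-match rule are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse a --schools option that accepts one or more quoted names."""
    parser = argparse.ArgumentParser(description="DeansList ETL (illustrative sketch)")
    parser.add_argument(
        "--schools",
        nargs="+",          # one or many school names
        default=None,       # None means "run every school"
        metavar="NAME",
        help="School names to run, exactly as stored in the table",
    )
    return parser.parse_args(argv)


def select_schools(available, requested):
    """Return the schools to run, preserving the order of `available`.

    Raises ValueError when a requested name has no exact match, since a
    misspelled name would otherwise be skipped silently.
    """
    if not requested:
        return list(available)
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError("Unknown school name(s): " + ", ".join(unknown))
    return [name for name in available if name in requested]
```

Because matching is exact in this sketch, a name that differs only in case or punctuation is rejected rather than ignored.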

By default, the job pulls behavior data for the current month. To backfill ONLY behavior data for a specified date range (i.e., no other endpoints), use the following command. Note: for best performance, limit the date range to one month.

$ docker run --rm -it deanslist --behavior-backfill "2019-12-01" "2019-12-31"
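
To backfill a longer period while keeping each run within one month, the range can be split into calendar-month windows and the job run once per window. The sketch below is a hypothetical helper script, not part of this repo; `month_windows` and `backfill_commands` are illustrative names, and it only builds the command strings shown above rather than running them.

```python
import calendar
from datetime import date, timedelta


def month_windows(start, end):
    """Split the inclusive range [start, end] into windows that each
    stay within a single calendar month."""
    if start > end:
        raise ValueError("start must not be after end")
    windows = []
    cursor = start
    while cursor <= end:
        # Last day of the month that `cursor` falls in.
        last_day = calendar.monthrange(cursor.year, cursor.month)[1]
        window_end = min(date(cursor.year, cursor.month, last_day), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


def backfill_commands(start, end):
    """Build one --behavior-backfill command per monthly window."""
    return [
        f'docker run --rm -it deanslist --behavior-backfill "{s.isoformat()}" "{e.isoformat()}"'
        for s, e in month_windows(start, end)
    ]
```

For example, `backfill_commands(date(2019, 11, 15), date(2020, 1, 10))` yields three commands, covering the rest of November, all of December, and the first ten days of January.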

Maintenance

  • If a new school starts using DeansList, then the table custom.DeansList_APIConnection needs to be updated. Set Active=True to pull records for the newly added school.
  • The connector can be turned off when school is out of session for the summer.
  • No other annual maintenance is required.