/airflow

Experimenting with Apache Airflow

Apache Airflow 2.0 Guide

Python Virtual Environment Setup and Initialization

Install python3 on OSX

brew install python3

Create and activate Python virtual environment

python3 -m venv airflowsandbox
source airflowsandbox/bin/activate
# Assumes Python 3.8.5 installed
# Assumes python3-pip 20.2.4
pip install wheel
pip install apache-airflow==2.0.0 --constraint https://gist.githubusercontent.com/cjtravis/8c9c136e3cd20e513c9c253a7275f8fc/raw/5da51f9fe99266562723fdfb3e11d3b6ac727711/constraint.txt

Initialize database

Default metadata database is local SQLite database. Postgres or MySQL can be configured by modifying airflow.cfg file.

export AIRFLOW_HOME=~/git/cjtravis/airflow/airflow
airflow db init

This will populate ~/git/cjtravis/airflow/airflow with airflow configuration files.

Create Airflow Admin User

airflow users create \
    --username admin \
    --firstname CJ \
    --lastname Travis \
    --role Admin \
    --email cj@cjtravis.com \
    --password airflow

Starting Airflow Services

Webserver

airflow webserver 

The webserver should be accessible at http://localhost:8080.

Scheduler

airflow scheduler

DAGs

Create dags directory

mkdir AIRFLOW_HOME/dags