This is the companion code for my article of the same name. It is a simple template for a working ELT project: we take the Chinook dataset and produce some analysis tables from it, along with accompanying data and visual exports.
Since this is a project template, it assumes you'll be copying down this
folder, editing SQL, and running commands via make yourself. This comes with
two prerequisites:
- Python >= 3.8
- make, usually distributed as GNU Make
Linux/macOS usually have both of these readily available. On Windows, you will
need either the Windows Subsystem for Linux (WSL) (recommended), or MinGW with
git and make installed.
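If you want to confirm both prerequisites before going further, a minimal sketch along these lines may help. The function name `check_prerequisites` is made up for illustration and is not part of this project; it only checks that the running interpreter is at least 3.8 and that some `make` executable is on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites


def check_prerequisites():
    """Return a dict reporting whether each prerequisite appears to be met.

    Hypothetical helper for illustration; not shipped with this project.
    """
    return {
        # sys.version_info compares element-wise against a plain tuple
        "python": sys.version_info >= MIN_PYTHON,
        # shutil.which returns None when the executable is not on PATH
        "make": shutil.which("make") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Note that this does not tell you whether the `make` it finds is GNU Make, only that something named `make` is available.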
To get the project locally, simply clone it:
git clone https://github.com/renzmann/chinook-make-dag && cd chinook-make-dag
If you are on Linux, macOS, or WSL, use make install from the top level of this
repository. This will install poetry to ~/.local/bin/poetry, and use it to
install this project along with all its Python dependencies.
make install
If you are not on Linux/macOS/WSL, I still recommend installing poetry yourself and using it to install the project, since it takes care of all the virtual environment overhead for you.
poetry install
If you do not want to use poetry, you can use a recent version of pip to
install in editable mode:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip wheel
pip install -e .
All of the commands below are prefixed with poetry run. If you did not install
via poetry, or have already run the poetry shell command to activate your
poetry environment, leave out the poetry run part:
# Installed via poetry and NOT in poetry shell:
poetry run make tables
# In poetry shell or manual virtual environment installation
make tables
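If you script these commands, the same rule can be sketched in Python. This is only an illustration: `make_command` and `in_virtualenv` are hypothetical names, and the check is a heuristic (poetry being on your PATH does not prove the project was installed with it).

```python
import shutil
import sys


def in_virtualenv():
    """True when the interpreter is running inside a virtual environment."""
    # Inside a venv (including one activated by poetry shell), sys.prefix
    # points at the environment while sys.base_prefix points at the base install.
    return sys.prefix != sys.base_prefix


def make_command(target, poetry_available=None, venv_active=None):
    """Build the argument list for running a make target.

    Hypothetical helper. Both keyword arguments exist only so the decision
    can be overridden; by default they are detected from the environment.
    """
    if poetry_available is None:
        poetry_available = shutil.which("poetry") is not None
    if venv_active is None:
        venv_active = in_virtualenv()
    # Prefix with "poetry run" only when poetry exists and no environment is active.
    if poetry_available and not venv_active:
        return ["poetry", "run", "make", target]
    return ["make", target]
```

The returned list can be passed directly to `subprocess.run`.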
Running make tables will produce new tables in data/analysis.db for each SQL
file in the sql/ directory. A couple of examples are provided for hard
artifacts:
- poetry run make customer_sales.png will produce a bar chart in the target/
  directory with total sales of US customers within each state.
- poetry run make customer_sales.csv produces a CSV with the data this chart
  uses.
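To make the idea of "one table per SQL file, plus an export" concrete, here is a rough sketch of the pattern. It is not the code this project runs. It assumes the database is SQLite, that each file in `sql/` holds a single SELECT statement, that the resulting table is named after the file, and that the source tables are already reachable from the same database. The function names `build_tables` and `export_csv` are invented for the example.

```python
import csv
import sqlite3
from pathlib import Path


def build_tables(sql_dir, db_path):
    """Create one table per .sql file, named after the file's stem (assumed convention)."""
    created = []
    conn = sqlite3.connect(db_path)
    try:
        for sql_file in sorted(Path(sql_dir).glob("*.sql")):
            table = sql_file.stem
            # Assumes the file contains exactly one SELECT statement.
            query = sql_file.read_text().strip().rstrip(";")
            conn.execute(f'DROP TABLE IF EXISTS "{table}"')
            conn.execute(f'CREATE TABLE "{table}" AS {query}')
            created.append(table)
        conn.commit()
    finally:
        conn.close()
    return created


def export_csv(db_path, table, csv_path):
    """Write every row of a table to a CSV file, with a header row."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        header = [column[0] for column in cursor.description]
        with open(csv_path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(cursor)
    finally:
        conn.close()
```

Under those assumptions, `build_tables("sql", "data/analysis.db")` followed by `export_csv("data/analysis.db", "customer_sales", "target/customer_sales.csv")` would mirror the two steps described above.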
If at any point you want to refresh your target/ or data/ directories, you can
use poetry run make clean.
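For a sense of what a clean step amounts to, the sketch below deletes the regular files directly inside the directories you pass in. It is an illustration only: what the project's actual clean target removes may differ, and the `clean` function and its `keep` parameter are hypothetical. Be careful with files you need to preserve, such as source data, and list them in `keep`.

```python
from pathlib import Path


def clean(directories, keep=()):
    """Delete regular files directly inside each directory; return the removed paths.

    Hypothetical helper. Subdirectories and any file whose name is in `keep`
    are left untouched, and directories that do not exist are skipped.
    """
    removed = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        for path in sorted(root.iterdir()):
            if path.is_file() and path.name not in keep:
                path.unlink()
                removed.append(path)
    return removed
```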