/datafold-dbt-ci

Learn how to create simple and advanced CI pipelines with GitHub Actions in a toy dbt project to automate data validation

About this project

This repo demonstrates how to build your first CI pipeline with GitHub Actions for a simple dbt project. By the end of this project, you will have a CI pipeline that runs whenever a new PR is opened in your dbt project and does three things:

  1. Ensure your dbt models compile and build properly
  2. Test your dbt models with the tests you've defined for them
  3. Run a SQL linter against your code changes
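
As a rough illustration of how those three steps could map onto a GitHub Actions workflow, here is a minimal sketch. It is not necessarily the workflow shipped in this repo: the file name, the Python version, the `dbt-postgres` adapter, the location of `profiles.yml`, and the choice of SQLFluff as the linter are all assumptions made for the example. Warehouse credentials would normally be passed in through repository secrets, which are left out here.

```yaml
# .github/workflows/dbt-pr-ci.yml  (hypothetical file name)
name: Our first dbt PR job

# Runs when a PR is opened, reopened, or updated with new commits
on: pull_request

jobs:
  dbt-pr-ci:
    runs-on: ubuntu-latest
    env:
      # Assumption: a profiles.yml lives at the repository root
      DBT_PROFILES_DIR: .
    steps:
      - name: Check out the PR branch
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dbt and the linter
        # dbt-postgres is a placeholder; install the adapter for your warehouse
        run: pip install dbt-core dbt-postgres sqlfluff

      - name: Install dbt packages
        run: dbt deps

      - name: Build and test models
        # dbt build loads seeds, builds models, and runs their tests in DAG order
        run: dbt build

      - name: Lint SQL
        # Assumption: a .sqlfluff file in the repo sets the SQL dialect
        run: sqlfluff lint models
```

A single `dbt build` covers the first two goals because it runs each model's tests right after building it, so a failing test stops downstream models from being built.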

Article and video tutorial

The article goes into more detail on each step here.

I also walk through the same tutorial in this 5-minute Loom:

Building your first CI pipeline for your dbt project - Watch Video

What's in this repo?

This repo uses seeds that include fake raw data from a fictional app, taken from dbt Labs' jaffle shop test project. You can also download the data directly from here.

The best way to learn how to create your first GitHub Actions workflow is to fork this repository and follow our tutorial (link to be added upon publication).

You can see what's in our super simple workflow, called Our first dbt PR job, here.

By the end of the tutorial, you will have run your first workflow!

An example of how CI works

Imagine you only want to analyze orders made in March 2018 (the full dataset, which you can see in the orders.csv file, contains orders from January to April 2018).

Here's what a CI workflow should look like:

  1. Create a new branch to make the change in. In your terminal:
     git checkout -b "update-fct-orders"
  2. Then, update the fct_orders.sql file to add a filter:
     orders as (
         -- Filtering orders to March 2018
         select * from {{ ref('stg_orders') }}
         where order_date >= '2018-03-01' and order_date <= '2018-03-31'
     ),
  3. Commit the change to your repository and open a new PR. Here's the open PR from this repository.

  4. Wait for the GitHub Actions workflow, which was triggered automatically when the PR was opened, to finish running. Success! You can now merge to main with the confidence that the modified dbt model did not break anything and that the code has been linted.
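
For the workflow's test step to say anything useful about the modified model, fct_orders needs tests defined for it. The following is a minimal sketch of what such tests might look like in a dbt properties file. The file path is hypothetical, and the `order_id` column is assumed from the jaffle shop data rather than confirmed by this README; only `order_date` appears in the example above.

```yaml
# models/schema.yml  (hypothetical path)
version: 2

models:
  - name: fct_orders
    columns:
      - name: order_id        # assumed column name
        tests:                # dbt 1.8+ also accepts the key `data_tests`
          - unique
          - not_null
      - name: order_date
        tests:
          - not_null
```

With tests like these in place, a change that introduced duplicate or null values would fail the PR check instead of reaching main.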

Resources

Learn more about: