dbt Business Intelligence

This project shows how to structure dbt to enable analytics in a business intelligence (BI) tool for reporting and dashboarding without the need for defining metadata in the BI tool.

The project has been tested on both dbt Core and dbt Cloud versions 1.0+. The project is also database agnostic and has been tested with the Postgres, BigQuery, Snowflake, and Redshift adapters.

Overview

dbt Concepts

Database agnostic - how to use macros and packages to create models that work across database platforms
Multi-dimensional Modeling - how to create multi-dimensional models using the Kimball methodology
Date Dimension - build a database agnostic date dimension using the dbt-date package.
Metrics - how to define Metrics
Metrics Package - how to use the dbt_metrics package to generate queries.
Meta - using the meta config for downstream BI tools to handle joins, formatting, aliases, model and column order, hide/display columns and models, and more
Tests - how to use singular and generic tests, including relationships tests for referential integrity
Packages - using packages such as dbt-date and dbt-utils
Jinja & Macros - using Jinja & Macros to create re-usable code
Seeds - use Seeds to load source data into the data warehouse
Exposures - document downstream model dependencies, such as reports and dashboards, using Exposures
Custom Schemas - how to use Custom Schemas to organize models (e.g. staging tables)
Documentation - documenting dbt models for optimized downstream discovery in BI tools
Sources - defining Sources and using the source freshness features
Snapshots - use Snapshots to maintain a history of record changes in a table over time
Pre-hook & Post-hook - use post-hooks to run SQL after a model is built. For example, to create indexes, primary/foreign keys, grant permissions.
Best Practices - follow dbt Best Practices

BI Concepts

Metadata-less BI - bypass defining metadata models at the BI semantic layer
Lineage - instantly see upstream model and source dependencies with the ability to view and run upstream SQL
Joins - entity relationships defined by dbt tests relationships or model meta
Formatting - define universal formatting for number, string, date and other data types
Data Freshness - from reports and models, view model refresh date/time and source data freshness
Metrics - use dbt Metrics for complex calculations and aggregation
Natural Language Query (NLQ) - define model meta.synonyms (a.k.a aliases) to be used by natural language query or generic searching
Report dependencies - define model dependencies for reports and dashboards using Exposures
Documentation - create a full data dictionary with source controlled definitions

Getting started

If you're just getting started learning dbt, then you may want to look at Getting Started with dbt Core or dbt Cloud first. The concepts in this tutorial may be a bit more difficult.

Choose a database

BigQuery

Snowflake - TODO issues with server name and location

Redshift

Postgres - TODO show how to set up free hosted Postgres database on Heroku

dbt Client

run the following commands:

First run:

dbt deps
dbt seed

Schedule:

dbt run
dbt source freshness

Docs:

dbt docs generate
dbt docs serve

dbt Cloud

instructions for cloud accounts

ataft/dbt-business-intelligence-fork