/dbt-dw

build dw with dbt

Primary LanguageTSQL

dbt build a datawarehouse using dimensional modeling

This is a dbt repo. In this repo we build a practical datawarehouse using Kimball dimensional model with dbt.

We use a transactional database, SQL Server, AdventureWorks2019 as our source. We extract and load data with an EL tool Airbyte.

Project Setup

You will need the following Tech Stack to following along with this project.

Using dbt we transform this data into dimensions and facts.

Source tables/views used from AdventureWorks datatabase

schemaname tablename type
source address table
source businessentityaddress table
source customer table
source salesorderdetail table
source salesorderheadersalesreason table
source salesorderheader table
source vw_countryregion view
source vw_product view
source vw_productcategory view
source vw_person view
source vw_store view
source vw_salesreason view
source vw_salesterritory view
source vw_productsubcategory view
source vw_stateprovince view
source vw_salesorderheader view

Dimensional Modeling 101

To understand Kimball’s approach to data modeling, we should begin by talking about the star schema. The star schema is a particular way of organizing data for analytical purposes. It consists of two types of tables:

  • A fact table, which acts as the primary table for the schema. A fact table contains the primary measurements, metrics, or ‘facts’ of a business process.

  • Many dimension tables associated with the fact table. Each dimension table contains ‘dimensions’ — that is, descriptive attributes of the fact table.

    image

Advantages of Dimensional Modelling

  • Dimensional data modeling enables users to easily access data through simple queries, reducing the time and effort required to retrieve and analyze data.

  • The simple structure of dimensional data modeling allows for faster query performance, particularly when compared to relational data models.

  • Dimensional data modeling allows for more flexible data analysis, as users can quickly and easily explore relationships between data.

  • Dimensional data modeling can improve data quality by reducing redundancy and inconsistencies in the data.

  • Dimensional data modeling uses simple, intuitive structures that are easy to understand, even for non-technical users. These dimensional tables 'surround' the fact table, which is where the name 'star schema' comes from.

    DBT Completed DAG

    image