dbt Training Project

This repository contains a dbt project that transforms raw source data into clear, well-formatted models for analytics.

Sources:

All source data is loaded into the RAW database.

  • tech_store - An internal company database
  • payment_app - A third-party payment processing application

Target Environments:

All transformed data models are deployed to the ANALYTICS_### database.

  • Development
    • Schema: DBT_JDOE
      • One per developer (first initial, last name)
  • Production
    • Schema: STAGING
      • 1:1 with each source-system table
    • Schema: MARTS
      • Fully transformed and joined models ready for analytics
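
The development schema naming convention above (first initial plus last name, prefixed with DBT_) can be sketched as a small shell helper. This is only an illustration: the function name dev_schema is hypothetical and not part of this repository, and the uppercase DBT_ prefix is inferred from the DBT_JDOE example.

```shell
# Hypothetical helper: build a development schema name from a developer's
# first and last name, following the DBT_<first initial><last name> pattern.
dev_schema() {
    first="$1"
    last="$2"
    # Take only the first character of the first name.
    initial=$(printf '%s' "$first" | cut -c1)
    # Join the parts and uppercase the result.
    printf 'DBT_%s%s\n' "$initial" "$last" | tr '[:lower:]' '[:upper:]'
}

dev_schema John Doe   # prints DBT_JDOE
```

Two developers who share a first initial and last name would map to the same schema under this pattern, so a team may need a tie-breaking rule.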

How to Get Started:

  • Confirm that both Python and Git are installed on your local machine (if not, download and install them)
    • Run python --version or python3 --version
    • Run git --version
    • Set default Git values:
      • Run git config --global user.name "[user-name]"
      • Run git config --global user.email email@domain.com
      • Run git config --global init.defaultBranch main
      • Run git config --global push.default current
      • Run git config --global push.autoSetupRemote true
      • Run git config --global pull.rebase false
  • Download Visual Studio Code and open a new GitHub/ directory in it (create the directory if it does not exist yet)
  • Create a Python virtual environment to isolate project dependencies
    1. Right-click the GitHub/ folder and select "Open in Integrated Terminal"
    2. Run python3 -m venv dbt-env to create the virtual environment
    3. Run source dbt-env/bin/activate to activate and use the virtual environment
  • Install dbt locally (inside virtual environment) using the proper adapter
    • Run pip install dbt-[adapter]
  • Clone this repository within the GitHub/ folder
    • Run git clone https://github.com/[owner]/[repo].git
  • Pull the latest repository changes on the main branch
    • Run cd [repo] to switch into the cloned repository, then run git pull
  • Locate the profiles.yml file on your local machine
    • Local File Path: ~/.dbt/profiles.yml
      • The ~/.dbt folder is hidden by default on Mac/Linux. On a Mac, press CMD + SHIFT + . in Finder to reveal it.
    • Copy the contents of _project_docs/sample-profiles.yml and paste them into profiles.yml
      • Update the values (such as your dataset) to match your own environment
  • Validate successful database connection
    • Run cd dbt to switch into the dbt project directory
    • Run dbt debug to validate dbt can connect
  • Confirm the remote origin
    • Run git remote -v (git clone normally sets origin for you)
    • If origin is missing, run git remote add origin https://github.com/[USERNAME]/[REPO].git
  • Create a new branch
    • Run git branch [branch-name]
  • Check out the branch
    • Run git checkout [branch-name]
  • Download dbt packages
    • Run dbt deps
  • Start developing!
    • IMPORTANT - All changes should follow the team Style Guide
    • You'll need to reactivate your virtual environment each time you open a new terminal by running source dbt-env/bin/activate from the GitHub/ directory
      • Click here to learn more about using virtual environments with dbt, including ways to alias this activate command.
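
The prerequisite check and the virtual environment reactivation described above could be wrapped in a few shell functions, for example in your shell startup file. This is a minimal sketch, not part of this repository: the function names (have, check_prereqs, activate_dbt_env) are hypothetical, and it assumes the GitHub/ directory lives at $HOME/GitHub unless GITHUB_DIR is set.

```shell
# Assumption: GitHub/ lives in your home directory; override GITHUB_DIR if not.
GITHUB_DIR="${GITHUB_DIR:-$HOME/GitHub}"
VENV_DIR="$GITHUB_DIR/dbt-env"

# Return 0 if the named command is on PATH, non-zero otherwise.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report each missing prerequisite; return non-zero if any is missing.
check_prereqs() {
    missing=0
    have git || { echo "git not found" >&2; missing=1; }
    have python3 || have python || { echo "python not found" >&2; missing=1; }
    return "$missing"
}

# Activate the dbt-env virtual environment created during setup.
activate_dbt_env() {
    if [ -f "$VENV_DIR/bin/activate" ]; then
        # Source the activate script so it changes the current shell.
        . "$VENV_DIR/bin/activate"
    else
        echo "No virtual environment found at $VENV_DIR" >&2
        return 1
    fi
}
```

With these defined, running check_prereqs followed by activate_dbt_env from any directory would replace the manual source dbt-env/bin/activate step.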

Contributors:

  • John Doe (Developer)
  • Jane Doe (Developer)

Resources:

  • Learn more about dbt in the docs
  • Check out Discourse for commonly asked questions and answers
  • Join the chat on Slack for live discussions and support
  • Check out the blog for the latest news on dbt's development and best practices