rittmananalytics/ra_data_warehouse

Use dbt sources for staging area

olivierdupuis opened this issue · 1 comment

The framework currently has macros that make direct calls to source tables. This setup doesn't take advantage of dbt sources, which make it easier to define where the source tables live, allow for source freshness tests, and create dedicated objects for the sources in the documentation.

Proposed changes

To dbt_project.yml

  • Adding a stitch_source variable
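
As a rough sketch of what this could look like, assuming stitch_source holds the name of the dbt source that the staging models pass to the macro (the value `mailchimp` is only an illustration taken from the sources.yml example in this issue, not necessarily what the project would use):

```yaml
# dbt_project.yml (fragment)
vars:
  # Assumed meaning: the dbt source name handed to source() by the macro.
  # 'mailchimp' is an illustrative value, not confirmed by the project.
  stitch_source: mailchimp
```

The staging models can then read the value with var('stitch_source').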

Create a sources.yml file within the schema directory of each source

version: 2

sources:
  - name: mailchimp
    database: stitch_database
    schema: mailchimp
    tables:
      - name: list_members
      - name: reports_email_activity
      - name: lists
      - name: campaigns
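
To illustrate the freshness tests mentioned above, here is a hedged sketch of the same source with a freshness check added. It assumes the Stitch-loaded tables carry the `_sdc_batched_at` metadata column, and the thresholds are placeholders rather than recommended values:

```yaml
version: 2

sources:
  - name: mailchimp
    database: stitch_database
    schema: mailchimp
    # Assumption: Stitch adds _sdc_batched_at to every replicated table.
    loaded_at_field: _sdc_batched_at
    freshness:
      # Illustrative thresholds only; tune per source.
      warn_after: {count: 24, period: hour}
      error_after: {count: 48, period: hour}
    tables:
      - name: list_members
      - name: reports_email_activity
      - name: lists
      - name: campaigns
```

With a block like this in place, freshness is checked with `dbt source freshness` (`dbt source snapshot-freshness` on older dbt versions).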

Changes to filter_etl_table macros

  • Pass only three parameters: filter_stitch_table(source_name, source_table, unique_column)
  • Use dbt's source macro in the from statement: {{ source(source_name, source_table) }}

Changes to staging files

  • Call the filter_etl_table macros with the correct parameters: {{ filter_stitch_table(var('stitch_source'),var('stitch_clients_table'),'id') }}

Hi Olivier, this has now been done and is present in the most recent version of the repo.