data-diff is an open source package that you can use to see the impact of your dbt code changes on your dbt models as you code.
👀 Watch 4-min demo video here
Install data-diff
Install data-diff with the command that is specific to the database you use with dbt.
Snowflake
pip install data-diff 'data-diff[snowflake,dbt]' -U
BigQuery
pip install data-diff 'data-diff[dbt]' google-cloud-bigquery -U
Redshift
pip install data-diff 'data-diff[redshift,dbt]' -U
Postgres
pip install data-diff 'data-diff[postgres,dbt]' -U
Databricks
pip install data-diff 'data-diff[databricks,dbt]' -U
DuckDB
pip install data-diff 'data-diff[duckdb,dbt]' -U
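If you script your environment setup, the choice among the commands above can be sketched as a small shell function. This is only an illustration: the function name data_diff_install_cmd and the adapter keywords it accepts are assumptions made for this example, not part of data-diff. It prints the matching command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of data-diff): print the install command
# for a given dbt adapter. The commands are copied from the list above;
# the function name and adapter keywords are illustrative assumptions.
data_diff_install_cmd() {
  case "$1" in
    snowflake|redshift|postgres|databricks|duckdb)
      # These adapters share one pattern: data-diff[<adapter>,dbt]
      printf "pip install data-diff 'data-diff[%s,dbt]' -U\n" "$1"
      ;;
    bigquery)
      # BigQuery installs the dbt extra plus google-cloud-bigquery
      printf "pip install data-diff 'data-diff[dbt]' google-cloud-bigquery -U\n"
      ;;
    *)
      printf 'unknown adapter: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

# Example: print (do not run) the command for Postgres.
data_diff_install_cmd postgres
```

Printing the command instead of executing it keeps the sketch side-effect free, so you can review the line before pasting it into a terminal.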
Update a few lines in your dbt_project.yml.
# dbt_project.yml
vars:
  data_diff:
    prod_database: my_database
    prod_schema: my_default_schema
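Before running a diff, it can help to confirm that the data_diff block above actually made it into your project file. The following is a minimal sketch, assuming the file is named dbt_project.yml and sits in the current directory unless you pass another path. The function name check_data_diff_vars is hypothetical, and the check is a plain text search rather than a YAML parse, so it cannot tell you whether the keys are nested correctly under vars.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of data-diff or dbt): confirm
# that a dbt project file mentions the keys from the snippet above.
# This is a text search only; it does not validate YAML nesting.
check_data_diff_vars() {
  project_file="${1:-dbt_project.yml}"
  for key in data_diff prod_database prod_schema; do
    # Match the key at the start of a line, allowing leading indentation.
    if ! grep -Eq "^[[:space:]]*${key}:" "$project_file"; then
      printf 'missing key in %s: %s\n' "$project_file" "$key" >&2
      return 1
    fi
  done
}
```

With the function defined, check_data_diff_vars returns a non-zero status as soon as one key is missing, so it can be chained in front of other commands with &&.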
Run your first data diff!
dbt run && data-diff --dbt
To get started, we recommend walking through our setup instructions, which contain examples and details.
Please reach out on the dbt Slack in #tools-datafold if you have any trouble whatsoever getting started!
Check out our documentation if you're looking to compare data across databases (for example, between Postgres and Snowflake).
We thank everyone who contributed so far!
This project is licensed under the terms of the MIT License.