/dbt-duckdb-tutorial

This is a simple analytics project using DuckDB & dbt with air quality data.

Demo DuckDB & dbt project

This is a simple project using DuckDB & dbt. The repo contains two models based on the WHO air quality dataset, which is hosted in a public S3 bucket as a Parquet file. The dbt pipeline outputs two CSVs in the output/ folder. Although the bucket is public, you need to set the S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY environment variables (dummy values are fine) to run the pipeline.
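Since placeholder values are accepted, one way to set the two variables in a shell session is sketched below. The value `dummy` is only an illustration, not something the project prescribes, and the `${VAR:-default}` form keeps any value you have already exported.

```shell
# Keep values that are already set; otherwise fall back to a placeholder.
# "dummy" is an illustrative value chosen for this sketch.
export S3_ACCESS_KEY_ID="${S3_ACCESS_KEY_ID:-dummy}"
export S3_SECRET_ACCESS_KEY="${S3_SECRET_ACCESS_KEY:-dummy}"
```

These exports only last for the current shell session, so they need to be repeated (or added to your shell profile) in each new terminal.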

Development

Install dependencies

This project uses the dbt-duckdb adapter for DuckDB. You can install it with pip install dbt-duckdb, which installs dbt, the dbt-duckdb adapter, and duckdb.

Using devcontainer

There's a devcontainer you can use for either local development or GitHub Codespaces.

Running the pipeline

Inside the dbt project directory /dbt_demo, run dbt run.
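As a rough sketch, the step above can be wrapped in a small shell function. The name `run_pipeline` is hypothetical, and the sketch assumes you call it from the repository root, that dbt_demo/ sits directly under it, and that the S3 environment variables are already set.

```shell
# Hypothetical helper: run the dbt pipeline from the repository root.
# Assumes dbt-duckdb is installed and dbt_demo/ is the dbt project directory.
run_pipeline() {
  if ! command -v dbt >/dev/null 2>&1; then
    echo "dbt not found; install it with: pip install dbt-duckdb" >&2
    return 1
  fi
  # Use a subshell so the caller's working directory is left unchanged.
  (cd dbt_demo && dbt run)
}
```

Calling `run_pipeline` then behaves like changing into the project directory and running dbt run by hand, with an early error message if dbt is not on your PATH.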