duckdb/dbt-duckdb

Question - dbt, DuckDB and Iceberg

dbettin opened this issue · 1 comment

Sorry if this is the wrong forum to receive some guidance. Please advise if this should be directed somewhere else.

I would like to use dbt-duckdb to manage the different layers (think medallion architecture) of a data lakehouse built on Iceberg tables. Essentially, we would like to run the transformations between the layers of the lakehouse with dbt-duckdb.

Is this possible? Is the plugin for querying only, or can it also handle transformations and create new Iceberg tables?

Thanks for any insight!

jwills commented

Hey @dbettin, no problem at all. A couple of things here: dbt-duckdb relies on pyiceberg for interacting with Iceberg tables, and my latest understanding is that pyiceberg does not yet support writing to existing Iceberg tables: https://py.iceberg.apache.org/feature-support/

...but support for that is in the works: apache/iceberg-python#23

...and as soon as pyiceberg can support writes, we will get to work over here at supporting them as well.

Separately, the DuckDB project itself has an Iceberg extension in the works: https://duckdb.org/docs/extensions/iceberg.html which currently supports reads. I'm sure that write support is in the works as well, but I don't know what the plan is there.
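For anyone who wants to try the read path of the DuckDB Iceberg extension mentioned above, here is a minimal sketch in Python. It assumes the `duckdb` Python package is installed, that the extension can be downloaded at runtime, and that an Iceberg table exists at the path you pass in. The helper names (`iceberg_scan_query`, `read_iceberg_sample`) and the example path are hypothetical and not part of dbt-duckdb or DuckDB; only `INSTALL iceberg`, `LOAD iceberg`, and the `iceberg_scan` table function come from the extension itself.

```python
def iceberg_scan_query(table_path: str, limit: int = 10) -> str:
    """Build a read-only SELECT over an Iceberg table via iceberg_scan.

    Hypothetical helper: it only assembles the SQL text.
    """
    # Escape single quotes so the path is a valid SQL string literal.
    escaped = table_path.replace("'", "''")
    return f"SELECT * FROM iceberg_scan('{escaped}') LIMIT {int(limit)}"


def read_iceberg_sample(table_path: str, limit: int = 10):
    """Run the query above against an in-memory DuckDB connection.

    Assumes the duckdb package is installed and the iceberg extension
    can be fetched; this performs reads only, no writes.
    """
    import duckdb  # third-party; imported lazily so the builder has no dependencies

    con = duckdb.connect()
    con.execute("INSTALL iceberg")  # downloads the extension if not already present
    con.execute("LOAD iceberg")
    return con.execute(iceberg_scan_query(table_path, limit)).fetchall()
```

Calling `read_iceberg_sample("path/to/your_iceberg_table")` (a placeholder path) should return the first few rows as Python tuples, provided the table metadata is readable from that location. This only covers reading; it does not create or modify Iceberg tables.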