duckdb/duckdb_iceberg

Iceberg REST Catalog Support

randypitcherii opened this issue · 10 comments

Hey, team!

Very excited about the duckdb v0.9 support for iceberg!

I currently use a rest catalog for my iceberg tables and was hoping to be able to wire up duckdb to that rather than point it to the actual underlying data/metadata files.

If this is available, I'd love to use it -- otherwise, I'd be happy to jump in and start coding if this feature is new.

Thanks!

Hi @randypitcherii

Thanks for your interest! The iceberg extension is currently at quite an early stage. The REST catalog is not yet supported, so we are definitely interested in your help there! Feel free to reach out to me through the DuckDB Discord for a chat!

Ok, no worries.

I'm thinking I'll chat with the REST catalog through Python, then get the details to my 🦆 db programmatically.

I'll see you on the discord!!! Thanks!
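The workaround described above (ask the REST catalog where the table lives from Python, then hand that location to DuckDB) could look roughly like the sketch below. It assumes the catalog follows the Iceberg REST spec's load-table endpoint with no catalog prefix, and that the installed iceberg extension accepts a direct metadata file path in `iceberg_scan`. The helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def load_table_url(catalog_uri: str, namespace: str, table: str) -> str:
    """Build the 'load table' URL (assumes the catalog uses no prefix)."""
    # The REST spec joins multi-level namespace parts with the 0x1F unit separator.
    ns = urllib.parse.quote(namespace.replace(".", "\x1f"), safe="")
    tbl = urllib.parse.quote(table, safe="")
    return f"{catalog_uri.rstrip('/')}/v1/namespaces/{ns}/tables/{tbl}"


def fetch_load_table(catalog_uri: str, namespace: str, table: str,
                     token: Optional[str] = None) -> dict:
    """GET the load-table response from the catalog (performs a network call)."""
    request = urllib.request.Request(
        load_table_url(catalog_uri, namespace, table),
        headers={"Accept": "application/json"},
    )
    if token:
        # Bearer auth is an assumption; use whatever your catalog requires.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def metadata_location(load_table_response: dict) -> str:
    """Pull the current metadata file location out of the response."""
    return load_table_response["metadata-location"]


def iceberg_scan_sql(metadata_path: str) -> str:
    """Build a DuckDB query that scans the table via its metadata file."""
    escaped = metadata_path.replace("'", "''")  # escape single quotes for SQL
    return f"SELECT * FROM iceberg_scan('{escaped}')"
```

With the `duckdb` Python package installed and the iceberg extension loaded (plus httpfs and credentials for object storage), the resulting string could be passed to `duckdb.sql(...)`. DuckDB still reads the data files itself; the catalog is only consulted for the metadata location.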

Good morning @samansmink , is there any plan to support iceberg catalogs in general (not only REST) in the near future?

Thanks in advance.

Hey @thinkORo! I would love to, but I'm a bit low on time currently. In general I would say we would like to support the most used catalogs at some point, but I cannot give any timeline here at the moment. If you are interested in contributing, I'm happy to help out though.

Hi @samansmink ,

Unfortunately, I'm only really good at Data Management and Data Analytics. And Python. So I can offer only very limited help in contributing to DuckDB.

But: If I can do something to raise the priority of this, or support you elsewhere to give you more time for such a (really important, at least for me) implementation, I am happy to do so.

I have a framework in place for this if #51 gets merged, see the notes about the REST/Nessie catalog.

It should just be a few more lines of work to perform the HTTP request.

Up! Any updates on this?

Not yet -- been working on other things but will return to this soon.

Please post an update once it is implemented. Really excited to see DuckDB support the REST catalog in Iceberg.

While the combination of DuckDB <> PyArrow <> PyIceberg support covers this use-case, the extension is much more efficient than loading the data into PyTable. I would love to see the support for Iceberg catalogs.
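For reference, the DuckDB <> PyArrow <> PyIceberg route mentioned here might look like the sketch below. It assumes the `pyiceberg` and `duckdb` packages are installed and that the catalog is a REST catalog reachable at the given URI. The function names and the `iceberg_tbl` view name are made up for illustration.

```python
from typing import Optional


def rest_catalog_properties(uri: str, token: Optional[str] = None) -> dict:
    """Build the properties PyIceberg needs to talk to a REST catalog."""
    properties = {"type": "rest", "uri": uri}
    if token is not None:
        properties["token"] = token
    return properties


def query_via_pyiceberg(uri: str, table_identifier: str, sql: str,
                        token: Optional[str] = None) -> list:
    """Load an Iceberg table through PyIceberg and query it with DuckDB."""
    # Imported lazily so the helper above works without these packages.
    import duckdb
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog("rest", **rest_catalog_properties(uri, token))
    # to_arrow() materializes the whole scan in memory as a PyArrow table.
    arrow_table = catalog.load_table(table_identifier).scan().to_arrow()

    connection = duckdb.connect()
    # Expose the Arrow table to SQL under a hypothetical view name.
    connection.register("iceberg_tbl", arrow_table)
    return connection.execute(sql).fetchall()
```

A call such as `query_via_pyiceberg("http://localhost:8181", "analytics.events", "SELECT count(*) FROM iceberg_tbl")` would then run the query over the in-memory table. Because the scan is loaded into Arrow first, it does not get the pushdown a native catalog integration in the extension could offer, which is the efficiency gap described above.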