delta-io/connectors

[FEATURE REQUEST] Basic data (i.e. parquet file) writing support

scottsand-db opened this issue · 2 comments

Currently, Delta Standalone supports only metadata writing (i.e. commits to the _delta_log), metadata reading, and very basic data reading (i.e. of Parquet files).
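For readers new to the log format, here is a minimal sketch of what a "commit to the _delta_log" amounts to on disk, using only the Java standard library. It does not use the Delta Standalone API. The class and method names (`DeltaLogSketch`, `commitFileName`, `addAction`, `writeCommit`) and the sample file name are hypothetical, chosen for illustration. It assumes the Delta protocol's layout: one newline-delimited JSON file per table version, named with the version zero-padded to 20 digits. A real version-0 commit also needs `protocol` and `metaData` actions, which this sketch leaves out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Locale;

// Illustrative only: hypothetical helper names, not part of Delta Standalone.
final class DeltaLogSketch {

    // Commit files are named by table version, zero-padded to 20 digits.
    static String commitFileName(long version) {
        return String.format(Locale.ROOT, "%020d.json", version);
    }

    // Builds one "add" action line registering an already-written data file.
    // Assumption: relativePath contains no characters that need JSON escaping.
    static String addAction(String relativePath, long sizeBytes, long modificationTimeMs) {
        return String.format(Locale.ROOT,
            "{\"add\":{\"path\":\"%s\",\"partitionValues\":{},\"size\":%d,"
                + "\"modificationTime\":%d,\"dataChange\":true}}",
            relativePath, sizeBytes, modificationTimeMs);
    }

    // Writes the actions as one commit file under <tableRoot>/_delta_log.
    // CREATE_NEW makes the write fail if that version already exists, which
    // approximates the "one writer wins per version" rule on a local file system.
    static Path writeCommit(Path tableRoot, long version, String... actions) throws IOException {
        Path logDir = Files.createDirectories(tableRoot.resolve("_delta_log"));
        Path commit = logDir.resolve(commitFileName(version));
        byte[] body = (String.join("\n", actions) + "\n").getBytes(StandardCharsets.UTF_8);
        return Files.write(commit, body, StandardOpenOption.CREATE_NEW);
    }

    public static void main(String[] args) throws IOException {
        Path tableRoot = Files.createTempDirectory("delta-sketch");
        // Hypothetical data file; a real writer would produce it before committing.
        String add = addAction("part-00000.parquet", 1024L, System.currentTimeMillis());
        System.out.println("Wrote " + writeCommit(tableRoot, 0L, add));
    }
}
```

The data writing requested here is the step this sketch skips: producing the Parquet file itself, so that its path, size, and modification time can be recorded in an `add` action.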

Interest in, and requests for, data writing support in Delta Standalone have been growing.

Let's use this issue as a place where users can give more details on the use cases, APIs, and interest in this feature.

Here's a rudimentary prototype I'm working on to deserialize Kafka to Delta. The current iteration includes a barebones Java Parquet writer that also commits to Delta: https://github.com/mdrakiburrahman/kafka-delta-ingest-adls/blob/main/src/main/java/com/microsoft/kdi/KDI.java

I think that, as it stands, this should cover a Hello World-type scenario for Delta Standalone newcomers. The repo above is already set up as a VS Code Dev Container, so anyone can reproduce it.