Integration with Arrow Database Connectivity (ADBC)?
josevalim opened this issue · 2 comments
There is a recently announced effort to provide Arrow bindings for databases: https://voltrondata.com/resources/arrow-database-connectivity-apache-arrow-for-every-database-user
I think this can be useful in two ways:
- Someone can run a query and get the results back already in Arrow format.
- We could convert Explorer queries into SQL queries for select databases, providing a unified flow. This is a path we started exploring with Explorer SQL, but ADBC may be a more efficient way of going about it.
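To make the second point a bit more concrete, here is a rough sketch of what "convert queries into SQL" could look like. It is written in Python with only the standard library so it can be run anywhere; `LazyQuery`, its methods, and the `cities` table are all hypothetical names made up for illustration, not Explorer or ADBC APIs. SQLite stands in for whatever database a real driver would talk to.

```python
import sqlite3
from dataclasses import dataclass

# Operators the sketch is willing to emit; anything else is rejected.
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}


def quote_ident(name):
    """Quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


@dataclass(frozen=True)
class LazyQuery:
    """Hypothetical lazy query: records operations, compiles to SQL on demand."""

    table: str
    columns: tuple = ()  # empty tuple means "all columns"
    filters: tuple = ()  # (column, operator, value) triples

    def select(self, *columns):
        return LazyQuery(self.table, tuple(columns), self.filters)

    def filter(self, column, op, value):
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        return LazyQuery(
            self.table, self.columns, self.filters + ((column, op, value),)
        )

    def to_sql(self):
        """Return (sql, params); values are passed as bound parameters."""
        cols = ", ".join(quote_ident(c) for c in self.columns) or "*"
        sql = f"SELECT {cols} FROM {quote_ident(self.table)}"
        if self.filters:
            clauses = [f"{quote_ident(c)} {op} ?" for c, op, _ in self.filters]
            sql += " WHERE " + " AND ".join(clauses)
        params = [value for _, _, value in self.filters]
        return sql, params


def run_demo():
    """Compile a query and run it against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cities (name TEXT, population INTEGER)")
    conn.executemany(
        "INSERT INTO cities VALUES (?, ?)",
        [("Krakow", 800_000), ("Warsaw", 1_860_000), ("Tokyo", 13_960_000)],
    )
    query = LazyQuery("cities").filter("population", ">", 1_000_000).select("name")
    sql, params = query.to_sql()
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return sql, rows
```

In this sketch the dataframe-style calls only build up a description of the query, and nothing touches the database until `to_sql()` is compiled and executed. With ADBC, the execution step would presumably hand back Arrow data instead of Python tuples, which is where the first point comes in.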
In Explorer, we don't have to support both ways, but the first one can be useful and necessary to build the second.
Depending on how the project goes, we can get connectivity to DuckDB, Google BigQuery, and so on for free. We wouldn't get S3/GCS support but this may be provided directly by Arrow.
I think it holds a lot of promise.
Would Apache Arrow Flight be a more stable integration to get started with?
https://arrow.apache.org/blog/2019/10/13/introducing-arrow-flight/
There are Rust crates for arrow-flight and arrow-flight-sql-client.
My understanding is that Arrow Flight may be used internally by ADBC, so for us it would mostly be an implementation detail.