delta-io/connectors

Proposal: Unify the Delta Connectors and Delta-Spark codebases

tdas opened this issue · 4 comments

tdas commented

In Nov 2019, the Delta Connectors repo started as an incubator project to foster the independent development of the non-Spark connectors, allowing the project to proceed at a different cadence than the Delta/Spark connector in the original Delta repo (www.github.com/delta-io/delta). Almost 4 years later, we have multiple popular and stable connectors:

  • Standalone - Enjoys ~40K downloads/month, and has enabled building many connectors (the following ones in this repo, and external ones like PrestoDB, Pulsar, Apache Beam).
  • Flink - Processing 100s of TB/day at a premier food delivery company
  • Hive2 and Hive3
  • PowerBI

While we have enjoyed the advantage of being able to move at a faster pace than Delta-on-Spark to reach this scale, as maintainers we are currently facing a few major downsides.

  • Complex dependencies of modules across repos - Delta Standalone project in connectors repo depends on the Delta Storage project (Maven artifact delta-storage) in the original Delta repo (delta-io/delta). Furthermore, newer initiatives like Delta Kernel are located in that original repo (since it hosts the protocol), and will create further cross-repo dependencies.

  • Maintenance overhead - Each repo has its own release cycle, thus requiring two sets of releases every quarter. For project maintainers, this doubles the overhead and slows releases down.

To improve this situation in the future, I propose we merge the connectors code into the original Delta repo. We will unify the release cycles of Delta/Spark and Delta Connectors.

I think this is a great idea and, given the maturity of the connectors, it will simplify maintenance.

tdas commented

This is cross-listed in the delta repo as well, delta-io/delta#1824

tdas commented

All the code in this repo has been merged with full history into the original delta repo, see delta-io/delta#1824. Closing this issue. We will mark this repo as deprecated and follow up on all the open PRs and issues appropriately.

Steps to migrate an existing PR from delta-io/connectors to delta-io/delta:

  • Create a patch file with the diff of the PR by appending .patch after the PR link.
    • For example: https://github.com/delta-io/connectors/pull/559.patch
    • Save the patch content locally in a file.
  • Update the file paths in the locally saved file by prepending "connectors/" to each path.
  • Apply the patch file in your local repository of delta-io/delta.
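
The path-rewriting step above can be scripted rather than done by hand. The following is only a minimal sketch, not an official migration tool: the function name `prefix_patch_paths` and the sample file path are hypothetical, and it assumes a git-style patch whose paths contain no spaces. It rewrites only header lines (`diff --git`, `--- a/`, `+++ b/`, `rename from`, `rename to`); a removed content line that happens to begin with `-- a/` would be rewritten by mistake, so review the output before applying it.

```python
import re

# Directory under which the old connectors code lives in delta-io/delta,
# as stated in the migration steps.
PREFIX = "connectors/"

# Matches the "diff --git a/<old> b/<new>" header (assumes no spaces in paths).
DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)(\s*)$")


def prefix_patch_paths(patch_text: str, prefix: str = PREFIX) -> str:
    """Return patch_text with `prefix` prepended to the paths in header lines."""
    out = []
    for line in patch_text.splitlines(keepends=True):
        m = DIFF_HEADER.match(line)
        if m:
            line = f"diff --git a/{prefix}{m.group(1)} b/{prefix}{m.group(2)}{m.group(3)}"
        else:
            # "--- /dev/null" and "+++ /dev/null" are left untouched on purpose.
            for marker in ("--- a/", "+++ b/", "rename from ", "rename to "):
                if line.startswith(marker):
                    line = marker + prefix + line[len(marker):]
                    break
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical sample patch; a real one comes from the PR's .patch link.
    sample = (
        "diff --git a/flink/src/Foo.java b/flink/src/Foo.java\n"
        "--- a/flink/src/Foo.java\n"
        "+++ b/flink/src/Foo.java\n"
        "@@ -1 +1 @@\n"
        "-old\n"
        "+new\n"
    )
    print(prefix_patch_paths(sample), end="")
```

The rewritten text can then be saved to a file and applied in a local clone of delta-io/delta, for example with `git apply`. As an alternative worth trying, `git apply` also accepts a `--directory=<root>` option that prepends a root to all filenames, which may make the manual rewrite unnecessary.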