airbytehq/PyAirbyte

Feature Request: Snowflake Snowpark support via registering PyAirbyte on the Snowpark Anaconda Channel

aaronsteers opened this issue · 17 comments

What

We'd like Snowflake users to be able to run PyAirbyte in Snowpark environments.

Why

This would enable Snowflake users to run Airbyte directly in their Snowflake environment, and to orchestrate their Airbyte Cloud jobs directly in Snowflake.

How

The first step towards this is to register PyAirbyte in Snowflake's Anaconda channel and/or repo.

Caveats / Gotchas

As a next step, we'll also need to consider how to deliver the connectors themselves. This is tricky because Snowpark (today) doesn't support creating new virtual environments or installing bespoke connectors via pip.

Workarounds:

  1. When PyAirbyte supports natively running low-code connectors from their manifest files, this will open up a large number of REST API connectors that would be available without additional installation steps. (Work started here: #175)
  2. We could also add our library of connectors to the Snowflake Anaconda channel. After this ticket's work is complete, we'd have repeatable learnings from publishing PyAirbyte itself, and could then apply those learnings to publishing the connectors.
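
To make the caveat above concrete, here is a minimal sketch of a probe that checks whether the current Python runtime is able to create a virtual environment at all. It uses only the standard library; the function name `can_create_venv` and its `create` parameter are hypothetical helpers for illustration and are not part of PyAirbyte or Snowpark.

```python
import tempfile
import venv
from pathlib import Path
from typing import Callable


def can_create_venv(create: Callable[..., None] = venv.create) -> bool:
    """Return True if this runtime can create a virtual environment.

    Hypothetical helper: `create` is injectable so the failure path
    can be exercised without a restricted runtime.
    """
    try:
        with tempfile.TemporaryDirectory() as tmp:
            # with_pip=False skips bootstrapping pip, which keeps the probe fast.
            create(Path(tmp) / "probe-env", with_pip=False)
        return True
    except Exception:
        # Any failure (permissions, missing modules, read-only filesystem)
        # is treated as "virtual environments are not available here".
        return False


if __name__ == "__main__":
    print("venv creation supported:", can_create_venv())
```

If a probe like this returns `False` in a given environment, connectors would presumably need to arrive pre-installed (workaround 2) or run without installation (workaround 1).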

Hi @aaronsteers. I'd like to attempt this task.

It is yours, @itsxdamdam!

Hi, @itsxdamdam! Excited to see you picked this up! Do you need any assistance? Let us know if you run into any issues!

Yes, I will definitely do that.

Hi @aaronsteers @marcosmarxm, I am very new to Snowflake, so this has been a learning curve for me. I'd appreciate it if you could recommend more resources to help complete this feature.

Furthermore, I'll post more updates as I progress.

@itsxdamdam - We have limited bandwidth to assist, unfortunately. I have pinged internally to see if we have a guide to onboarding to Snowflake/Snowpark.

In lieu of more specific instructions for onboarding to Snowpark, we would accept this task as "completed" if you can alternatively help us add PyAirbyte to Conda Forge, a popular open Anaconda channel where many popular AI and ML packages are hosted. Presumably, it would take minimal additional effort to make PyAirbyte available on Snowpark once we have published it to Conda Forge.

These resources might be helpful:

(Conda is not my personal expertise, since I've primarily worked with Pip/Poetry/PyPI in past projects. Hence the need for this new exploration and these learnings. 😄 )

Thanks, @aaronsteers, I will take this into consideration.

Hi @aaronsteers, thanks for your recommendation. I made a PR to Conda Forge and am currently working through the checks. However, I ran into some missing-dependency errors, which I am working through now.

You can find the PR here
conda-forge/staged-recipes#26787

Hi @aaronsteers @marcosmarxm, I have to create a package for airbyte-protocol on conda-forge before PyAirbyte can be merged successfully. Unfortunately, I have hit an issue with airbyte-protocol related to the version of python-dotenv. I'd appreciate it if we could look into this.

Hi @aaronsteers,
Please take a look at this error from a file in https://github.com/airbytehq/airbyte-protocol that is preventing the merge to conda-forge.

You can find details here
conda-forge/staged-recipes#26787 (comment)

I have also submitted an issue on airbyte-protocol to this effect.

airbytehq/airbyte-protocol#88

Thanks for all the info, @itsxdamdam. We're going to take a look during the day.

Hi @marcosmarxm @aaronsteers, I haven't received an update on the corrections required to complete the merge to conda-forge. I believe I have done all that is required for the addition to conda-forge. I'm waiting for the required corrections to be made upstream, i.e. in airbyte-protocol.

Kindly read through the comments of the PR made to conda-forge for all details.

@itsxdamdam - Agreed. You have done all that's expected. I haven't been able to read all the notes, but you can count this as complete if there are sufficient steps documented so that we can move forward from here. Thanks very much for continuing with this one and for working through the various challenges.

We will review internally after the event is wrapped up and decide on next steps based on the level of effort needed to get this published and to maintain it afterwards. Your contribution has been very much appreciated.

Thanks, @aaronsteers. I'm happy to take on a new task if one is available.

Also, can I fill out the form for the hackathon submission?