e2fyi/kubeflow-aws

1.5.0 support and collaboration on ArgoCD-based AWS rollout

karlschriek opened this issue · 1 comments

Firstly, thanks a lot for this community effort! We are currently running several KF 1.2 distributions on AWS where we integrate as tightly as possible with AWS managed services, and using your solution for S3/RDS in Pipelines is an important part of that. We offload on-cluster services to AWS as far as possible. This means in particular:

  • Using Cognito for user pool management
  • Using RDS for all metadata storage (pipelines, metadb, cachedb, katibdb) instead of an on-cluster MySQL db
  • Using S3 for all pipeline artifact storage instead of using Minio for on-cluster storage
  • Using AWS Secrets Manager in combination with https://github.com/external-secrets/kubernetes-external-secrets to manage secrets (for example RDS database credentials)
  • Using IRSA to manage granular pod-level access to AWS resources

In addition, we use ArgoCD as our GitOps operator and try to avoid any middleware such as kfctl / Kubeflow Operator when rolling out. We have started a new community effort that aims to do that for Kubeflow 1.3 here: https://github.com/argoflow/argoflow-aws (currently still under construction). We plan to integrate the solution that you have developed here as well, and would also welcome any direct contributions from you!

Most important for us right now, though, is that KF 1.3 uses Pipelines 1.5.0. Your latest version is for Pipelines 1.4.1. Are you aware of any changes that are necessary to upgrade to 1.5.0, and do you intend to upgrade soon?

@karlschriek
This sounds exciting! RDS support was what I always wanted to do but never found time for.

Would be very happy to collab and contribute!

Haven't been following the recent updates to kfp, so not sure what the major changes are. But let me spend some time looking at the recent commits.