kubeflow-github-action

Repository for making a GitHub Action for deploying to Kubeflow.

Primary language: Python

This action submits Kubeflow Pipelines to a Kubeflow cluster running on Google Cloud Platform.

The purpose of this action is to automate deployments of Kubeflow Pipelines on Google Cloud Platform (GCP). The action collects the pipeline from a Python file and compiles it before uploading it to Kubeflow. It only works with Kubeflow deployments on GCP that use IAP (Identity-Aware Proxy).
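As a rough illustration of the "collect the pipeline from a Python file" step, the sketch below loads a function by file path and name using only the standard library. It is not the action's actual implementation: the helper name `load_pipeline_function` and the module label `user_pipeline` are made up for this example, and the compile and upload steps (which would go through the Kubeflow Pipelines SDK) are left out.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_pipeline_function(code_path: str, function_name: str):
    """Import the file at code_path and return the attribute named function_name."""
    # "user_pipeline" is an arbitrary module name chosen for this sketch.
    spec = importlib.util.spec_from_file_location("user_pipeline", code_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a module from {code_path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    try:
        return getattr(module, function_name)
    except AttributeError:
        raise AttributeError(
            f"{code_path!r} defines no function {function_name!r}"
        ) from None


# Demo with a throwaway file standing in for example_pipeline.py.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example_pipeline.py"
    path.write_text("def flipcoin_pipeline():\n    return 'pipeline definition'\n")
    pipeline_func = load_pipeline_function(str(path), "flipcoin_pipeline")

print(pipeline_func.__name__)
```

Here `code_path` and `function_name` play the roles of PIPELINE_CODE_PATH and PIPELINE_FUNCTION_NAME from the workflows in the Usage section.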

Usage

Example Workflow that uses this action

To compile a pipeline and upload it to Kubeflow:

name: Compile and Deploy Kubeflow pipeline
on: [push]

# Set environmental variables

jobs:
  build:
    runs-on: ubuntu-18.04
    steps:
    - name: checkout files in repo
      uses: actions/checkout@master


    - name: Submit Kubeflow pipeline
      id: kubeflow
      uses: NikeNano/kubeflow-github-action@master
      with:
        KUBEFLOW_URL: ${{ secrets.KUBEFLOW_URL }}
        ENCODED_GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GKE_KEY }}
        GOOGLE_APPLICATION_CREDENTIALS: /tmp/gcloud-sa.json
        CLIENT_ID: ${{ secrets.CLIENT_ID }}
        PIPELINE_CODE_PATH: "example_pipeline.py"
        PIPELINE_FUNCTION_NAME: "flipcoin_pipeline"
        PIPELINE_PARAMETERS_PATH: "parameters.yaml"
        EXPERIMENT_NAME: "Default"
        RUN_PIPELINE: False
        VERSION_GITHUB_SHA: False

To also run the pipeline after uploading it, use the following:

name: Compile, Deploy and Run on Kubeflow
on: [push]

# Set environmental variables

jobs:
  build:
    runs-on: ubuntu-18.04
    steps:
    - name: checkout files in repo
      uses: actions/checkout@master


    - name: Submit Kubeflow pipeline
      id: kubeflow
      uses: NikeNano/kubeflow-github-action@master
      with:
        KUBEFLOW_URL: ${{ secrets.KUBEFLOW_URL }}
        ENCODED_GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GKE_KEY }}
        GOOGLE_APPLICATION_CREDENTIALS: /tmp/gcloud-sa.json
        CLIENT_ID: ${{ secrets.CLIENT_ID }}
        PIPELINE_CODE_PATH: "example_pipeline.py"
        PIPELINE_FUNCTION_NAME: "flipcoin_pipeline"
        PIPELINE_PARAMETERS_PATH: "parameters.yaml"
        EXPERIMENT_NAME: "Default"
        RUN_PIPELINE: True
        VERSION_GITHUB_SHA: False
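
The two workflows differ only in the RUN_PIPELINE flag. Inputs given under `with:` reach an action as strings, and GitHub exposes each one as an environment variable named `INPUT_<NAME>`. The sketch below shows one tolerant way to read such a flag. It is only an illustration: the helper `input_flag` is hypothetical, and whether this action reads its flags this way is an assumption.

```python
import os


def input_flag(name: str, default: bool = False) -> bool:
    """Read a boolean-like action input from the INPUT_<NAME> environment variable."""
    value = os.environ.get(f"INPUT_{name}")
    if value is None:
        return default
    # Accept "True", "true", " TRUE " and so on; anything else counts as False.
    return value.strip().lower() == "true"


# Simulate what the second workflow would pass in.
os.environ["INPUT_RUN_PIPELINE"] = "True"
os.environ["INPUT_VERSION_GITHUB_SHA"] = "False"

run_pipeline = input_flag("RUN_PIPELINE")
version_with_sha = input_flag("VERSION_GITHUB_SHA")
print(run_pipeline, version_with_sha)
```

Comparing case-insensitively avoids surprises if the value arrives as "true" rather than "True".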

The repo also contains an example where the containers in the pipeline are versioned with the GitHub commit hash, which makes operations and error tracking easier. However, this requires the pipeline function to be wrapped in a function with one argument:

  def pipeline(github_sha: str):
      ...

The containers are then versioned with the hash:

  pre_image = f"gcr.io/{project}/pre_image:{github_sha}"
  train_forecast_image = f"gcr.io/{project}/train_forecast_image:{github_sha}"

For a complete example, see the example pipeline in the repo.
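
The wrapping pattern can be sketched in plain Python as follows. This is a simplified illustration, not the repo's example: the project ID `my-gcp-project`, the helper `image_names`, and the inner function are placeholders, and the assumption that the wrapper returns the inner pipeline function is mine. A real pipeline would define its Kubeflow steps where the comment indicates.

```python
import os


def image_names(project: str, github_sha: str) -> dict:
    """Build container image references tagged with the commit hash."""
    return {
        "pre_image": f"gcr.io/{project}/pre_image:{github_sha}",
        "train_forecast_image": f"gcr.io/{project}/train_forecast_image:{github_sha}",
    }


def pipeline(github_sha: str):
    """Outer function with the single github_sha argument; returns the inner pipeline."""
    # "my-gcp-project" is a placeholder project ID.
    images = image_names(project="my-gcp-project", github_sha=github_sha)

    def inner_pipeline():
        # A real pipeline would create its container steps here using `images`.
        return images

    return inner_pipeline


# GitHub Actions sets GITHUB_SHA; fall back to a placeholder when run locally.
sha = os.environ.get("GITHUB_SHA", "local-dev")
print(pipeline(sha)()["pre_image"])
```

Because the hash is baked into every image tag, a failing run can be traced back to the exact commit that built its containers.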

Mandatory inputs

  1. KUBEFLOW_URL: The URL of your Kubeflow deployment.
  2. ENCODED_GOOGLE_APPLICATION_CREDENTIALS: The base64-encoded key of a service account with access to Kubeflow and rights to deploy (stored as the secret GKE_KEY in the workflows above). Encode the key file with:
     cat path-to-key.json | base64
  3. GOOGLE_APPLICATION_CREDENTIALS: The path where the credentials decoded from ENCODED_GOOGLE_APPLICATION_CREDENTIALS are stored.
  4. CLIENT_ID: The client ID of the IAP OAuth client.
  5. PIPELINE_CODE_PATH: The full path to the Python file containing the pipeline.
  6. PIPELINE_FUNCTION_NAME: The name of the pipeline function in the PIPELINE_CODE_PATH file.
  7. PIPELINE_PARAMETERS_PATH: The path to the file with the pipeline parameters.
  8. EXPERIMENT_NAME: The name of the Kubeflow experiment within which the pipeline should run.
  9. RUN_PIPELINE: Set to "True" to also run the pipeline after uploading it.
  10. VERSION_GITHUB_SHA: Set to "True" if the pipeline containers are versioned with the GitHub commit hash.
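
The relationship between the base64-encoded key and GOOGLE_APPLICATION_CREDENTIALS can be sketched as below: the encoded value is decoded and written to the given path, which Google client libraries then read through the GOOGLE_APPLICATION_CREDENTIALS environment variable. This is a minimal sketch under assumptions, not the action's code: the helper `write_credentials` is hypothetical, and the key contents are fake.

```python
import base64
import json
import os
import tempfile


def write_credentials(encoded_key: str, target_path: str) -> None:
    """Decode a base64 service account key and write it to target_path."""
    # b64decode ignores the line breaks that `base64` adds to long output.
    decoded = base64.b64decode(encoded_key)
    json.loads(decoded)  # fail early if the decoded bytes are not valid JSON
    with open(target_path, "wb") as handle:
        handle.write(decoded)
    os.chmod(target_path, 0o600)  # keep the key readable by the owner only


# Demo with a fake key; a real one comes from the GKE_KEY secret.
fake_key = {"type": "service_account", "project_id": "my-gcp-project"}
encoded = base64.b64encode(json.dumps(fake_key).encode()).decode()

target = os.path.join(tempfile.mkdtemp(), "gcloud-sa.json")
write_credentials(encoded, target)
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = target

with open(target) as handle:
    restored = json.load(handle)
print(restored["type"])
```

Validating the JSON before writing catches a secret that was pasted without encoding, or truncated, before any call to Kubeflow is attempted.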

Future work

Add support for scheduling pipeline runs. This is almost done!