Kubeflow Pipelines is a platform for deploying portable, scalable machine learning workflows on Kubernetes. This document walks through the steps to install and configure Kubeflow Pipelines on a private OpenStack cloud platform.
The following steps walk through the configuration of GPU passthrough for a Kubernetes cluster running on OpenStack with NVIDIA devices. If the Kubernetes deployment is already configured to enable GPUs, please proceed to the [next section](#install-kubeflow).
Follow the platform-specific mechanism for enabling GPU passthrough on OpenStack compute nodes.
Follow the NVIDIA documentation to install and enable the cuda-drivers package and the nvidia-container-runtime on each GPU node.
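Once the drivers and container runtime are installed, a quick sanity check on a GPU node might look like the following. This is a sketch, not part of the official steps; the CUDA image tag is an assumption, so pick one that matches your installed driver version:

```shell
# Should list the passed-through GPU(s) if the driver installed correctly
nvidia-smi

# Confirm the container runtime can expose the GPU to containers
# (example image tag; choose a CUDA version matching your driver)
docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi
```

If both commands print the GPU table, the node is ready to schedule GPU workloads.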
The following steps can be run from any host; they do not need to be run on the Kubernetes cluster where you want to run your workloads.
On Linux:

```shell
curl -L -O https://github.com/kubeflow/kubeflow/releases/download/v0.5.0/kfctl_v0.5.0_linux.tar.gz
tar xf kfctl_v0.5.0_linux.tar.gz
sudo mv kfctl /usr/bin/
rm -f kfctl_v0.5.0_linux.tar.gz
```

If running from a Mac:

```shell
curl -L -O https://github.com/kubeflow/kubeflow/releases/download/v0.5.0/kfctl_v0.5.0_darwin.tar.gz
tar xf kfctl_v0.5.0_darwin.tar.gz
sudo mv kfctl /usr/bin/
rm -f kfctl_v0.5.0_darwin.tar.gz
```
```shell
alias python=python3
pip3 install https://storage.googleapis.com/ml-pipeline/release/0.1.17/kfp.tar.gz --upgrade
```
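To confirm the SDK installed correctly, a quick check (not part of the official steps) is to verify that the `kfp` package imports and that the DSL compiler entry point is on your `PATH`:

```shell
# Verify the kfp SDK imports cleanly
python3 -c "import kfp"

# Verify the DSL compiler entry point installed with the SDK is available
command -v dsl-compile
```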
If running the DSL compiler and kfctl from a different host than the Kubernetes cluster, make sure your kubeconfig points to that cluster.
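For example, pointing this shell at the target cluster might look like the following (the kubeconfig path is an example; substitute the file for your own cluster):

```shell
# Point kubectl (and therefore kfctl) at the target cluster (example path)
export KUBECONFIG=$HOME/.kube/openstack-cluster.conf

# Confirm you are talking to the intended cluster
kubectl config current-context
kubectl get nodes
```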
kf-demo is the name we are giving our sample application; it can be replaced with any name of your choice.
```shell
export KFAPP=kf-demo
kfctl init ${KFAPP} --version e6a363a51a4963a624758c931b5a13e56d98d8e0
cd ${KFAPP}
kfctl generate k8s
kfctl apply k8s
```
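Once `kfctl apply` finishes, it is worth confirming that the Kubeflow components have come up before moving on. This assumes the default `kubeflow` namespace:

```shell
# All pods in the kubeflow namespace should eventually reach Running status
kubectl -n kubeflow get pods
```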
Some of the steps in a Kubeflow pipeline may share common data. To facilitate sharing of data from one stage of the pipeline to the next, we need a persistent volume store that allows ReadWriteMany access. If your storage provisioner does not support ReadWriteMany, one solution is to deploy an NFS server using the ReadWriteOnce storage that is provided by the default storage provisioner. You can do it as follows:
```shell
helm repo update
helm install stable/nfs-server-provisioner --name kf --set=persistence.enabled=true,persistence.storageClass=standard,persistence.size=200Gi
```
Please note that the above assumes your default storage class is named standard, which is the case on GKE and some on-prem deployments. On AKS, the default storage class is named default, so you need to run the following instead:
```shell
helm repo update
helm install stable/nfs-server-provisioner --name kf --set=persistence.enabled=true,persistence.storageClass=default,persistence.size=200Gi
```
Please see the nfs-server-provisioner Helm chart documentation for more configuration options.
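With the provisioner in place, ReadWriteMany volumes can be requested against the storage class it exposes. The following claim is an illustrative sketch only, separate from the required manifests below; it assumes the chart's default exported storage class name of `nfs` (adjust if you overrode it):

```shell
# Sketch: request a ReadWriteMany volume from the NFS provisioner
# (storage class "nfs" is the chart's default; names and size are examples)
kubectl -n kubeflow apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs
  resources:
    requests:
      storage: 10Gi
EOF
```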
Create the Persistent Volume Claims required by MySQL and Argo:
```shell
kubectl -n kubeflow apply -f pvc/
```
You are ready to deploy Kubeflow Pipelines at this point. Kubeflow Pipelines comes with built-in examples that you can try out to get started.
The example we've used in the demo is similar to this tutorial and uses this model.
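As a sketch of the workflow once Pipelines is up: a pipeline written with the kfp DSL can be compiled into an archive and then uploaded through the Pipelines UI. The file names here are placeholders, not files shipped with this guide:

```shell
# Compile a DSL pipeline definition (placeholder file names) into an
# archive that can be uploaded via the Kubeflow Pipelines UI
dsl-compile --py my_pipeline.py --output my_pipeline.tar.gz
```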