This guide shows you how to get working Kubernetes clusters on an SCS cloud via cluster-api.
Cluster API requires an existing Kubernetes cluster. It is built with kind on an OpenStack instance created via Terraform. This instance can be used later on for the management of the newly created cluster, or for creating additional clusters.
Basically, this repository covers two topics:
- Automation (terraform, Makefile) to bootstrap a cluster-API management node by installing kind on a vanilla Ubuntu image and deploying some tools on this node (kubectl, openstack CLI tools, k9s, cilium, calico, helm, flux, ...), deploying [cluster-API](https://cluster-api.sigs.k8s.io/) (`clusterctl`) and the OpenStack cluster-api provider along with suitable credentials. The terraform automation is driven by a Makefile for convenience. The tooling also contains all the logic to clean up again. The newly deployed node clones this git repository early in the bootstrap process and uses the received files to set up the management cluster and scripts.
- This node can be connected to via ssh, and the scripts deployed there can be used to manage workload clusters and then deploy various standardized tools (such as the OpenStack Cloud Controller Manager (OCCM), cinder CSI, calico or cilium CNI, nginx ingress controller, cert-manager, ...) and run tests (e.g. CNCF conformance with sonobuoy).
The tools and artifacts can be updated via `git pull` at any time and the updated settings rolled out to the workload clusters. Note that the script collection will eventually be superseded by the capi-helm-charts. The medium-term goal is to create a reconciliation loop here that performs life-cycle management for clusters according to the cluster configuration stored in an enhanced cluster-api style clusterctl.yaml from git repositories and thus allows a pure gitops style cluster management without ever ssh'ing to the management node.
Creating and scaling k8s clusters on demand provides a lot of flexibility to DevOps teams that develop, test, deploy and operate services and applications.
We expect the functionality to be mainly consumed in two scenarios:
- Self-service: The DevOps team leverages the code provided by this repository to create their own CAPI management server and then uses it to manage a number of k8s clusters for their own needs.
- Managed k8s: The operator's service team creates the CAPI management server and uses it to provide managed k8s clusters for their clients.
Note that we have an intermediate model in mind -- a model where a one-click / one-API call interface would allow the management server to be created on behalf of a user and then serve as an API endpoint to that user's k8s capi needs. Ideally with some dashboard or GUI that would shield less experienced users from all the YAML.
Once we have the gitops style cluster control working, the self-service model will become more convenient to use.
The preparations are done on a deployment host, possibly a tiny jumphost-style VM, or some Linux/MacOS/WSL laptop.
- Terraform must be installed (https://learn.hashicorp.com/tutorials/terraform/install-cli).
- You need to have `yq` (python3-yq or yq snap) and GNU make installed.
- You must have credentials to access the cloud. terraform will look for `clouds.yaml` and `secure.yaml` in the current working directory, in `~/.config/openstack/` and `/etc/openstack` (in this order), just like the openstack client (https://docs.openstack.org/python-openstackclient/latest/configuration/index.html#clouds-yaml).
- As the `v3applicationcredential` `auth_type` plugin is being used, we hit a bug in Ubuntu 20.04, which ships python3-keystoneauth < 4.2.0 and fails with unversioned `auth_url` endpoints. (See OpenStack bug 1876317.) While we try to patch the bug away in the deployed instance, the patching mechanism is not very robust, so we recommend you have a versioned `auth_url` endpoint (with a trailing `/v3`).
- Copy the environments sample file from environments/environment-default.tfvars to `environments/environment-<yourcloud>.tfvars` and provide the necessary information like machine flavor or machine image. You can comment out all lines where the defaults match your needs. (See below for more details on this.)
- Pass `ENVIRONMENT=<yourcloud>` to the `make` command or export `ENVIRONMENT` from your shell's environment. If the name of the environment equals the name of the cloud as specified in your `clouds.yaml`, you can also just set `OS_CLOUD` in your shell's environment. (You can also edit the default in the Makefile, though we don't recommend this.)
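
For example, a minimal sketch of these preparation steps (the cloud name `mycloud` is a placeholder for your own environment):

```bash
# copy and adapt the environment settings for your cloud
cp environments/environment-default.tfvars environments/environment-mycloud.tfvars
${EDITOR:-vi} environments/environment-mycloud.tfvars   # flavor, image, node counts, ...

# make the environment known to the Makefile (OS_CLOUD also works if the
# environment name matches the clouds.yaml entry name)
export ENVIRONMENT=mycloud
```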
`make create`

This will create the management host. It creates an application credential, networks, security groups and a virtual machine which gets bootstrapped by cloning this git repository, installing some tools and a local kubernetes cluster (with kind), where the cluster API provider will be installed and which will provide the API server for the k8s CAPI. If the number of control nodes `controller_count` in your config (`environment-<yourcloud>.tfvars`) is zero, then that's all that is done. Otherwise, a testcluster will be created using k8s CAPI.

The subsequent management of the clusters is best done from the management host VM, as it has all the tools deployed there and config files can be edited and resubmitted to the kubernetes kind cluster for reconciliation. To log in to this management machine via ssh, you can issue `make ssh`.
You can create and do life cycle management for many more clusters from this management node. The kubeconfig with `admin` power for the created testcluster is named `testcluster/testcluster.yaml` (or `$CLUSTER_NAME/$CLUSTER_NAME.yaml` for all the other clusters) and can be handed out to users that should get full administrative control over the cluster. You can also retrieve them using `make get-kubeconfig TESTCLUSTER=${CLUSTER_NAME}`, and possibly create an encrypted .zip file for handing these out. (You can omit `TESTCLUSTER=...` for the default testcluster.)
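
A hedged example of using such a kubeconfig (assuming the default testcluster and that the file has been retrieved or is used directly on the management node):

```bash
# talk to the workload cluster as cluster-admin
export KUBECONFIG=~/testcluster/testcluster.yaml
kubectl get nodes -o wide
kubectl get pods -A
```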
`make clean` does ssh to the C-API management server to clean up the created clusters prior to terraform cleaning up the resources it has created. Unfortunately, this is sometimes insufficient; errors in the deployment may result in resources being left around.

`make fullclean` uses a custom script (based on the openstack CLI) to clean up, trying not to touch any resources that were not created by capi or terraform. It is the recommended way of cleaning up if `make clean` fails. Watch out for leftover floating IP addresses and persistent volumes, as these can not easily be traced back to the cluster-API created resources and may thus be left behind.
You can purge the whole project via `make purge`. Be careful with that command, as it will purge all resources in the OpenStack project, even those that have not been created through this Terraform script or the cluster API. It requires the `ospurge` script. Install it with `python3 -m pip install git+https://git.openstack.org/openstack/ospurge`.
Note that `clean` and `fullclean` leave the `ubuntu-capi-image-$KUBERNETES_VERSION` image registered, so it can be reused. You need to manually unregister it if you want your next deployment to register a new image with the same kubernetes version number.
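
For example, a hedged sketch of unregistering such an image with the openstack CLI (the version suffix is hypothetical, check `openstack image list` for the actual name):

```bash
# list the registered capi images and delete the one you want re-registered
openstack image list | grep ubuntu-capi-image
openstack image delete ubuntu-capi-image-v1.25.3
```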
`make ssh`

`create_cluster.sh <CLUSTER_NAME>`

This will copy the defaults from `~/cluster-defaults/` into a directory with your cluster name and then ask cluster-api to create the cluster. The scripts also take care of security groups, anti-affinity, node image registration (if needed) and of deploying CCM, CNI, CSI as well as optional services such as the metrics server or the nginx-ingress controller.

You can access the new cluster with `kubectl --context clustername-admin@clustername` or `KUBECONFIG=~/clustername/clustername.yaml kubectl`. The management cluster is in context `kind-kind`.

Please see the Maintenance and Troubleshooting Guide in the `doc/` directory:
https://github.com/SovereignCloudStack/k8s-cluster-api-provider/blob/main/doc/Maintenance_and_Troubleshooting.md
For the plusserver community environment, you can choose:
`export ENVIRONMENT=gx-scs`
or set inside the Makefile:
`ENVIRONMENT=gx-scs`

For the wavestack environment, you can choose:
`export ENVIRONMENT=gx-wavestack`
or set inside the Makefile:
`ENVIRONMENT=gx-wavestack`

A clouds.yaml and secure.yaml are needed for the environments inside the terraform folder.
You can use this repository as a starting point for some automation, e.g. adding kubernetes manifests to the cluster or running custom shell scripts at the end. To do so, place your files in the `terraform/extension` folder. They will be uploaded to the management cluster. Files ending in `*.sh` will be executed in alphabetical order. All other files will just be uploaded. If you want to deploy resources in the new cluster-api-maintained cluster, you can use `kubectl apply -f <your-manifest.yaml> --kubeconfig ~/$CLUSTER_NAME/$CLUSTER_NAME.yaml` to do so.
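
A hedged illustration of this extension mechanism (file names are hypothetical, and the working directory of the extension scripts on the management node may differ):

```bash
# a manifest that should end up in the workload cluster -- it is only uploaded
cp team-namespace.yaml terraform/extension/

# a script that applies it; *.sh files are executed in alphabetical order
cat > terraform/extension/50-apply-team-namespace.sh <<'EOF'
#!/bin/bash
# assumes the default testcluster has been created during bootstrap
kubectl apply -f team-namespace.yaml \
  --kubeconfig ~/testcluster/testcluster.yaml
EOF
```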
The terraform creates an application credential that it passes into the created VM. This one is then used to authenticate the cluster API provider against the OpenStack API to allow it to create resources needed for the k8s cluster.
The AppCredential has a few advantages:
- We take out variance in how the authentication works -- we don't have to deal with a mixture of project_id, project_name, project_domain_name, user_domain_name, only a subset of which is needed depending on the cloud.
- We do not leak the user credentials into the cluster, making any security breach easier to contain.
- AppCreds are connected to one project and can be revoked.
We are using an unrestricted AppCred which can create further AppCreds, so we can give each cluster its own (restricted) credentials. This is not yet implemented, but will be soon. Currently, all clusters created from the management node belong to the same OpenStack project and use the same credentials.
The plan for the future is to create AppCreds per cluster (see #109), so credentials for individual clusters can be revoked.
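
For illustration, a hedged sketch of what such an application credential looks like on the CLI (the terraform automation creates and wires it up for you; the name is hypothetical):

```bash
# create an unrestricted application credential in the current project
openstack application credential create --unrestricted capi-mgmt-appcred
# it can later be revoked, which invalidates access for everything using it
openstack application credential delete capi-mgmt-appcred
```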
You can use `make ssh` to log in to the C-API management node. There you can issue `clusterctl` and `kubectl` (aliased to `k`) commands. The context `kind-kind` is used for the C-API management, while the context `testcluster-admin@testcluster` can be used to control the workload cluster `testcluster`. You can of course create many of them. There are management scripts on the management node:
- In the user's (ubuntu) home directory, create a subdirectory with the `CLUSTERNAME` to hold your cluster's configuration data. Copy over the `clusterctl.yaml` file from `~/cluster-defaults/` and edit it to meet your needs. Note that you can also copy over `cloud.conf` and `cluster-template.yaml` and adjust them, but you don't need to. (If you don't create the subdirectory, the `create_cluster.sh` script will do so for you and use all default settings.)
- `create_cluster.sh CLUSTERNAME`: Use this command to create a cluster with the settings from `~/$CLUSTERNAME/clusterctl.yaml`. More precisely, it uses the template `$CLUSTERNAME/cluster-template.yaml` and fills in the settings from `$CLUSTERNAME/clusterctl.yaml` to render a config file `$CLUSTERNAME/$CLUSTERNAME-config.yaml` which will then be submitted to the capi server (`kind-kind` context) for creating the control plane nodes and worker nodes. The script will also apply the openstack integration, cinder CSI, calico or cilium CNI, and optionally also the metrics server, nginx ingress controller, flux and cert-manager. (These can be controlled by `DEPLOY_XXX` variables, see below. Defaults can be preconfigured from the environment.tfvars file during management node creation.) Note that `CLUSTERNAME` defaults to `testcluster` and must not contain whitespace. The script also makes sure that appropriate capi images are available (it grabs them from OSISM as needed and registers them with OpenStack, following the SCS image metadata standard). The script returns once the control plane is fully working (the worker nodes might still be under construction). The kubectl file to talk to this cluster (as admin) can be found in `~/$CLUSTERNAME/$CLUSTERNAME.yaml`. Expect the cluster creation to take ~8 minutes. You can pass `--context=${CLUSTERNAME}-admin@$CLUSTERNAME` to `kubectl` (with the default `~/.kube/config` config file) or `export KUBECONFIG=$CLUSTERNAME.yaml` to talk to the workload cluster.
- The subdirectory `~/$CLUSTERNAME/deployed-manifests.d/` will contain the deployed manifests for reference (and, in case of nginx-ingress, also to facilitate a full cleanup).
- The `clusterctl.yaml` file can be edited and the `create_cluster.sh` script called again to submit the changes. (If you have not made any changes, re-running the script is harmless.) Note that `create_cluster.sh` does not currently remove any of the previously deployed services/deployments from the workload clusters -- this will be added later with the appropriate care and warnings. Also note that not all changes are allowed. You can easily change the number of nodes or add k8s services to a cluster. For changing machine flavors, machine images, kubernetes versions, ... you will need to also increase the `CONTROL_PLANE_MACHINE_GEN` or the `WORKER_MACHINE_GEN` counter to add a different suffix to these read-only resources. This will cause Cluster-API to orchestrate a rolling upgrade for you on rollout. (This is solved more elegantly in the helm chart style cluster management, see below.) A hedged example of such a rolling change is sketched after this list.
- The directory `~/k8s-cluster-api-provider/` contains a checked-out git tree from the SCS project. It can be updated (`git pull`) to receive the latest fixes and improvements. This way, most incremental updates do not need the recreation of the management node (and thus also not the recreation of your managed workload clusters), but can be applied to the workload clusters by calling `create_cluster.sh` again.
- The installation of the openstack integration, cinder CSI, metrics server and nginx ingress controller is done via the `bin/apply_*.sh` scripts that are called from `create_cluster.sh`. You can manually call them as well -- they take the cluster name as argument. (It's better to just call `create_cluster.sh` again, though.) The applied yaml files are collected in `~/$CLUSTERNAME/deployed-manifests.d/`. You can `kubectl delete -f` them to remove the functionality again.
- You can of course also delete the cluster and create a new one if that level of disruption is fine for you. (See below in "Advanced cluster templating with helm" to get an idea how we want to make this more convenient in the future.)
- Use `kubectl get clusters` in the `kind-kind` context to see what clusters exist. Use `kubectl get all -A` in the `testcluster-admin@testcluster` context to get an overview over the state of your workload cluster. You can access the logs from the capo controller in case you have trouble with cluster creation.
- `delete_cluster.sh [CLUSTERNAME]`: Tell the capi mgmt server to remove the cluster `$CLUSTERNAME`. It will also remove persistent volume claims belonging to the cluster. The script will return once the removal is done.
- `cleanup.sh`: Remove all running clusters.
- `add_cluster-network.sh CLUSTERNAME` adds the management node to the node network of the cluster `CLUSTERNAME`, assuming that it runs on the same cloud (region). `remove_cluster-network.sh` undoes this again. This is useful for debugging purposes.
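
To make the generation-counter mechanism above more concrete, here is a hedged sketch of a rolling worker upgrade for a hypothetical cluster `mycluster` (the editor step stands in for whatever setting you change):

```bash
# on the management node:
cd ~/mycluster
# e.g. change OPENSTACK_NODE_MACHINE_FLAVOR and bump WORKER_MACHINE_GEN
# (genw01 -> genw02) so new machine templates are created and rolled out
${EDITOR:-vi} clusterctl.yaml
# re-render and re-submit the cluster configuration
~/bin/create_cluster.sh mycluster
```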
For your convenience, `k9s` is installed on the management node as well as `calicoctl`, `cilium`, `hubble`, `cmctl`, `helm` and `sonobuoy`. These binaries can all be found in `/usr/local/bin`, while the helper scripts have been deployed to `~/bin/`.
While the scripts all use a default `testcluster`, they have been developed and tested to manage many clusters from a single management node. Copy the `~/cluster-defaults/clusterctl.yaml` file to `~/MYCLUSTER/clusterctl.yaml` and edit the copy to describe the properties of the cluster to be created. Then use `./create_cluster.sh MYCLUSTER` to create a workload cluster with the name `MYCLUSTER`. You will find the kubeconfig file in `~/MYCLUSTER/MYCLUSTER.yaml`, granting its owner admin access to that cluster. Likewise, `delete_cluster.sh` and the `apply_*.sh` scripts take a cluster name as parameter. This way, dozens of clusters can be controlled from one management node.

You can add credentials from different projects into `~/.config/openstack/clouds.yaml` and reference them in the `OPENSTACK_CLOUD` setting in `clusterctl.yaml`, this way managing clusters in many different projects and even clouds from one management host.
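
A hedged sketch of using a second cloud/project this way (the `clouds.yaml` entry name `othercloud` and the cluster name `othercluster` are placeholders):

```bash
# on the management node: prepare a cluster that lives in another project/cloud
mkdir -p ~/othercluster
cp ~/cluster-defaults/clusterctl.yaml ~/othercluster/
# point it at the other clouds.yaml entry, e.g. set OPENSTACK_CLOUD: othercloud
${EDITOR:-vi} ~/othercluster/clusterctl.yaml
./create_cluster.sh othercluster
```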
To test the created k8s cluster, there are several tools available. Apply all commands to the testcluster context (by passing the appropriate `--context` setting to `kubectl` or by using the right `KUBECONFIG` file).

- Looking at all pods (`kubectl get pods -A`) to see that they all come up (and don't suffer excessive restarts) is a good first check (see the smoke-test sketch after this list). Look at the pod logs to investigate any failures.
- You can create a very simple deployment with the provided `kuard.yaml`, which is an example taken from the O'Reilly book by B. Burns, J. Beda and K. Hightower, "Kubernetes Up & Running", enhanced to also use a persistent volume.
- You can deploy Google's demo microservice application.
- `sonobuoy` runs a subset of the k8s tests, providing a simple way to filter the >5000 existing test cases to only run the CNCF conformance tests or to restrict testing to non-disruptive tests. The `sonobuoy.sh` wrapper helps with calling it. There are also `Makefile` targets `check-*` that call various sonobuoy test sets. This is how we call sonobuoy for our CI tests.
- You can use `cilium connectivity test` to check whether your cilium CNI is working properly. You might need to enable hubble to get a fully successful result.
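
For example, a hedged quick smoke test against the default testcluster (the location of `kuard.yaml` on the management node may differ):

```bash
export KUBECONFIG=~/testcluster/testcluster.yaml
kubectl get pods -A             # everything should become Running/Completed
kubectl apply -f kuard.yaml     # simple demo deployment with a persistent volume
kubectl get pods,pvc            # watch the kuard pod and its volume claim appear
```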
As of 9/2022, our tests cover 1.21.latest ... 1.25.latest. All of them pass the sonobuoy CNCF conformance tests.
There is an upgrade guide in doc/Upgrade-Guide.md: https://github.com/SovereignCloudStack/k8s-cluster-api-provider/blob/main/doc/Upgrade-Guide.md
While testing clusters with >= 3 control nodes, we have observed occasional transient error messages reporting an etcd leader change that prevented a command from succeeding. This could result in a dozen random failed tests in a sonobuoy conformance run. (Retrying the commands would let them succeed.)

Too frequent etcd leader changes are detrimental to your control plane performance and can lead to transient failures. They are a sign that the infrastructure supporting your cluster introduces too high latencies (>100ms in the default configuration, which we don't change by default, see below).

We recommend deploying the control nodes (which run etcd) on instances with local SSD storage (which we reflect in the default flavor name), using flavors with dedicated cores, and making sure that your network does not introduce latencies through significant packet drop.

We now always use a slower heartbeat (250ms) and increased CPU and IO priority, which used to be controlled by `ETCD_PRIO_BOOST`. This is safe.
If you build multi-controller clusters and can not use a flavor with low-latency local storage (ideally SSD), you can also work around this with `ETCD_UNSAFE_FS`. `ETCD_UNSAFE_FS` uses the `barrier=0` mount option, which violates filesystem ordering guarantees. This works around storage latencies, but introduces the risk of inconsistent filesystem state and inconsistent etcd data in case of an unclean shutdown. You may be able to live with this risk in a multi-controller etcd setup.

If you don't have flavors that fulfill the requirements (low-latency storage attached), your choice is between a single-controller cluster (without `ETCD_UNSAFE_FS`) and a multi-controller cluster with `ETCD_UNSAFE_FS`. Neither option is perfect, but we deem the multi-controller cluster preferable in such a scenario.
The provided `cluster-template.yaml` assumes that all control nodes on the one hand and all worker nodes on the other are equal. They are in the same cloud within the same availability zone, using the same flavor. Cluster API allows k8s clusters to have varying flavors, span availability zones and even clouds. For this, you can create an advanced cluster-template with more different machine descriptions and potentially several secrets. Depending on your changes, the logic in `create_cluster.sh` might also need enhancements to handle this. Extending this is not hard and we're happy to hear from your use cases and take patches.

However, we are currently investigating the use of helm templating for anything beyond the simple use cases instead; see the next chapter.
On the management node, we have not only helm installed, but also the repository https://github.com/stackhpc/capi-helm-charts checked out. Amongst other things, it automates the creation of new machine templates when needed and performs rolling updates on your k8s cluster with clusterctl. This allows for an easy adaptation of your cluster to different requirements, new k8s versions etc.
Please note that this is currently evolving quickly and we have not completely assessed and tested the capabilities. We intend to do this after R2 and eventually recommend this as the standard way of managing clusters in production. At this point, it's included as a technical preview.
The provenance `capo` means that this setting comes from the templates used by the cluster-api-provider-openstack, while `SCS` denotes that this setting has been added by the SCS project.
Parameters controlling the Cluster-API management node (capi mgmt node) creation:
| environment | clusterctl.yaml | provenance | default | meaning |
|---|---|---|---|---|
| `prefix` | | SCS | `capi` | Prefix used for OpenStack resources for the capi mgmt node |
| `kind_flavor` | | SCS | `SCS-1V:4:20` | Flavor to be used for the k8s capi mgmt node |
| `image` | | SCS | `Ubuntu 20.04` | Image to be deployed for the capi mgmt node |
| `ssh_username` | | SCS | `ubuntu` | Name of the default user for the image |
| `clusterapi_version` | | SCS | `1.2.2` | Version of the cluster-API incl. clusterctl |
| `capi_openstack_version` | | SCS | `0.6.3` | Version of the cluster-api-provider-openstack (needs to fit the capi version) |
Parameters controlling both management node creation and cluster creation:
| environment | clusterctl.yaml | provenance | default | meaning |
|---|---|---|---|---|
| `cloud_provider` | `OPENSTACK_CLOUD` | capo | `OS_CLOUD` | Name of the cloud in `clouds.yaml` |
| `external` | `OPENSTACK_EXTERNAL_NETWORK_ID` | capo | | Name/ID of the external (public) OpenStack network |
| `dns_nameservers` | `OPENSTACK_DNS_NAMESERVERS` | capo | `[ "5.1.66.255", "185.150.99.255" ]` | Array of nameservers for capi mgmt nodes and for cluster nodes, replace the FF MUC defaults with local servers if available |
| `availability_zone` | `OPENSTACK_FAILURE_DOMAIN` | capo | | Availability Zone(s) for the mgmt node / workload clusters |
| `kind_mtu` | `MTU_VALUE` | SCS | `0` | MTU for the mgmt node; Calico is set 50 bytes smaller; 0 means autodetection |
Parameters controlling the cluster creation:
| environment | clusterctl.yaml | provenance | default | meaning |
|---|---|---|---|---|
| `node_cidr` | `NODE_CIDR` | SCS | `10.8.0.0/20` | IPv4 address range (CIDR notation) for workload nodes |
| `use_cilium` | `USE_CILIUM` | SCS | `false` | Use cilium as CNI instead of calico |
| `calico_version` | | SCS | `v3.24.1` | Version of the Calico CNI provider (ignored if use_cilium is set) |
| `kubernetes_version` | `KUBERNETES_VERSION` | capo | `v1.23.x` | Kubernetes version deployed into workload cluster (.x means latest patch release) |
| | `OPENSTACK_IMAGE_NAME` | capo | `ubuntu-capi-image-${KUBERNETES_VERSION}` | Image name for k8s controller and worker nodes |
| `kube_image_raw` | `OPENSTACK_IMAGE_RAW` | SCS | `true` | Register images in raw format (instead of qcow2), good for ceph COW |
| `image_registration_extra_flags` | `OPENSTACK_IMAGE_REGISTATION_EXTRA_FLAGS` | SCS | `""` | Extra flags passed during image registration |
| | `OPENSTACK_CONTROL_PLANE_IP` | capo | `127.0.0.1` | Use localhost to talk to capi cluster (don't change on capi mgmt node) |
| | `OPENSTACK_SSH_KEY_NAME` | capo | `${prefix}-keypair` | SSH key name generated and used to connect to workload cluster nodes |
| `controller_flavor` | `OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR` | capo | `SCS-2D:4:20s` | Flavor to be used for control plane nodes |
| `worker_flavor` | `OPENSTACK_NODE_MACHINE_FLAVOR` | capo | `SCS-2V:4:20` | Flavor to be used for worker nodes |
| `controller_count` | `CONTROL_PLANE_MACHINE_COUNT` | capo | `1` | Number of control plane nodes in testcluster (0 skips testcluster creation) |
| | `CONTROL_PLANE_MACHINE_GEN` | SCS | `genc01` | Suffix for control plane node resources, to be changed for rolling upgrades |
| `worker_count` | `WORKER_MACHINE_COUNT` | capo | `3` | Number of worker nodes in testcluster |
| | `WORKER_MACHINE_GEN` | SCS | `genw01` | Suffix for worker node resources, to be changed for rolling upgrades |
| `anti_affinity` | `OPENSTACK_ANTI_AFFINITY` | SCS | `true` | Use anti-affinity server groups to prevent k8s nodes on same host (soft for workers, hard for controllers) |
| | `OPENSTACK_SRVGRP_CONTROLLER` | SCS | `nonono` | Autogenerated if `anti_affinity` is `true`, eliminated otherwise |
| | `OPENSTACK_SRVGRP_WORKER` | SCS | `nonono` | Autogenerated if `anti_affinity` is `true`, eliminated otherwise |
| `deploy_occm` | `DEPLOY_OCCM` | SCS | `true` | Deploy the given version of OCCM into the cluster. `true` (default) chooses the latest version matching the k8s version. You can specify `master` to choose the upstream master branch. Don't disable this. |
| `deploy_cindercsi` | `DEPLOY_CINDERCSI` | SCS | `true` | Deploy the given version of cinder CSI (or the latest matching version for the default `true`). |
| `etcd_prio_boost` | `ETCD_PRIO_BOOST` | SCS | `true` | Ignored: We always use a longer heartbeat (250ms) / election timeout (2.5s) and higher prio IO and CPU now. (Legacy setting.) |
| `etcd_unsafe_fs` | `ETCD_UNSAFE_FS` | SCS | `false` | Use `barrier=0` for the filesystem on control nodes to avoid storage latency. Use for multi-controller clusters on slow/networked storage, otherwise not recommended. |
| `testcluster_name` | (cmd line) | SCS | `testcluster` | Allows setting the default cluster name, created at bootstrap (if `controller_count` is larger than 0) |
Optional services deployed to cluster:
| environment | clusterctl.yaml | provenance | default | script | meaning |
|---|---|---|---|---|---|
| `deploy_metrics` | `DEPLOY_METRICS` | SCS | `true` | `apply_metrics.sh` | Deploy metrics service to nodes to make `kubectl top` work |
| `deploy_nginx_ingress` | `DEPLOY_NGINX_INGRESS` | SCS | `true` | `apply_nginx_ingress.sh` | Deploy NGINX ingress controller (this spawns an OpenStack Loadbalancer); pass a version to explicitly choose the version, `true` results in `v1.3.0` (`v1.0.2` for k8s <= 1.19) |
| | `NGINX_INGRESS_PROXY` | SCS | `false` | (ditto) | Configure LB and nginx to get the real IP via the PROXY protocol; may cause trouble for pod to LB connections |
| `deploy_cert_manager` | `DEPLOY_CERT_MANAGER` | SCS | `false` | `apply_cert_manager.sh` | Deploy cert-manager; pass a version (e.g. `v1.9.1`) to explicitly choose a version |
| `deploy_flux` | `DEPLOY_FLUX` | SCS | `false` | | Deploy flux2 into the cluster |
- Opt-in for per-cluster project (extends #109)
- Allow service deletion from `create_cluster.sh` (#137, see also #131)
- More pre-flight checks in `create_cluster.sh` (#111)
- Implement (optional) harbor deployment using k8s-harbor (#139)
- Move towards gitops style cluster management (Design Doc in Docs repo PR #47 - draft)