k0smotron and Cluster API Provider OpenStack (CAPO) don't work together
michaelbayr opened this issue · 6 comments
Hey everyone,
thanks for this awesome project! I was curious to test k0smotron as a control plane for OpenStack workers. I checked out the AWS/Hetzner examples from the Mirantis resources and tried to come up with a cluster.yaml to apply.
I am running CAPI 1.6.0, CAPO 0.8.0 and k0smotron 0.7.2.
The placeholders in braces, e.g. {b64CERT}, are just redacted credentials.
When I apply the following YAML, the k0smotron control plane spawns, allocates a load balancer and is accessible via kubeconfig. However, the OpenStack worker machine is never spawned. The credentials are correct, since CAPO logs the successful check, but CAPO never reaches the point where it actually boots up any machines. It just sits there without any errors and waits.
The output of clusterctl:
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/testcluster1 False Info WaitingForInfrastructure 42m
├─ClusterInfrastructure - OpenStackCluster/testcluster1
├─ControlPlane - K0smotronControlPlane/testcluster1
└─Workers
└─MachineDeployment/testcluster1-md-0 False Warning WaitingForAvailableMachines 2h Minimum availability requires 1 replicas, current 0 available
└─Machine/testcluster1-md-0-vmnk4-k29bw False Info WaitingForClusterInfrastructure 2h 1 of 2 completed
└─BootstrapConfig - K0sWorkerConfig/testcluster1-md-0-5qmzp
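(Side note for anyone debugging a similar hang: the infrastructure object's status usually shows what CAPI is waiting on. A quick way to inspect it, using the resource names from the manifest below:

kubectl get openstackcluster testcluster1 -n testcluster -o yaml
# look at status.ready and any conditions/failure messages
)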
The posted YAML:
apiVersion: v1
kind: Namespace
metadata:
name: testcluster # Namespace in which we want to deploy the child cluster
spec: {}
status: {}
---
apiVersion: v1
data:
cacert: {b64CERT}
clouds.yaml: {b64CLOUDS}
kind: Secret
metadata:
labels:
clusterctl.cluster.x-k8s.io/move: "true"
name: testcluster1-cloud-config
namespace: testcluster
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
name: testcluster1 # Cluster Name
namespace: testcluster
spec:
clusterNetwork:
pods:
cidrBlocks: [10.244.0.0/16]
services:
cidrBlocks: [10.96.0.0/12]
controlPlaneRef:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0smotronControlPlane
name: testcluster1 # Reference to the k0s control plane we are creating in the management cluster
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
    name: testcluster1 # Reference to the OpenStackCluster object that triggers the CAPO controller
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0smotronControlPlane
metadata:
name: testcluster1 # Cluster Name
namespace: testcluster
spec:
k0sVersion: v1.27.8-k0s.0 # https://github.com/k0sproject/k0s/releases
persistence:
type: emptyDir
service:
type: LoadBalancer
apiPort: 6443
konnectivityPort: 8132
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
metadata:
name: testcluster1
namespace: testcluster
annotations:
cluster.x-k8s.io/managed-by: k0smotron
spec:
apiServerLoadBalancer:
enabled: false
cloudName: {CLOUDNAME}
dnsNameservers:
- {NAMESERVERS}
externalNetworkId: {EXTERNALNET}
identityRef:
kind: Secret
name: testcluster1-cloud-config
managedSecurityGroups: true
nodeCidr: 10.8.0.0/20
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
name: testcluster1-md-0
namespace: testcluster
spec:
clusterName: testcluster1
replicas: 1
selector:
matchLabels: null
template:
spec:
bootstrap:
configRef:
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: K0sWorkerConfigTemplate
name: testcluster1-md-0
clusterName: testcluster1
failureDomain: {AZ}
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachineTemplate
name: testcluster1-md-0
version: v1.27.8
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachineTemplate
metadata:
name: testcluster1-md-0
namespace: testcluster
spec:
template:
spec:
cloudName: {CLOUDNAME}
flavor: {FLAVOR}
identityRef:
kind: Secret
name: testcluster1-cloud-config
image: {IMAGE}
sshKeyName: {KEYNAME}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: K0sWorkerConfigTemplate
metadata:
name: testcluster1-md-0
namespace: testcluster
spec:
template:
spec:
version: v1.27.8+k0s.0
Any pointers/help would be greatly appreciated!
However, the OpenStack worker machine is never spawned. The credentials are correct, since CAPO logs the successful check, but CAPO never reaches the point where it actually boots up any machines. It just sits there without any errors and waits.
This sounds like it's waiting on some object's status to become ready.
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
metadata:
name: testcluster1
namespace: testcluster
annotations:
    cluster.x-k8s.io/managed-by: k0smotron # <-- this is a bit of a suspect for me
I haven't used the CAPO provider, so I'm not sure if you HAVE to use that annotation. In the case of AWS we use it to prevent CAPA from provisioning things it doesn't need to, since CAPA does not expose options to disable all of its provisioning, e.g. the control plane LB.
What this annotation also means (AFAIK) is that CAPO will NOT provision anything on the infra. It also has the side effect that the OpenStackCluster/testcluster1 object never reaches ready status. In the AWS case it's the same, so one needs to manually patch it ready:
kubectl patch OpenStackCluster testcluster1 -n testcluster --type=merge --subresource status --patch 'status: {ready: true}'
Thank you so much for your input. You were right: the cluster.x-k8s.io/managed-by: k0smotron annotation did stop CAPO from provisioning any machines. After removing the line, CAPO successfully schedules the machines and they join the cluster. A big step forward!
Sadly, there still is a small issue. The clusterctl output stays in WaitingForAvailableMachines, although everything seems to be provisioned fine. Usually this message indicates that no CNI is installed in the cluster. As far as I am aware, k0s, and therefore k0smotron, should ship with a CNI out of the box (kube-router by default, with Calico as an option)? The kube-router pod is started and running on the worker node. Any idea why this happens?
clusterctl describe cluster testcluster1 -n testcluster
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/testcluster1 True 16m
├─ClusterInfrastructure - OpenStackCluster/testcluster1
├─ControlPlane - K0smotronControlPlane/testcluster1
└─Workers
└─MachineDeployment/testcluster1-md-0 False Warning WaitingForAvailableMachines 18m Minimum availability requires 1 replicas, current 0 available
└─Machine/testcluster1-md-0-j9jh2-f2292 True 15m
└─BootstrapConfig - K0sWorkerConfig/testcluster1-md-0-xkj9w
For anybody finding this issue later on: here is the updated cluster-template.yaml. Note that I had to extend the OpenStackCluster spec with an empty apiServerFixedIP,
because the deployment would fail without it (k0smotron tries to patch the LoadBalancer IP into the cluster, but the spec field is immutable).
apiVersion: v1
kind: Namespace
metadata:
name: testcluster # Namespace in which we want to deploy the child cluster
spec: {}
status: {}
---
apiVersion: v1
data:
cacert: {b64CERT}
clouds.yaml: {b64CLOUDS}
kind: Secret
metadata:
labels:
clusterctl.cluster.x-k8s.io/move: "true"
name: testcluster1-cloud-config
namespace: testcluster
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
name: testcluster1 # Cluster Name
namespace: testcluster
spec:
clusterNetwork:
pods:
cidrBlocks: [10.244.0.0/16]
services:
cidrBlocks: [10.96.0.0/12]
controlPlaneRef:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0smotronControlPlane
name: testcluster1 # Reference to the k0s control plane we are creating in the management cluster
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
    name: testcluster1 # Reference to the OpenStackCluster object that triggers the CAPO controller
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0smotronControlPlane
metadata:
name: testcluster1 # Cluster Name
namespace: testcluster
spec:
k0sVersion: v1.27.8-k0s.0 # https://github.com/k0sproject/k0s/releases
persistence:
type: emptyDir
service:
type: LoadBalancer
apiPort: 6443
konnectivityPort: 8132
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
metadata:
name: testcluster1
namespace: testcluster
spec:
apiServerLoadBalancer:
enabled: false
disableAPIServerFloatingIP: true
apiServerFixedIP: ""
cloudName: {CLOUDNAME}
dnsNameservers:
- {NAMESERVERS}
externalNetworkId: {EXTERNALNET}
identityRef:
kind: Secret
name: testcluster1-cloud-config
managedSecurityGroups: true
nodeCidr: 10.8.0.0/20
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
name: testcluster1-md-0
namespace: testcluster
spec:
clusterName: testcluster1
replicas: 1
selector:
matchLabels: null
template:
spec:
bootstrap:
configRef:
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: K0sWorkerConfigTemplate
name: testcluster1-md-0
clusterName: testcluster1
failureDomain: {AZ}
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachineTemplate
name: testcluster1-md-0
version: v1.27.8
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachineTemplate
metadata:
name: testcluster1-md-0
namespace: testcluster
spec:
template:
spec:
cloudName: {CLOUDNAME}
flavor: {FLAVOR}
identityRef:
kind: Secret
name: testcluster1-cloud-config
image: {IMAGE}
sshKeyName: {KEYNAME}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: K0sWorkerConfigTemplate
metadata:
name: testcluster1-md-0
namespace: testcluster
spec:
template:
spec:
version: v1.27.8+k0s.0
Did I understand correctly that the machine is up and running, but the machine deployment never gets fully ready? If that is the case, then setting up a cloud controller manager (CCM) in the child cluster should fix it. Unfortunately it is 100% required by CAPI itself, as CAPI uses the CCM-set provider ID to map the child cluster node object to the machine object. This is also not so well documented in the CAPI docs.
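A quick way to verify the mapping (assuming the child cluster kubeconfig has been saved locally as child.kubeconfig, a hypothetical filename) is to compare the two provider IDs, which must match:

# providerID the CCM sets on the child cluster node
kubectl --kubeconfig child.kubeconfig get nodes -o jsonpath='{.items[*].spec.providerID}'
# providerID on the Machine object in the management cluster
kubectl get machines -n testcluster -o jsonpath='{.items[*].spec.providerID}'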
Thanks again for your help!
Yeah, I thought that this might be the issue. However, I have not yet figured out how to install the OpenStack cloud controller manager into the child cluster.
Since k0smotron runs the control plane in containers, the CCM cannot run on a control plane node, which is where it usually runs. Should it instead just run on the worker nodes?
The other issue I found was the preparation of the workers. To be able to use the CCM, the workers have to be started with certain install flags (see the k0s docs here: https://docs.k0sproject.io/v1.28.4+k0s.0/cloud-providers/#enable-cloud-provider-support-in-kubelet). However, I could not find any pointers on how to configure that from within Cluster API (e.g. in the bootstrap config).
Is there any documentation on installing a CCM into a child cluster?
Here's an example with another, non-OpenStack CCM:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0smotronControlPlane
metadata:
name: autoscaler-test
namespace: autoscaler-test
spec:
k0sVersion: v1.27.4-k0s.0
persistence:
type: pvc
persistentVolumeClaim:
spec:
storageClassName: hcloud-volumes
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 50Gi
service:
type: LoadBalancer
apiPort: 6443
konnectivityPort: 8132
annotations:
load-balancer.hetzner.cloud/location: fsn1
k0sConfig:
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
spec:
extensions:
helm:
repositories:
- name: hcloud
url: https://charts.hetzner.cloud
charts:
- name: hccm
chartName: hcloud/hcloud-cloud-controller-manager
namespace: kube-system
version: v1.18.0
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: K0sWorkerConfigTemplate
metadata:
name: as-hetzner-mc
namespace: autoscaler-test
spec:
template:
spec:
version: v1.27.4+k0s.0
args:
- --enable-cloud-provider
- --kubelet-extra-args="--cloud-provider=external"
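For OpenStack I haven't tested it myself, but the same helm extension pattern should apply with the upstream chart from https://kubernetes.github.io/cloud-provider-openstack, combined with the same worker args as above. A rough, untested sketch (the repo alias and release name are placeholders, and the chart values still need to be filled in per the chart docs):

k0sConfig:
  apiVersion: k0s.k0sproject.io/v1beta1
  kind: ClusterConfig
  spec:
    extensions:
      helm:
        repositories:
        - name: cpo # placeholder alias
          url: https://kubernetes.github.io/cloud-provider-openstack
        charts:
        - name: openstack-ccm # placeholder release name
          chartName: cpo/openstack-cloud-controller-manager
          namespace: kube-system
          # pin a chart version here, and supply values for the cloud.conf
          # secret plus a nodeSelector/tolerations override so the CCM can
          # run on workers (the k0smotron control plane has no nodes)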
@jnummelin thank you so much for your support.
With your help and the YAML examples I managed to get the CCM installed, and indeed the cluster looks to be completely healthy now! This makes k0smotron a strong contender for us.