k0sctl init doesn't come to a working cluster
RobertMirantis opened this issue · 5 comments
k0sctl init > cluster.yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
  - ssh:
      address: 10.0.0.1
      user: root
      port: 22
      keyPath: null
    role: controller
  - ssh:
      address: 10.0.0.2
      user: root
      port: 22
      keyPath: null
    role: worker
  k0s:
    version: 1.28.5+k0s.0
    dynamicConfig: false
The generated template is missing all the k0s configuration pieces.
The cluster gets created successfully (on AWS Ubuntu images), but when you deploy a simple nginx workload, the nginx pods start successfully and yet you can't connect to them or see their logs.
It looks promising but doesn't actually operate (at all!)
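For comparison, generating the template with the --k0s flag embeds a complete k0s config section (the full output of that section is shown in the final remarks below):

k0sctl init --k0s > cluster.yaml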
STEP 1: Create a template file
[ec2-user@ip-172-31-21-103 scripts]$ k0sctl init > cluster2.yaml
[ec2-user@ip-172-31-21-103 scripts]$ cat cluster2.yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
  - ssh:
      address: 10.0.0.1
      user: root
      port: 22
      keyPath: null
    role: controller
  - ssh:
      address: 10.0.0.2
      user: root
      port: 22
      keyPath: null
    role: worker
  k0s:
    version: 1.28.5+k0s.0
    dynamicConfig: false
STEP 2: Fill in the addresses, keyPath and user.
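For example, a filled-in host entry might look like this (the address is one of the controllers from the log below; the user and keyPath are placeholders for a stock AWS Ubuntu setup, not values taken from this report):

- ssh:
    address: 172.31.20.207
    user: ubuntu
    port: 22
    keyPath: ~/.ssh/my-aws-key.pem
  role: controller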
STEP 3: Run k0sctl
[ec2-user@ip-172-31-21-103 scripts]$ k0sctl apply --config cluster2.yaml
⠀⣿⣿⡇⠀⠀⢀⣴⣾⣿⠟⠁⢸⣿⣿⣿⣿⣿⣿⣿⡿⠛⠁⠀⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀█████████ █████████ ███
⠀⣿⣿⡇⣠⣶⣿⡿⠋⠀⠀⠀⢸⣿⡇⠀⠀⠀⣠⠀⠀⢀⣠⡆⢸⣿⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀███ ███ ███
⠀⣿⣿⣿⣿⣟⠋⠀⠀⠀⠀⠀⢸⣿⡇⠀⢰⣾⣿⠀⠀⣿⣿⡇⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀███ ███ ███
⠀⣿⣿⡏⠻⣿⣷⣤⡀⠀⠀⠀⠸⠛⠁⠀⠸⠋⠁⠀⠀⣿⣿⡇⠈⠉⠉⠉⠉⠉⠉⠉⠉⢹⣿⣿⠀███ ███ ███
⠀⣿⣿⡇⠀⠀⠙⢿⣿⣦⣀⠀⠀⠀⣠⣶⣶⣶⣶⣶⣶⣿⣿⡇⢰⣶⣶⣶⣶⣶⣶⣶⣶⣾⣿⣿⠀█████████ ███ ██████████
k0sctl v0.15.4 Copyright 2023, k0sctl authors.
Anonymized telemetry of usage will be sent to the authors.
By continuing to use k0sctl you agree to these terms:
https://k0sproject.io/licenses/eula
WARN An old cache directory still exists at /home/ec2-user/.k0sctl/cache, k0sctl now uses /home/ec2-user/.cache/k0sctl
INFO ==> Running phase: Connect to hosts
INFO [ssh] 172.31.26.180:22: connected
INFO [ssh] 172.31.23.3:22: connected
INFO [ssh] 172.31.21.101:22: connected
INFO [ssh] 172.31.20.207:22: connected
INFO [ssh] 172.31.23.87:22: connected
INFO [ssh] 172.31.22.124:22: connected
INFO ==> Running phase: Detect host operating systems
INFO [ssh] 172.31.20.207:22: is running Ubuntu 18.04.5 LTS
INFO [ssh] 172.31.23.87:22: is running Ubuntu 18.04.5 LTS
INFO [ssh] 172.31.23.3:22: is running Ubuntu 18.04.5 LTS
INFO [ssh] 172.31.22.124:22: is running Ubuntu 18.04.5 LTS
INFO [ssh] 172.31.26.180:22: is running Ubuntu 18.04.5 LTS
INFO [ssh] 172.31.21.101:22: is running Ubuntu 18.04.5 LTS
INFO ==> Running phase: Acquire exclusive host lock
INFO ==> Running phase: Prepare hosts
INFO ==> Running phase: Gather host facts
INFO [ssh] 172.31.23.87:22: using ip-172-31-23-87 as hostname
INFO [ssh] 172.31.26.180:22: using ip-172-31-26-180 as hostname
INFO [ssh] 172.31.22.124:22: using ip-172-31-22-124 as hostname
INFO [ssh] 172.31.23.3:22: using ip-172-31-23-3 as hostname
INFO [ssh] 172.31.21.101:22: using ip-172-31-21-101 as hostname
INFO [ssh] 172.31.20.207:22: using ip-172-31-20-207 as hostname
INFO [ssh] 172.31.22.124:22: discovered ens5 as private interface
INFO [ssh] 172.31.23.3:22: discovered ens5 as private interface
INFO [ssh] 172.31.23.87:22: discovered ens5 as private interface
INFO [ssh] 172.31.20.207:22: discovered ens5 as private interface
INFO [ssh] 172.31.26.180:22: discovered ens5 as private interface
INFO [ssh] 172.31.21.101:22: discovered ens5 as private interface
INFO ==> Running phase: Validate hosts
INFO ==> Running phase: Gather k0s facts
INFO ==> Running phase: Validate facts
INFO ==> Running phase: Configure k0s
WARN [ssh] 172.31.20.207:22: generating default configuration
INFO [ssh] 172.31.23.3:22: validating configuration
INFO [ssh] 172.31.20.207:22: validating configuration
INFO [ssh] 172.31.22.124:22: validating configuration
INFO [ssh] 172.31.23.3:22: configuration was changed
INFO [ssh] 172.31.20.207:22: configuration was changed
INFO [ssh] 172.31.22.124:22: configuration was changed
INFO ==> Running phase: Initialize the k0s cluster
INFO [ssh] 172.31.20.207:22: installing k0s controller
INFO [ssh] 172.31.20.207:22: waiting for the k0s service to start
INFO [ssh] 172.31.20.207:22: waiting for kubernetes api to respond
INFO ==> Running phase: Install controllers
INFO [ssh] 172.31.20.207:22: generating token
INFO [ssh] 172.31.23.3:22: writing join token
INFO [ssh] 172.31.23.3:22: installing k0s controller
INFO [ssh] 172.31.23.3:22: starting service
INFO [ssh] 172.31.23.3:22: waiting for the k0s service to start
INFO [ssh] 172.31.23.3:22: waiting for kubernetes api to respond
INFO [ssh] 172.31.20.207:22: generating token
INFO [ssh] 172.31.22.124:22: writing join token
INFO [ssh] 172.31.22.124:22: installing k0s controller
INFO [ssh] 172.31.22.124:22: starting service
INFO [ssh] 172.31.22.124:22: waiting for the k0s service to start
INFO [ssh] 172.31.22.124:22: waiting for kubernetes api to respond
INFO ==> Running phase: Install workers
INFO [ssh] 172.31.23.87:22: validating api connection to https://172.31.20.207:6443
INFO [ssh] 172.31.21.101:22: validating api connection to https://172.31.20.207:6443
INFO [ssh] 172.31.26.180:22: validating api connection to https://172.31.20.207:6443
INFO [ssh] 172.31.20.207:22: generating token
INFO [ssh] 172.31.23.87:22: writing join token
INFO [ssh] 172.31.26.180:22: writing join token
INFO [ssh] 172.31.21.101:22: writing join token
INFO [ssh] 172.31.26.180:22: installing k0s worker
INFO [ssh] 172.31.21.101:22: installing k0s worker
INFO [ssh] 172.31.23.87:22: installing k0s worker
INFO [ssh] 172.31.26.180:22: starting service
INFO [ssh] 172.31.26.180:22: waiting for node to become ready
INFO [ssh] 172.31.21.101:22: starting service
INFO [ssh] 172.31.23.87:22: starting service
INFO [ssh] 172.31.21.101:22: waiting for node to become ready
INFO [ssh] 172.31.23.87:22: waiting for node to become ready
INFO ==> Running phase: Release exclusive host lock
INFO ==> Running phase: Disconnect from hosts
INFO ==> Finished in 1m4s
INFO k0s cluster version v1.28.4+k0s.0 is now installed
INFO Tip: To access the cluster you can now fetch the admin kubeconfig using:
INFO k0sctl kubeconfig
--> SEE HOW EVERYTHING LOOKS BEAUTIFUL
STEP 4: Get a kubeconfig
k0sctl kubeconfig --config cluster2.yaml
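k0sctl kubeconfig prints the admin kubeconfig to stdout, so you'd typically redirect it to a file and point kubectl at it (the file name here is just an example):

k0sctl kubeconfig --config cluster2.yaml > kubeconfig
export KUBECONFIG=$PWD/kubeconfig
kubectl get nodes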
STEP 5: Look into the cluster
[ec2-user@ip-172-31-21-103 scripts]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                              READY   STATUS             RESTARTS        AGE
kube-system   coredns-85df575cdb-b2mp4          0/1     Running            5 (65s ago)     8m49s
kube-system   coredns-85df575cdb-l6z8f          0/1     Running            5 (28s ago)     8m43s
kube-system   konnectivity-agent-64scx          1/1     Running            6 (2m4s ago)    8m42s
kube-system   konnectivity-agent-n6gvz          1/1     Running            6 (2m2s ago)    8m48s
kube-system   konnectivity-agent-q7hfn          1/1     Running            6               8m48s
kube-system   kube-proxy-hkzd2                  1/1     Running            0               8m48s
kube-system   kube-proxy-m8lxn                  1/1     Running            0               8m42s
kube-system   kube-proxy-mqg9w                  1/1     Running            0               8m48s
kube-system   kube-router-dxjdv                 1/1     Running            0               8m48s
kube-system   kube-router-nplz2                 1/1     Running            0               8m48s
kube-system   kube-router-wvhzr                 1/1     Running            0               8m41s
kube-system   metrics-server-7556957bb7-lpqr7   0/1     CrashLoopBackOff   6 (2m26s ago)   8m49s
Not working...
STEP 6: Deploy a simple test workload (if needed, since the above already shows no good news)
[ec2-user@ip-172-31-21-103 scripts]$ kubectl create deployment mydep --image=nginx --replicas=3
deployment.apps/mydep created
[ec2-user@ip-172-31-21-103 scripts]$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
mydep-66c55fb688-cbkgl   1/1     Running   0          10s
mydep-66c55fb688-nbhtg   1/1     Running   0          10s
mydep-66c55fb688-pbjpl   1/1     Running   0          10s
Again, looks promising.
[ec2-user@ip-172-31-21-103 scripts]$ kubectl logs mydep-66c55fb688-cbkgl
Error from server: Get "https://172.31.21.101:10250/containerLogs/default/mydep-66c55fb688-cbkgl/nginx": No agent available
[ec2-user@ip-172-31-21-103 scripts]$ kubectl exec mydep-66c55fb688-cbkgl -it -- bash
Error from server: error dialing backend: No agent available
So the pods show as Running, but they don't actually work!
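Note that kubectl logs and kubectl exec are proxied from the API server to the kubelet through the konnectivity tunnel, while kubectl get and kubectl describe only talk to the API server. So even with "No agent available", describing the pods still works and at least shows their events, for example:

kubectl describe pod mydep-66c55fb688-cbkgl
kubectl describe pod -n kube-system konnectivity-agent-64scx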
FINAL REMARKS:
- This is NOT an issue when the k0sctl config file is created with the --k0s option.
- The difference in outcome comes down to which instructions you follow: https://docs.k0sproject.io/v1.28.5+k0s.0/k0sctl-install/ versus https://github.com/k0sproject/k0sctl/blob/main/README.md#installation.
- Why is there no check at the end of a k0sctl run that the cluster is actually working? (A sketch of such a check follows this list.)
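In the meantime, a minimal post-apply smoke test is easy to script by hand; a sketch of what such a check could look like (this is not an existing k0sctl feature):

kubectl wait --for=condition=Ready nodes --all --timeout=120s
kubectl wait --for=condition=Ready pods --all -n kube-system --timeout=180s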
For reference, here is what the k0s section looks like with --k0s:
k0s:
  config:
    apiVersion: k0s.k0sproject.io/v1beta1
    kind: Cluster
    metadata:
      name: k0s
    spec:
      api:
        k0sApiPort: 9443
        port: 6443
      installConfig:
        users:
          etcdUser: etcd
          kineUser: kube-apiserver
          konnectivityUser: konnectivity-server
          kubeAPIserverUser: kube-apiserver
          kubeSchedulerUser: kube-scheduler
      konnectivity:
        adminPort: 8133
        agentPort: 8132
      network:
        kubeProxy:
          disabled: false
          mode: iptables
        kuberouter:
          autoMTU: true
          mtu: 0
          peerRouterASNs: ""
          peerRouterIPs: ""
        podCIDR: 10.244.0.0/16
        provider: kuberouter
        serviceCIDR: 10.96.0.0/12
      podSecurityPolicy:
        defaultPolicy: 00-k0s-privileged
      storage:
        type: etcd
      telemetry:
        enabled: true
When k0s.config is not set, k0sctl will use the output of k0s config create to create a default config.
This would then mean that the default config of k0s does not work, which I don't think is likely.
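To see exactly what that default config contains, you can run the same command on one of the controllers:

k0s config create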
There seems to be some networking problem. The konnectivity agents have trouble connecting to the k0s controller, and I guess CoreDNS and metrics-server are failing for similar reasons. Can you check the pod logs directly on the worker node? They should be in /var/log/containers/konnectivity-agent-*.log, /var/log/containers/coredns-*.log, and so on.
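For example, something along these lines on one of the workers (a sketch, using the paths above; sudo sh -c keeps the glob expansion inside the root shell since the log directory may not be readable by a normal user):

sudo sh -c 'tail -n 50 /var/log/containers/konnectivity-agent-*.log'
sudo sh -c 'tail -n 50 /var/log/containers/coredns-*.log'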
Something has changed. The same Terraform scripts (running k0sctl init without --k0s) are now working fine.