Provisioning a cluster on Hetzner with Debian 12 images fails
/kind bug
1. What kops version are you running? The command kops version will display this information.
1.28.4
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
Client Version: v1.28.1
Server Version: v1.28.6
3. What cloud provider are you using?
Hetzner
4. What commands did you run? What is the simplest way to reproduce this issue?
kops create cluster --name=test-cluster \
  --ssh-public-key=~/.ssh/kops_rsa.pub --cloud=hetzner --node-count=3 --zones=nbg1 \
  --image=debian-12 --control-plane-count=3 --networking=cilium --network-cidr=10.10.0.0/16 \
  --node-size cx21
5. What happened after the commands executed?
The command completed successfully; however, the control plane was not able to start properly. The kubelet service was failing repeatedly with the following message:
"Could not open resolv conf file." err="open /run/systemd/resolve/resolv.conf: no such file or directory"
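The missing file can be confirmed directly on an affected control-plane node. A minimal check (the systemd-resolved path is taken from the log above; the glibc fallback path is an assumption on my part):

```shell
# kubelet is looking for the systemd-resolved stub resolv.conf, but
# Debian 12 cloud images do not ship with systemd-resolved enabled.
resolv_conf=/run/systemd/resolve/resolv.conf
if [ ! -e "$resolv_conf" ]; then
  # Fall back to the regular glibc resolver file, which does exist.
  resolv_conf=/etc/resolv.conf
fi
echo "usable resolv.conf: $resolv_conf"
```

On these images the first branch never matches, which is exactly why kubelet fails with ENOENT.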
6. What did you expect to happen?
I expected the control-plane nodes to start, so that the cluster could be used.
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2024-04-17T14:03:00Z"
  name: test-cluster
spec:
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: hetzner
  configBase: s3://test-cluster/test-cluster
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: control-plane-nbg1-1
      name: etcd-1
    - instanceGroup: control-plane-nbg1-2
      name: etcd-2
    - instanceGroup: control-plane-nbg1-3
      name: etcd-3
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: control-plane-nbg1-1
      name: etcd-1
    - instanceGroup: control-plane-nbg1-2
      name: etcd-2
    - instanceGroup: control-plane-nbg1-3
      name: etcd-3
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeProxy:
    enabled: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  - ::/0
  kubernetesVersion: 1.28.6
  networkCIDR: 10.10.0.0/16
  networking:
    cilium:
      enableNodePort: true
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  - ::/0
  subnets:
  - name: nbg1
    type: Public
    zone: nbg1
  topology:
    dns:
      type: None
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-04-17T14:03:00Z"
  labels:
    kops.k8s.io/cluster: test-cluster
  name: control-plane-nbg1-1
spec:
  image: debian-12
  machineType: cx21
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - nbg1
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-04-17T14:03:00Z"
  labels:
    kops.k8s.io/cluster: test-cluster
  name: control-plane-nbg1-2
spec:
  image: debian-12
  machineType: cx21
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - nbg1
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-04-17T14:03:00Z"
  labels:
    kops.k8s.io/cluster: test-cluster
  name: control-plane-nbg1-3
spec:
  image: debian-12
  machineType: cx21
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - nbg1
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-04-17T14:03:00Z"
  labels:
    kops.k8s.io/cluster: test-cluster
  name: nodes-nbg1
spec:
  image: debian-12
  machineType: cx21
  maxSize: 3
  minSize: 3
  role: Node
  subnets:
  - nbg1
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
Not needed, because this is not related to a kops command; the kops create command was successful. The issue happens at the level of the control-plane nodes. An extract of the logs:
kubelet[3014]: I0414 14:42:31.100802 3014 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="kube-system/kube-apiserver-control-plane-fsn1-1-636428496a76dc30"
Apr 14 14:42:31 control-plane-fsn1-1-636428496a76dc30 kubelet[3014]: E0414 14:42:31.100877 3014 dns.go:284] "Could not open resolv conf file." err="open /run/systemd/resolve/resolv.conf: no such file or directory"
Apr 14 14:42:31 control-plane-fsn1-1-636428496a76dc30 kubelet[3014]: E0414 14:42:31.100911 3014 kuberuntime_sandbox.go:45] "Failed to generate sandbox config for pod" err="open /run/systemd/resolve/resolv.conf: no such file or directory" pod="kube-system/kube-apiserver-control-plane-fsn1-1-636428496a76dc30"
Apr 14 14:42:31 control-plane-fsn1-1-636428496a76dc30 kubelet[3014]: E0414 14:42:31.100947 3014 kuberuntime_manager.go:1171] "CreatePodSandbox for pod failed" err="open /run/systemd/resolve/resolv.conf: no such file or directory" pod="kube-system/kube
9. Anything else do we need to know?
Maybe this check needs to be adjusted here
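As a possible workaround until that check is adjusted, kubelet can be pointed at the regular resolver file explicitly. This is a sketch based on the resolvConf field of the kops kubelet spec; I have not verified it on Hetzner:

```yaml
# Cluster spec fragment: force kubelet's --resolv-conf to the glibc file,
# since Debian 12 cloud images do not run systemd-resolved.
spec:
  kubelet:
    anonymousAuth: false
    resolvConf: /etc/resolv.conf
```

Alternatively, installing and enabling systemd-resolved on the nodes would make the path kubelet expects exist.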