ax51-nvme: kubectl get nodes reports x509: certificate has expired or is not yet valid
Closed this issue · 15 comments
Issue
We can no longer access the Kubernetes cluster on Hetzner's VM ax51-nvme.
pass hetzner/ax51-nvme/ansible_user
root
pass hetzner/ax51-nvme/ansible_ssh_host
195.201.87.126
ssh root@195.201.87.126
alias k=kubectl
CentOS-77-64-minimal:~$ k get nodes
Unable to connect to the server: x509: certificate has expired or is not yet valid: current time 2022-06-20T10:31:49+02:00 is after 2022-05-17T12:52:17Z
This is confirmed using the following kubeadm command
kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
W0620 10:38:07.551570 23086 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 May 17, 2022 12:53 UTC   <invalid>                               no
apiserver                  May 17, 2022 12:52 UTC   <invalid>       ca                      no
apiserver-etcd-client      May 17, 2022 12:52 UTC   <invalid>       etcd-ca                 no
apiserver-kubelet-client   May 17, 2022 12:52 UTC   <invalid>       ca                      no
controller-manager.conf    May 17, 2022 12:52 UTC   <invalid>                               no
etcd-healthcheck-client    May 17, 2022 12:51 UTC   <invalid>       etcd-ca                 no
etcd-peer                  May 17, 2022 12:51 UTC   <invalid>       etcd-ca                 no
etcd-server                May 17, 2022 12:51 UTC   <invalid>       etcd-ca                 no
front-proxy-client         May 17, 2022 12:52 UTC   <invalid>       front-proxy-ca          no
scheduler.conf             May 17, 2022 12:52 UTC   <invalid>                               no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Dec 16, 2029 14:47 UTC   7y              no
etcd-ca                 Dec 16, 2029 14:47 UTC   7y              no
front-proxy-ca          Jan 19, 2031 11:55 UTC   8y              no
Solution
I suggest manually renewing the certificates: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#manual-certificate-renewal
WDYT ?
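For reference, the manual renewal described on that page could be sketched as follows. This is a minimal sketch, assuming kubeadm v1.19, where the subcommand still lives under `kubeadm alpha certs` as in the check above. The `run` helper, the `renew_control_plane_certs` function, and the `BACKUP_DIR` and `DRY_RUN` variables are hypothetical names introduced for illustration. With `DRY_RUN` at its default of 1 the commands are only printed.

```shell
# Sketch only: renew all kubeadm-managed certificates on a v1.19 control plane.
# DRY_RUN defaults to 1, so each command is printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

renew_control_plane_certs() (
  # BACKUP_DIR is a hypothetical location, not taken from the issue.
  backup_dir="${BACKUP_DIR:-/root/kubernetes-backup-$(date +%Y%m%d)}"
  run cp -a /etc/kubernetes "$backup_dir"    # keep a copy of the old PKI and kubeconfig files
  run kubeadm alpha certs renew all          # 'kubeadm certs renew all' from kubeadm 1.20 on
  run kubeadm alpha certs check-expiration   # confirm the new expiry dates
)
```

Calling `renew_control_plane_certs` prints the three commands; `DRY_RUN=0 renew_control_plane_certs` would execute them. According to the kubeadm documentation, the control plane static pods have to be restarted to pick up the renewed certificates, and `kubelet.conf` is not among the files this command renews.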
Certificate has been renewed
CentOS-77-64-minimal:~$ kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
CERTIFICATE   EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf    Jun 20, 2023 09:27 UTC   364d                                    no
NOTE: We should certainly think about either rotating the certificates automatically every year or extending their expiry date on this cluster. WDYT? @jacobdotcosta
The certificates are renewed but now kubelet isn't starting.
failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
Since bootstrap-kubelet.conf should be used only when /etc/kubernetes/kubelet.conf doesn't exist, and /etc/kubernetes/kubelet.conf does exist, fixed it with cp /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf
NOTE: We should certainly think about either rotating the certificates automatically every year or extending their expiry date on this cluster. WDYT? @jacobdotcosta
Can you create a new ticket for this point, as we should take care of it? @jacobdotcosta
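As one input for that ticket, an early warning could be as simple as a scheduled check of the certificate files. The following is a minimal sketch, not an agreed solution: the function name `certs_expiring_soon` and the 30-day default are assumptions, and it relies only on `openssl x509 -checkend`.

```shell
# Sketch only: list the *.crt files under a directory that expire within N days.
# The 30-day default is an arbitrary assumption.
certs_expiring_soon() (
  dir="${1:-/etc/kubernetes/pki}"
  days="${2:-30}"
  seconds=$((days * 86400))
  find "$dir" -name '*.crt' | sort | while read -r crt; do
    # -checkend exits non-zero when the certificate expires within $seconds
    # (files that openssl cannot parse are reported as well)
    if ! openssl x509 -checkend "$seconds" -noout -in "$crt" >/dev/null 2>&1; then
      echo "$crt"
    fi
  done
)
```

Run from cron, `certs_expiring_soon /etc/kubernetes/pki 30` prints nothing while every certificate has more than 30 days left, so any output can be treated as an alert.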
Some resources aren't accessible yet.
Although the pods are up, the snowdrop site and the team report pages aren't available.
The pods and services seem correct.
$ kubectl -n snowdrop-site get all
NAME                                         READY   STATUS    RESTARTS   AGE
pod/snowdrop-site-angular-774dd56856-bqvjh   1/1     Running   0          399d
pod/spring-boot-generator-6587865b98-rdglj   1/1     Running   0          399d

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/snowdrop-site-angular   ClusterIP   10.100.108.193   <none>        80/TCP    2y183d
service/spring-boot-generator   ClusterIP   10.106.170.102   <none>        80/TCP    2y162d
However, it's not possible to fetch the logs.
$ kubectl -n snowdrop-site logs -f snowdrop-site-angular-774dd56856-bqvjh
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
Using log level 10 on the request shows a 500 error on the fetch.
curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.19.11 (linux/amd64) kubernetes/c6a2f08" 'https://xxx.xxx.xxx.xxx:6443/api/v1/namespaces/snowdrop-site/pods/snowdrop-site-angular-774dd56856-bqvjh/log?follow=true'
I0620 15:04:13.199083 2704 round_trippers.go:444] GET https://xxx.xxx.xxx.xxx:6443/api/v1/namespaces/snowdrop-site/pods/snowdrop-site-angular-774dd56856-bqvjh/log?follow=true 500 Internal Server Error in 4 milliseconds
Node is in state NotReady
$ kubectl get nodes
NAME                   STATUS     ROLES    AGE      VERSION
centos-77-64-minimal   NotReady   master   2y184d   v1.19.11
$ kubectl describe node centos-77-64-minimal
...
Conditions:
Type                 Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
----                 ------    -----------------                 ------------------                ------              -------
NetworkUnavailable   False     Mon, 17 May 2021 11:36:17 +0200   Mon, 17 May 2021 11:36:17 +0200   FlannelIsUp         Flannel is running on this node
MemoryPressure       Unknown   Mon, 20 Jun 2022 11:09:04 +0200   Mon, 20 Jun 2022 12:03:30 +0200   NodeStatusUnknown   Kubelet stopped posting node status.
DiskPressure         Unknown   Mon, 20 Jun 2022 11:09:04 +0200   Mon, 20 Jun 2022 12:03:30 +0200   NodeStatusUnknown   Kubelet stopped posting node status.
PIDPressure          Unknown   Mon, 20 Jun 2022 11:09:04 +0200   Mon, 20 Jun 2022 12:03:30 +0200   NodeStatusUnknown   Kubelet stopped posting node status.
Ready                Unknown   Mon, 20 Jun 2022 11:09:04 +0200   Mon, 20 Jun 2022 12:03:30 +0200   NodeStatusUnknown   Kubelet stopped posting node status.
# journalctl -xef -u kubelet
...
Jun 20 18:37:07 CentOS-77-64-minimal kubelet[2206]: I0620 18:37:07.571210 2206 kubelet_node_status.go:71] Attempting to register node centos-77-64-minimal
Jun 20 18:37:07 CentOS-77-64-minimal kubelet[2206]: E0620 18:37:07.573224 2206 kubelet_node_status.go:93] Unable to register node "centos-77-64-minimal" with API server: nodes is forbidden: User "system:anonymous" cannot create resource "nodes" in API group "" at the cluster scope
...
So the problem is related to the error: API server: nodes is forbidden: User "system:anonymous" cannot create resource "nodes" in API group "" at the cluster scope. It seems that the certificate used by kubelet is associated with the anonymous role and not the admin role.
The kubelet pem file is outdated and unusable after the certificate renewal.
# ll /var/lib/kubelet/pki
total 32
drwxr-xr-x 2 root root 4096 Jun 20 2021 ./
drwx------ 9 root root 4096 May 17 2021 ../
-rw------- 1 root root 2778 Dec 19 2019 kubelet-client-2019-12-19-15-47-45.pem
-rw------- 1 root root 1131 Dec 19 2019 kubelet-client-2019-12-19-15-48-12.pem
-rw------- 1 root root 1131 Sep 2 2020 kubelet-client-2020-09-02-09-34-13.pem
-rw------- 1 root root 1082 Jun 20 2021 kubelet-client-2021-06-20-11-14-20.pem
lrwxrwxrwx 1 root root 59 Jun 20 2021 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2021-06-20-11-14-20.pem
-rw-r--r-- 1 root root 2245 Dec 19 2019 kubelet.crt
-rw------- 1 root root 1675 Dec 19 2019 kubelet.key
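To see which identity and expiry date a given pem file carries, the certificate can be decoded with openssl. This is a minimal sketch; the function name `show_client_cert` is hypothetical, and the default path is the symlink from the listing above.

```shell
# Sketch only: print the subject and the notAfter date of a client certificate.
show_client_cert() {
  # Default path: the kubelet-client-current.pem symlink from the listing.
  openssl x509 -noout -subject -enddate \
    -in "${1:-/var/lib/kubelet/pki/kubelet-client-current.pem}"
}
```

For a working kubelet client certificate, the subject normally shows the organization `system:nodes` and a common name of the form `system:node:<node-name>`; a `notAfter` date in the past would be consistent with the API server no longer recognizing the kubelet.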
A possible solution is to do the following:
You need to provide --kubelet-client-certificate=<path_to_cert> and --kubelet-client-key=<path_to_key> to your apiserver; this way the apiserver authenticates itself to the kubelet with the certificate and key pair.
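Before changing anything, it may help to check whether the API server already sets these two flags. This is a minimal sketch, assuming the kubeadm default static pod manifest at /etc/kubernetes/manifests/kube-apiserver.yaml; the function name is hypothetical.

```shell
# Sketch only: show the kubelet client certificate/key flags in the API server manifest.
apiserver_kubelet_client_flags() {
  # Default path is the kubeadm static pod manifest (an assumption for this cluster).
  grep -E -- '--kubelet-client-(certificate|key)=' \
    "${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}"
}
```

If both flags are printed and point at apiserver-kubelet-client.crt and apiserver-kubelet-client.key, the flags are already in place and the remaining question is whether those files are still valid.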
BK_DATE_STR=20220621
cd /etc/kubernetes/pki/
ls -la
mkdir _bk_${BK_DATE_STR}
mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} _bk_${BK_DATE_STR}/
ls -la
kubeadm init phase certs all --apiserver-advertise-address <API SERVER IP ADDRESS>
ls -la
cd /etc/kubernetes/
mkdir _bk_${BK_DATE_STR}
mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} _bk_${BK_DATE_STR}/
kubeadm init phase kubeconfig all
ls -la
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
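After regenerating the kubeconfig files, the node status can be polled until it reports Ready. The issue does not say how the kubelet picked up the new files, so a preceding `systemctl restart kubelet` is an assumption here. The function name `wait_node_ready` and the `POLL_SECONDS` variable are hypothetical.

```shell
# Sketch only: poll until the node's Ready condition is True or the attempts run out.
wait_node_ready() (
  node="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(kubectl get node "$node" \
      -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null)
    [ "$status" = "True" ] && exit 0   # leaves the subshell, so the function returns 0
    i=$((i + 1))
    sleep "${POLL_SECONDS:-10}"
  done
  exit 1
)
```

For example, `wait_node_ready centos-77-64-minimal 30` returns 0 as soon as the node is Ready and 1 if it is still not Ready after 30 polls.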
As a result, the node is now Ready.
$ kc get nodes
NAME                   STATUS   ROLES    AGE      VERSION
centos-77-64-minimal   Ready    master   2y184d   v1.19.11