No error message in log when ccm does not set the Provider ID?
Closed this issue · 2 comments
Hi @sp-yduck,
First of all, really nice projects. I follow every commit in your Proxmox projects via e-mail 🤓.
At the moment I only use your software. I would also like to dig into Go, and into your code in particular, but right now I don't have the time or motivation to learn that much new stuff.
Today I tried to get your newest version up and running, but I have the problem that the CCM does not set the provider ID in node.spec.providerID, even with a CNI and the CCM installed. I tried different CNIs: I couldn't get Cilium to work ("Load overlay network failed" error="program cil_from_overlay: replacing clsact qdisc for interface cilium_vxlan: operation not supported"), but Calico (good old friend) worked.
The node is healthy and all pods are up (CoreDNS was not, because of the node.cloudprovider.kubernetes.io/uninitialized taint; after I removed the taint manually, CoreDNS also came up). The CCM log also looks okay, but the providerID under node.spec is still missing.
Below are some snippets of my setup.
I would expect a message in the CCM log if the CCM is not able to set the providerID (or remove the taint).
Is this maybe a hint?
$ k --kubeconfig=kubeconfig.yaml logs -n kube-system kube-controller-manager-cappx-test-controlplane-gj5b9 | grep err
I0802 19:09:40.753236 1 resource_quota_monitor.go:223] "QuotaMonitor created object count evaluator" resource="controllerrevisions.apps"
I0802 19:09:50.872722 1 controllermanager.go:638] "Started controller" controller="clusterrole-aggregation"
I0802 19:09:50.872868 1 clusterroleaggregation_controller.go:189] "Starting ClusterRoleAggregator controller"
I0802 19:09:50.888903 1 actual_state_of_world.go:547] "Failed to update statusUpdateNeeded field in actual state of world" err="Failed to set statusUpdateNeeded to needed true, because nodeName=\"cappx-test-controlplane-gj5b9\" does not exist"
Some snippets of the status and logs of the cluster
$ k --kubeconfig=kubeconfig.yaml get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-68df4c59b7-przqw 1/1 Running 0 2m56s
kube-system calico-node-l25bf 1/1 Running 0 5m59s
kube-system coredns-5d78c9869d-lqhp6 0/1 Pending 0 10m
kube-system coredns-5d78c9869d-pw2q8 0/1 Pending 0 10m
kube-system etcd-cappx-test-controlplane-gj5b9 1/1 Running 1 (11m ago) 11m
kube-system kube-apiserver-cappx-test-controlplane-gj5b9 1/1 Running 1 (11m ago) 11m
kube-system kube-controller-manager-cappx-test-controlplane-gj5b9 1/1 Running 2 (5m7s ago) 11m
kube-system kube-proxy-jp476 1/1 Running 0 10m
kube-system kube-scheduler-cappx-test-controlplane-gj5b9 1/1 Running 2 (5m7s ago) 11m
kube-system kube-vip-cappx-test-controlplane-gj5b9 1/1 Running 2 (5m8s ago) 11m
$ k --kubeconfig=kubeconfig.yaml get node cappx-test-controlplane-gj5b9 -o=jsonpath={.spec}
{"podCIDR":"10.244.0.0/24","podCIDRs":["10.244.0.0/24"],"taints":[{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane"},{"effect":"NoSchedule","key":"node.cloudprovider.kubernetes.io/uninitialized","value":"true"}]}
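The two symptoms described in this issue (no providerID, and the uninitialized taint still on the node) can be checked in one step. The following is only a minimal sketch: `check_node_spec` is a hypothetical helper name, not part of any project mentioned here, and it uses plain `grep` on the JSON text rather than a real JSON parser.

```shell
# Hypothetical helper: reads a node's .spec JSON on stdin and reports whether
# a cloud provider appears to have initialized the node.
# Returns 0 only if providerID is set and the uninitialized taint is gone.
check_node_spec() {
  spec=$(cat)
  rc=0
  # -F treats the pattern as a fixed string, so dots are matched literally.
  if printf '%s' "$spec" | grep -qF '"providerID"'; then
    echo "providerID: set"
  else
    echo "providerID: missing"
    rc=1
  fi
  if printf '%s' "$spec" | grep -qF 'node.cloudprovider.kubernetes.io/uninitialized'; then
    echo "uninitialized taint: present"
    rc=1
  else
    echo "uninitialized taint: absent"
  fi
  return "$rc"
}

# Usage against the node from this issue:
# kubectl --kubeconfig=kubeconfig.yaml get node cappx-test-controlplane-gj5b9 -o jsonpath='{.spec}' | check_node_spec
```

For the spec shown above, this would report the providerID as missing and the taint as present, which matches the behaviour described in the issue.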
$ k --kubeconfig=kubeconfig.yaml get node
NAME STATUS ROLES AGE VERSION
cappx-test-controlplane-gj5b9 Ready control-plane 18m v1.27.3
$ k --kubeconfig=kubeconfig.yaml get events
LAST SEEN TYPE REASON OBJECT MESSAGE
20m Normal Starting node/cappx-test-controlplane-gj5b9 Starting kubelet.
20m Warning InvalidDiskCapacity node/cappx-test-controlplane-gj5b9 invalid capacity 0 on image filesystem
20m Normal NodeAllocatableEnforced node/cappx-test-controlplane-gj5b9 Updated Node Allocatable limit across pods
20m Normal NodeHasSufficientMemory node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 status is now: NodeHasSufficientMemory
20m Normal NodeHasNoDiskPressure node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 status is now: NodeHasNoDiskPressure
20m Normal NodeHasSufficientPID node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 status is now: NodeHasSufficientPID
19m Normal Starting node/cappx-test-controlplane-gj5b9 Starting kubelet.
19m Warning InvalidDiskCapacity node/cappx-test-controlplane-gj5b9 invalid capacity 0 on image filesystem
19m Normal NodeHasSufficientMemory node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 status is now: NodeHasSufficientMemory
19m Normal NodeHasNoDiskPressure node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 status is now: NodeHasNoDiskPressure
19m Normal NodeHasSufficientPID node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 status is now: NodeHasSufficientPID
19m Normal NodeAllocatableEnforced node/cappx-test-controlplane-gj5b9 Updated Node Allocatable limit across pods
19m Normal RegisteredNode node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 event: Registered Node cappx-test-controlplane-gj5b9 in Controller
19m Normal Starting node/cappx-test-controlplane-gj5b9
13m Normal NodeReady node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 status is now: NodeReady
12m Normal RegisteredNode node/cappx-test-controlplane-gj5b9 Node cappx-test-controlplane-gj5b9 event: Registered Node cappx-test-controlplane-gj5b9 in Controller
I would appreciate your reply very much!
It seems you haven't deployed the CCM (Cloud Controller Manager)? That log is from kube-controller-manager.
Make sure to install the CCM (https://github.com/sp-yduck/cloud-provider-proxmox/blob/master/manifests/cloud-controller-manager.yaml#L26).
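A quick way to tell whether a CCM pod exists at all is to scan the pod listing for it. This is a rough sketch under assumptions: `has_ccm_pod` is a hypothetical helper, and the default pattern `cloud-controller-manager` is a guess at how the pod is named, so pass the real name if the manifest uses a different one.

```shell
# Hypothetical helper: reads `kubectl get pods -A` output on stdin and exits 0
# if any pod name contains the given pattern.
# The default pattern is an assumption about the CCM pod name.
has_ccm_pod() {
  pattern=${1:-cloud-controller-manager}
  # With -A the first column is the namespace, so the pod name is column 2.
  # NR > 1 skips the header row.
  awk -v p="$pattern" 'NR > 1 && index($2, p) { found = 1 } END { exit !found }'
}

# Usage:
# kubectl --kubeconfig=kubeconfig.yaml get pods -A | has_ccm_pod || echo "no CCM pod found"
```

Note that kube-controller-manager does not match the default pattern, so the pod listing earlier in this issue would be reported as having no CCM pod.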
damn.... 🤐
thank you!
Then I will have a look at the ClusterResourceSet; it seems the resources aren't being applied.
Feel free to close the issue; otherwise I will post an update on it.