FYI - Simple remedy system designed for use with NPD
negz opened this issue · 6 comments
Hello,
I wanted to bring Draino to your attention, in case it's useful to others. Draino is a very simple 'remedy' system for permanent problems detected by the Node Problem Detector - it simply cordons and drains nodes exhibiting configurable Node Conditions.
At Planet we run a small handful of Kubernetes clusters on GCE (not GKE). We have a particular analytics workload that is really good at killing GCE persistent volumes. Without going into too much detail, we see persistent volume related processes (mkfs.ext4, mount, etc) hanging forever in uninterruptible sleep, preventing the pods wanting to consume said volumes from running. We're working with GCP to resolve this issue, but in the meantime we got tired of manually cordoning and draining affected nodes, so we wrote Draino.
Our remedy system looks like this:
- Detect permanent node problems and set Node Conditions using the Node Problem Detector.
- Configure Draino to cordon and drain nodes when they exhibit the NPD's KernelDeadlock condition, or a variant of KernelDeadlock we call VolumeTaskHung.
- Let the Cluster Autoscaler scale down underutilised nodes, including the nodes Draino has drained.
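The wiring between NPD conditions and Draino is just Draino's command line: the Node Conditions to act on are passed as arguments. A minimal sketch of the relevant Deployment fragment (image tag and container name are illustrative, not the exact manifest we run):

```yaml
# Illustrative fragment of a Draino Deployment.
# The positional arguments are the Node Conditions Draino reacts to.
containers:
- name: draino
  image: planetlabs/draino:latest
  command: [/draino, KernelDeadlock, VolumeTaskHung]
```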
It's worth noting that once the Descheduler supports descheduling pods based on taints Draino could be replaced by the Descheduler running in combination with the scheduler's TaintNodesByCondition functionality.
@negz This is quite a good use case for NPD. I will read through what you described in detail later. Would you mind adding your use case of NPD to the use-case section in the README?
This is exactly what NPD was first proposed to do. Because remedy systems are end-user dependent, a common remedy system is not so easily developed.
@andyxning Thanks! I'd be happy to mention this use case in the README. Would it be too self-promotional to link to our Draino tool there? :)
@negz No. Draino is actually a POC of a remedy system based on NPD. :)
Could you please make a PR to add the use case?
@negz I have read Draino code briefly. It seems quite good and absolutely worth a use case of NPD. Please do not hesitate to add the Draino use case. I am willing to review it. :)
Hello, I'm using Draino to remedy permanent problems detected by the Node Problem Detector -- it simply cordons and drains nodes that exhibit the configured Node Conditions.
Draino is configured to cordon and drain a node when it exhibits the NPD's KernelDeadlock condition, or the variant of KernelDeadlock known as VolumeTaskHung.
Here is my example. I simulated a task blocked for more than 300 seconds by running echo "task docker:7 blocked for more than 300 seconds." | systemd-cat -t kernel
The injected message triggered the rule and set KernelDeadlock to True, but Draino doesn't act on it and never marks my node unschedulable. Is my configuration wrong?
This is my runtime environment
# kubectl get po -A |egrep 'node-problem-detector|draino'
kube-system draino-58fc699f84-br2m2 1/1 Running 0 17m
kube-system node-problem-detector-smjw7 1/1 Running 0 18m
The KernelDeadlock condition has triggered and is True, but Draino doesn't seem to drain the node:
# for node in `kubectl get node |sed '1d' |awk '{print $1}'`;do kubectl describe node $node |sed -n '/Conditions/,/Ready/p' ;done
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
KernelDeadlock True Sun, 30 Aug 2020 13:49:54 +0800 Sun, 30 Aug 2020 13:39:52 +0800 DockerHung task docker:7 blocked for more than 300 seconds.
NetworkUnavailable False Tue, 25 Aug 2020 13:39:47 +0800 Tue, 25 Aug 2020 13:39:47 +0800 CalicoIsUp Calico is running on this node
MemoryPressure False Sun, 30 Aug 2020 13:49:54 +0800 Tue, 25 Aug 2020 13:39:10 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sun, 30 Aug 2020 13:49:54 +0800 Tue, 25 Aug 2020 13:39:10 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sun, 30 Aug 2020 13:49:54 +0800 Tue, 25 Aug 2020 13:39:10 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sun, 30 Aug 2020 13:49:54 +0800 Thu, 27 Aug 2020 07:17:14 +0800 KubeletReady kubelet is posting ready status
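The conditions above show NPD did its part. When a condition does not transition as expected, the rule's regex can also be sanity-checked locally before involving the cluster; a minimal sketch, where the pattern is an assumed approximation of the default kernel-monitor rule, not NPD's exact configuration:

```shell
# Sanity-check the injected kmsg line against a hung-task style regex.
# The pattern below is an assumption modeled on the default kernel-monitor
# rules; it is not necessarily NPD's exact pattern.
msg='task docker:7 blocked for more than 300 seconds.'
pattern='task [[:alnum:]:.]+ blocked for more than [0-9]+ seconds\.'
if printf '%s\n' "$msg" | grep -Eq "$pattern"; then
  echo "match: NPD should report this as a kernel deadlock"
else
  echo "no match: the condition would not fire"
fi
```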
Draino isn't working: it never sets my node unschedulable or evicts my pods:
# kubectl get events -n kube-system | grep -E '(^LAST|draino)'
LAST SEEN TYPE REASON OBJECT MESSAGE
<unknown> Normal Scheduled pod/draino-58fc699f84-br2m2 Successfully assigned kube-system/draino-58fc699f84-br2m2 to master
18m Normal Pulling pod/draino-58fc699f84-br2m2 Pulling image "planetlabs/draino:5e07e93"
18m Normal Pulled pod/draino-58fc699f84-br2m2 Successfully pulled image "planetlabs/draino:5e07e93"
18m Normal Created pod/draino-58fc699f84-br2m2 Created container draino
18m Normal Started pod/draino-58fc699f84-br2m2 Started container draino
18m Normal SuccessfulCreate replicaset/draino-58fc699f84 Created pod: draino-58fc699f84-br2m2
18m Normal ScalingReplicaSet deployment/draino Scaled up replica set draino-58fc699f84 to 1
# kubectl get no
NAME STATUS ROLES AGE VERSION
master Ready master 5d v1.18.0