Allow setting a label on nodes about to be upgraded/restarted
ibotty opened this issue · 8 comments
Description
Because there is no agreed-upon way to signal to operators that a node is being drained, operators handle it in multiple ways.
Rook detects node drain by observing pods on the node. This works fine but feels a bit fragile.
The problem is that some operators (e.g. the Zalando PostgreSQL Operator) "detect" drains by watching the node's labels. Whenever a label is no longer set (e.g. "node-ready=true"), the operator will (try to) fail over to another DB pod on another node.
This is a feature request to update the node's labels when a reboot is about to happen.
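The label-based detection described above can be sketched as a pure function over a node's label map. This is only an illustration of the idea, not the Zalando operator's actual implementation: the function names are hypothetical, and the `node-ready=true` label is just the example used in this issue.

```python
from typing import Mapping

# Example readiness label taken from this issue; the real key is operator-specific.
REQUIRED_LABELS = {"node-ready": "true"}


def node_is_ready(labels: Mapping[str, str],
                  required: Mapping[str, str] = REQUIRED_LABELS) -> bool:
    """Return True only if every required label is present with the expected value."""
    return all(labels.get(key) == value for key, value in required.items())


def should_fail_over(old_labels: Mapping[str, str],
                     new_labels: Mapping[str, str],
                     required: Mapping[str, str] = REQUIRED_LABELS) -> bool:
    """Fail over when a node that satisfied the readiness labels stops doing so."""
    return node_is_ready(old_labels, required) and not node_is_ready(new_labels, required)
```

An operator built this way never reacts to a drain attempt unless something changes the labels it watches, which is the gap this request is about.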
Steps to reproduce the issue:
1. update some machineconfig,
2. observe machine-config-daemon trying to drain a node,
3. failing to drain the node because there is a pdb on a pod on that node,
meanwhile
4. some operator not knowing that the machine is about to be rebooted and not updating the pdb (directly or indirectly),
5. the node not getting drained.
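Why the drain gets stuck can be shown with a simplified model of the PodDisruptionBudget check. The sketch below assumes a `minAvailable`-style budget, that all pods on the node are healthy and covered by the same PDB, and that evicted pods are not rescheduled during the drain. The real disruption controller also handles `maxUnavailable` and percentages. Function names are hypothetical.

```python
def disruptions_allowed(current_healthy: int, min_available: int) -> int:
    """Number of pods that may be evicted right now without violating the budget."""
    return max(0, current_healthy - min_available)


def drain(pods_on_node: int, current_healthy: int, min_available: int) -> int:
    """Try to evict each pod on the node once; return how many pods remain."""
    remaining = pods_on_node
    for _ in range(pods_on_node):
        if disruptions_allowed(current_healthy, min_available) == 0:
            break  # eviction is refused, the drain cannot make progress
        current_healthy -= 1
        remaining -= 1
    return remaining
```

With a single healthy database pod and `minAvailable: 1`, `drain(1, 1, 1)` leaves the pod in place, so the drain cannot complete until an operator changes the situation.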
Describe the results you expected:
1. update some machineconfig,
2. machine-config-daemon updating label machineconfiguration.openshift.io/pending-restart=false to =true,
3a. an operator removes active workload from the node, removing/updating pdbs that affect the node,
3b. machine-config-daemon drains the node,
4. node reboots successfully,
5. machine-config-daemon sets label machineconfiguration.openshift.io/pending-restart=false.
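The expected sequence can be sketched as a small simulation. Everything here is hypothetical: the `pending-restart` label is the one proposed in this issue and does not exist in the machine-config-operator, and the `Node` class and function names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Label proposed in this issue; NOT an existing machine-config-operator label.
PENDING_RESTART = "machineconfiguration.openshift.io/pending-restart"


@dataclass
class Node:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    protected_pods: int = 0  # pods whose PDB currently forbids eviction


def operator_reconcile(node: Node) -> None:
    """Hypothetical operator: on pending-restart=true, move protected workload away."""
    if node.labels.get(PENDING_RESTART) == "true":
        node.protected_pods = 0


def update_node(node: Node, operator_watches_label: bool) -> List[str]:
    """Walk through the proposed flow and return the events that happened."""
    events = []
    node.labels[PENDING_RESTART] = "true"
    events.append("label=true")
    if operator_watches_label:
        operator_reconcile(node)
        events.append("workload moved")
    if node.protected_pods > 0:
        events.append("drain blocked")
        return events
    events += ["drained", "rebooted"]
    node.labels[PENDING_RESTART] = "false"
    events.append("label=false")
    return events
```

With `operator_watches_label=False` and one protected pod, the flow stops at "drain blocked", which matches the current behaviour described in the reproduction steps.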
Hi, thanks for filing this!
This issue relates to a topic of reboot handling that's ongoing, for which most information/discussion is (AFAIK) sadly trapped in internal-to-RH proprietary systems because staying open requires relentless commitment and we aren't consistent about that.
> machine-config-daemon updating label machineconfiguration.openshift.io/pending-restart=false to =true
I think we should avoid having OpenShift/MCO-specific labels here; we want to interoperate with the rest of the Kubernetes ecosystem.
> Rook detects node drain by observing pods on the node. This works fine but feels a bit fragile.
Note that https://kubernetes.io/docs/concepts/architecture/nodes/#graceful-node-shutdown will make this more reliable and we (OCP) plan to roll that out.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Still relevant.
And reading https://kubernetes.io/docs/concepts/architecture/nodes/#graceful-node-shutdown another time, I don't see how that will help the use case described above. How can Rook know that the node is about to shut down? The only taint (or annotation) described there is for **non-**graceful shutdown, which the machine-config-daemon will explicitly not do.
@cgwalters: Do I misunderstand the mechanism?
/remove-lifecycle rotten
/lifecycle frozen
/reopen
@ibotty: Reopened this issue.