VirtualCluster should support syncing specific PodStatus conditions from tenant to super
User Story
As a developer, I would like VirtualCluster to support syncing specific PodStatus conditions from tenant to super, because some user-defined condition types are designed to be added by users or controllers in the tenant cluster.
For example, OpenKruise provides workloads that support in-place update for Pods, so we define an `InPlaceUpdateReady` readinessGate type and add its condition to the Pod status. However, the syncer currently only syncs Pod status from super to tenant, which overwrites the custom condition and leaves the Pod permanently NotReady, because kubelet cannot see this condition in the super cluster.
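To make the failure mode concrete, here is a minimal Go sketch (not the actual syncer code; the variable names are illustrative) of what a naive whole-status downward copy does to a tenant-added readiness-gate condition:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Condition added in the tenant cluster, e.g. by the Kruise controller
	// for its InPlaceUpdateReady readiness gate.
	vPodStatus := corev1.PodStatus{
		Conditions: []corev1.PodCondition{
			{Type: "InPlaceUpdateReady", Status: corev1.ConditionTrue},
		},
	}

	// Status reported by the super cluster, where kubelet never sees the
	// custom condition and therefore keeps the Pod NotReady.
	pPodStatus := corev1.PodStatus{
		Conditions: []corev1.PodCondition{
			{Type: corev1.PodReady, Status: corev1.ConditionFalse},
		},
	}

	// A whole-status copy from super to tenant silently drops the
	// tenant-owned condition, so the Pod can never become Ready.
	vPodStatus = pPodStatus
	fmt.Printf("tenant conditions after downward sync: %+v\n", vPodStatus.Conditions)
}
```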
Detailed Description
The VirtualCluster syncer should support syncing specific PodStatus conditions from tenant to super.
/kind feature
Interesting idea. I'm trying to think of how we could accomplish this; it pushes us a little toward a split-brain model where both super and tenant hold context that matters, versus keeping the tenant authoritative for `.spec` and the super cluster authoritative for `.status`. How does this function under the hood when an update actually happens? Would the syncer need to push patches/updates to the Pod specs for that in-place upgrade to be fulfilled? I don't believe we support that today at all.
Thanks for filing the issue. I understand the request and I think we should address it. One problem is that both the tenant control plane and the super cluster may update the conditions specified in the readiness gate. The bigger problem is that the status now has two sources of truth in two etcds, which makes it difficult to avoid overwrites under races.
@FillZpp, a workaround is to have the Kruise controller add a Pod label when the readiness gate is ready, to tell the syncer to set the condition in the super cluster, which I admit is definitely a hack. I cannot think of a more elegant solution at the moment.
Thanks for all your replies.
> Would the syncer need to push patches/updates to the Pod specs for that in-place upgrade to be fulfilled? I don't believe we support that today at all.
Alright, forget about the in-place upgrade, which is just one case we ran into. Actually this is a common request, since Kubernetes provides the readinessGates feature to let users control whether a Pod should be ready or not:
```yaml
kind: Pod
...
spec:
  readinessGates:
    - conditionType: "www.example.com/feature-1"
status:
  conditions:
    - type: Ready                              # a built in PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
    - type: "www.example.com/feature-1"        # an extra PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
  containerStatuses:
    - containerID: docker://abcd...
      ready: true
...
```
When we set a custom conditionType in `spec.readinessGates`, kubelet will mark the Pod as ready only after the custom condition has been added to `status.conditions` with status `True`.
However, since the syncer only pushes the whole `status` from super to tenant, the custom condition that the user adds in the tenant cluster will never be synced to super, which means the Pod will never become ready.
Since this is a basic feature provided by Kubernetes, it would be nice if VirtualCluster could support it.
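For reference, the way readiness gates factor into Pod readiness can be sketched roughly as follows (a simplified illustration of the rule described above, not kubelet's actual code):

```go
package readiness

import corev1 "k8s.io/api/core/v1"

// podReady is a simplified illustration of the rule above: every condition
// type listed in spec.readinessGates must appear in status.conditions with
// status True, in addition to all containers being ready.
func podReady(pod *corev1.Pod) bool {
	for _, cs := range pod.Status.ContainerStatuses {
		if !cs.Ready {
			return false
		}
	}
	for _, gate := range pod.Spec.ReadinessGates {
		satisfied := false
		for _, cond := range pod.Status.Conditions {
			if cond.Type == gate.ConditionType && cond.Status == corev1.ConditionTrue {
				satisfied = true
				break
			}
		}
		if !satisfied {
			// The gate condition is missing or not True, so the Pod
			// stays NotReady, which is the situation described above.
			return false
		}
	}
	return true
}
```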
> A workaround is to have the Kruise controller add a Pod label when the readiness gate is ready, to tell the syncer to set the condition in the super cluster, which I admit is definitely a hack. I cannot think of a more elegant solution at the moment.
Yeah, a workaround is to deploy two controllers, one in each of the super and tenant clusters. The one in the tenant cluster adds a specific label to the Pod, then the one in the super cluster watches the label and adds the condition to the status.
I don't know much about the implementation of VirtualCluster, so maybe this is impossible... I'm wondering if we could have an optional whitelist to let users choose which condition types should be synced from tenant to super?
Hi, all. Maybe we can use a trick like `protectedMetaPrefixes` to set up a list that determines which status conditions should be synced from the tenant cluster to the super cluster? Of course, we could also keep the situation simpler and only handle status conditions that originate in the tenant cluster and need to be synced to the super cluster.
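If such a list existed, the upward sync could filter conditions with something along these lines (a hypothetical sketch; the whitelist variable and function below are assumptions, not existing syncer configuration):

```go
package conditionsync

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// tenantConditionPrefixes is a hypothetical whitelist, similar in spirit to
// protectedMetaPrefixes, naming the condition types (or type prefixes) the
// syncer is allowed to copy from the tenant Pod status up to the super cluster.
var tenantConditionPrefixes = []string{"www.example.com/", "InPlaceUpdateReady"}

// conditionsToSyncUp returns the tenant conditions that match the whitelist
// and therefore should be pushed upward to the super cluster Pod status.
func conditionsToSyncUp(vPod *corev1.Pod) []corev1.PodCondition {
	var out []corev1.PodCondition
	for _, cond := range vPod.Status.Conditions {
		for _, prefix := range tenantConditionPrefixes {
			if strings.HasPrefix(string(cond.Type), prefix) {
				out = append(out, cond)
				break
			}
		}
	}
	return out
}
```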
@FillZpp You don't need another controller in the super cluster, because the syncer will be the controller that performs the reconciliation. This is a common request to support the native k8s readiness gate, hence I think the solution should be provided by the syncer. Also, the syncer already performs special logic to mutate metadata (mainly for Pods), rather than simply copying it. We only need a mechanism to inform the syncer to add particular conditions. I suggest starting with a special label/annotation. Of course, this requires code changes for controllers that use readiness gates, like Kruise, in order to integrate with VC. Hope this is acceptable.
@wondywang I don't recommend changing the VC CRD to support this feature the way we did for labels/annotations using prefixes. The reasons are:
- The VC CRD specifies cluster-wide properties or policies. However, the readiness gate is a per-workload specification. There can potentially be conflicts if multiple workloads with readiness gates are configured in the same VC. For example, both controllers A and B use condition X for readiness. One day controller A still uses the condition but decides, for some reason, not to sync condition X to the super cluster, and it cannot request a change to the VC CR because controller B still needs the condition. It is problematic to put per-workload specifications in a CRD that represents a whole cluster.
- The workflow of supporting a new workload in VC becomes more complicated. For example, VC users would need to ask the cluster administrator to mutate the VC spec in order to make their controllers work.

Due to the above, I think a special label/annotation is a reasonable solution for this feature, considering the flexibility and the complexity.
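As an illustration of the label/annotation idea, the tenant-side contract could be as small as a single annotation that the syncer parses (the annotation key below is an assumption made for this sketch, not an existing convention):

```go
package conditionsync

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// syncConditionsAnnotation is an assumed annotation key (not an existing
// convention) that a tenant controller such as Kruise would set on its Pods
// to declare which condition types it owns and wants synced to the super cluster.
const syncConditionsAnnotation = "tenancy.x-k8s.io/sync-conditions"

// tenantOwnedConditionTypes parses the annotation into the set of condition
// types the syncer should treat as tenant-authoritative.
func tenantOwnedConditionTypes(vPod *corev1.Pod) map[corev1.PodConditionType]bool {
	owned := map[corev1.PodConditionType]bool{}
	raw, ok := vPod.Annotations[syncConditionsAnnotation]
	if !ok {
		return owned
	}
	for _, t := range strings.Split(raw, ",") {
		if t = strings.TrimSpace(t); t != "" {
			owned[corev1.PodConditionType(t)] = true
		}
	}
	return owned
}
```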
@Fei-Guo @FillZpp I agree there should be a mechanism to configure the downward/upward syncing behavior, but can we provide a default configuration that supports the most common scenario? That scenario is: Kruise is installed only in the tenant cluster and works just as it would in an ordinary cluster, without requiring any changes to Kruise itself.
For other scenarios, for example a controller running in the super cluster, or in both the tenant and super clusters, we can support them by configuring some special label to guide the more complex behavior.
Already supported in this PR: kubernetes-retired/multi-tenancy#1294.
Did you hit any bugs in testing?
Thanks @zhuangqh. I'll check it out. We have received one piece of feedback on it so far.
> Already supported in this PR: kubernetes-sigs/multi-tenancy#1294. Did you hit any bugs in testing?
This implementation is straightforward, but in theory there is a very tiny race window in which the tenant condition can be overwritten. If the readiness condition is updated by the tenant between `CheckUWPodStatusEquality` and `vPod.Update()`, the readiness condition can be overwritten by the `vPod.Update()` call. The syncer was not designed to handle all kinds of races; it uses periodic scanning to find mismatches and fix them. The problem here is that once the readiness condition is accidentally removed, the tenant controller may not reconcile and recover it, since it expects that the condition cannot be removed once it is set. To eliminate the race, the tenant controller could notify the syncer and let the syncer update the readiness conditions in the super cluster and reconcile them.
Not sure if the Kruise controller hits this race though.
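For illustration only, preserving tenant-owned readiness-gate conditions during the downward status write could look roughly like the following merge (a sketch that assumes conditions listed in `spec.readinessGates` are tenant-authoritative; it is not the code from the PR above):

```go
package conditionsync

import corev1 "k8s.io/api/core/v1"

// mergedStatus builds the status to write back to the tenant (virtual) Pod:
// everything comes from the super-cluster Pod, except conditions whose types
// are listed in the tenant Pod's spec.readinessGates, which are treated as
// tenant-authoritative and preserved from the tenant Pod.
func mergedStatus(vPod, pPod *corev1.Pod) corev1.PodStatus {
	gated := map[corev1.PodConditionType]bool{}
	for _, g := range vPod.Spec.ReadinessGates {
		gated[g.ConditionType] = true
	}

	newStatus := *pPod.Status.DeepCopy()

	// Drop any gated conditions that came from the super cluster...
	kept := newStatus.Conditions[:0]
	for _, c := range newStatus.Conditions {
		if !gated[c.Type] {
			kept = append(kept, c)
		}
	}
	newStatus.Conditions = kept

	// ...and carry over the tenant's own copies instead.
	for _, c := range vPod.Status.Conditions {
		if gated[c.Type] {
			newStatus.Conditions = append(newStatus.Conditions, c)
		}
	}
	return newStatus
}
```

Note that such a merge is still computed from a point-in-time read of the tenant Pod, which is exactly the small window the race described above refers to.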
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with Issue Triage
>
> Please send feedback to sig-contributor-experience at kubernetes/community.
> /close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.