kubernetes-retired/cluster-api-provider-nested

Performance and Resource Contention in Multiple Tenant Environment

Closed this issue · 9 comments

User Story
Would like to add:

  • Plugin to validate and limit total resource usage of a virtual cluster (tenant)
  • Add per virtual cluster working queue and workers in mccontroller to prevent contention among tenants.

Detailed Description
Performance and resource contention among tenants are not fully addressed by virtual cluster. More monitoring or enforcement logics are needed to control total resource usage of a tenant.

Resource Quota:

In addition to per namespace based admission quota control, overall resource quota needs to be checked for each tenant (virtual cluster) across multiple tenant’s namespaces. This is to limit tenant’s overall resource usage in a multi-tenant environment.

We would like to propose adding resource creation validation logics in Pod syncer. Pod Syncer is a perfect location to perform the monitoring, validation, and enforcement. For example before a pod is going to be created in super cluster in “reconcilePodCrete()”, a validation logic can be injected. If the validation fails, the pod object is kept in tenant virtual cluster with pending state. The pod will be later reconciled and got synced to super cluster once resources become available.

Given quota enforcement is proprietary to each cloud environment, customers may choose their own way to perform the validation, we would like to add a plugin in Pod Syncer to allow customers to use their own validation logics. Customers can register the validation plugin at Syncer startup. Pod syncer then calls the plugin in “reconcilePodCrete()” before creating pod in super cluster.

In addition, a new label can be added in pod that does not pass the validation. Pod Syncer can bypass re-reconciling of the pods to reduce un-necessary work load. A separate user logic can be added to take the label off from pending pods when resource become available.

Syncer Performance Contention:

Currently, mccontroller uses MaxConcurrentReconciles =3 of workers to run syncing for all tenants’ virtual clusters. It would be nice to have per tenant (virtual cluster) work queue and per tenant workers. This is to prevent noisy tenant from interfere other tenant’s syncing job. This also provides the ability to serialize pod syncing for each tenant, which is critical for certain use cases --- for example to get accurate resource usage status for resource quota validation.

Hey @weiling61 thanks for filing this. I think there might be two parts to this issue that we can probably untease. 1. a plugin architecture that would allow someone to customize the builtin resource syncers For now I'm going to focus on the plugin architecture. (I believe we have other outlets to improve performance of the max workers with how the fair queue is setup per tenant)

Understanding the Plugin mutators and validators. I can see a use case for adding validation and mutation plugins to each synced if it was easy to "register" them with the core from your own compiled synced. Looking at the current syncer arch it seems like we could have two additional plugin registration packages added like - https://github.com/kubernetes-sigs/cluster-api-provider-nested/tree/main/virtualcluster/pkg/util/plugin or add Validation/Mutation as fields on the Registration struct (https://github.com/kubernetes-sigs/cluster-api-provider-nested/blob/9f6b1a096b7ff120a0165c711d7c45a3a60cce6a/virtualcluster/pkg/util/plugin/plugin.go#L31-L40). Mutation could easily for default syncers use mutation package if we changed this slightly to make a Mutator interface like so:

type Mutator interface {
	Mutate(obj client.Object, clusterName string)
}

Given that each of the current mutators do a similar style but use explicit types changing to a generic client.Object param would it to support any type. Doing this we would then register the specific mutators in each resource syncers init func. We could do the exact same for validators …

type Validator interface {
	Validate(obj client.Object, clusterName string)
}
func init() {
	plugin.SyncerResourceRegister.Register(&plugin.Registration{
		ID: "pod",
		InitFn: func(ctx *plugin.InitContext) (interface{}, error) {
			return NewPodController(ctx.Config.(*config.SyncerConfiguration), ctx.Client, ctx.Informer, ctx.VCClient, ctx.VCInformer, manager.ResourceSyncerOptions{})
		},
		MutatorFn: func(ctx *plugin.InitContext) (interface{}, error) {
			return conversion.PodMutator(ctx.Config.(*config.SyncerConfiguration), ctx.Client, ctx.Informer, ctx.VCClient, ctx.VCInformer, manager.ResourceSyncerOptions{})
		},
		ValidatorFn: func(ctx *plugin.InitContext) (interface{}, error) {
			return conversion.PodValidator(ctx.Config.(*config.SyncerConfiguration), ctx.Client, ctx.Informer, ctx.VCClient, ctx.VCInformer, manager.ResourceSyncerOptions{})
		}
	})
}

This is all sudo code of course so it might not work completely as expected but I could see this working as an easy way to customize the syncers.

Alternatively this could be configured using the super clusters admission stack but that requires that super clusters have additional resources stored in them and any mutators cause the UWS to have issues.

cc @Fei-Guo would be curious your thoughts?

I think MutatorFn and ValidatorFn plugins should not be registered in pod init().? The whole purpose is to let user to register the plugins outside current CAPN syncer code. Using pod syncer as an example, I propose to have a validation function being called at very beginning of "reconcilePodCreate()" and have mutation function being called at very end of "reconcilePodCreate()". By default, no plugins are installed, the 2 functions will do nothing. User can register their pod validation and mutation function using "plugin.SyncerResourceRegister.Register" to let pod syncer execute the 2 functions in "reconcilePodCreate()" .

"Syncer Performance Contention" we use fair queuing for the dws requests hence the contention should have been addressed.

Why not implementing webhook for quota check in the super cluster? In my opinion, we should not extend syncer for quota check. The syncer is just for state synchronization, it should avoid doing extra logic as much as possible.

Yes Webhook can do the job. However, given current syncer architecture, syncer is a sweet spot for adding more user logics, ---- 1) more efficient than webhook, 2) provides better overall control, since it interacts with super cluster api-server directly, user can implement synchronous operations/validations.
For quota management, a synchronous operation is crucial, it minimizes the locking of entire resource creation process

I think we agree to make the extension for capacity planning purpose only. @weiling61 It will be nice to come up with a design proposal for that. Looking forward to it. Thanks

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.