/managed-cluster-validating-webhooks

Set of validating webhooks, based on k8s admission controllers, for OSD clusters

Primary LanguageGoApache License 2.0Apache-2.0

Managed Cluster Validating Webhooks

A framework supporting validating admission webhooks for OpenShift.

Updating SelectorSyncSet Template

Ensure the git branch is current and run make syncset. The updated Template will be build/selectorsyncset.yaml by default.

Updating namespace and service account list

Ensure the git branch is current and run make generate. The updated lists will be written to pkg/config/namespaces.go. Documentation should also be regenerated to ensure the ConfigMaps specified are up-to-date.

Updating documenation files

Ensure the git branch is current and run make docs > docs/webhooks.json && make DOCFLAGS=-hideRules docs > docs/webhooks-short.json.

Development

Each Webhook must register with, and therefore satisfy the interface specified in pkg/webhooks/register.go:

// imports shown here for clarity
import (
  admissionregv1 "k8s.io/api/admissionregistration/v1"
  metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
  admissionctl "sigs.k8s.io/controller-runtime/pkg/webhook/admission"
)

// Webhook interface
type Webhook interface {
	// Authorized will determine if the request is allowed
	Authorized(request admissionctl.Request) admissionctl.Response
	// GetURI returns the URI for the webhook
	GetURI() string
	// Validate will validate the incoming request
	Validate(admissionctl.Request) bool
	// Name is the name of the webhook
	Name() string
	// FailurePolicy is how the hook config should react if k8s can't access it
	FailurePolicy() admissionregv1.FailurePolicyType
	// MatchPolicy mirrors validatingwebhookconfiguration.webhooks[].matchPolicy.
	// If it is important to the webhook, be sure to check subResource vs
	// requestSubResource.
	MatchPolicy() admissionregv1.MatchPolicyType
	// Rules is a slice of rules on which this hook should trigger
	Rules() []admissionregv1.RuleWithOperations
	// ObjectSelector uses a *metav1.LabelSelector to augment the webhook's
	// Rules() to match only on incoming requests which match the specific
	// LabelSelector.
	ObjectSelector() *metav1.LabelSelector
	// SideEffects are what side effects, if any, this hook has. Refer to
	// https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#side-effects
	SideEffects() admissionregv1.SideEffectClass
	// TimeoutSeconds returns an int32 representing how long to wait for this hook to complete
	TimeoutSeconds() int32
	// Doc returns a string for end-customer documentation purposes.
	Doc() string
	// SyncSetLabelSelector returns the label selector to use in the SyncSet.
	// Return utils.DefaultLabelSelector() to stick with the default
	SyncSetLabelSelector() metav1.LabelSelector
	// HypershiftEnabled will return boolean value for hypershift enabled configurations
	HypershiftEnabled() bool
}

The first four methods (Authorized, GetURI, Validate and Name) are involved with the process of handling the incoming JSON payload from the API server. GetURI and Name prepare the webserver to send to determine if the request is Authorized and to Validate ensures that the structure of the AdmissionRequest is appropriate, not if the request should be permitted (this functionality is up to the webhook author to implement, and is not part of the interface).

The remaining methods (and also including GetURI and Name) are involved with rendering YAML.

Adding New Webhooks

Registering involves creating a file in pkg/webhooks (eg add_namespace_hook.go) which calls the Register function exported from register.go:

// pkg/webhooks/add_namespace_hook.go

package webhooks

import (
  "github.com/openshift/managed-cluster-validating-webhooks/pkg/webhooks/namespace"
)

func init() {
  Register(namespace.WebhookName, func() Webhook { return namespace.NewWebhook() })
}

The signature is Register(string, WebhookFactory), where a WebhookFactory is type WebhookFactory func() Webhook.

Helper Utils

The utils package provides a string slice content checker (SliceContains(string, []string) bool) since it's a common task to see if a group or username is a member of some safelisted list.

Mutating Webhooks

Despite its name, this repository has basic support for deploying mutating webhooks alongside validating ones due to their similarity. The differences between the two webhook types boil down to the types of decisions (Responses) they're allowed to return to the API server. Just like validating webhooks, mutating webhooks can decide that a request is Allowed, Denied, or Errored (see Building a Response below). Unlike validating webhooks, however, mutating webhooks may instead decide that a request can be allowed only if some changes are made (i.e., Patched). Patched decisions contain a RFC 6902 (JSONPatch) string that describes the necessary mutations.

For example, the service-mutation webhook enforces an AWS managed policy requirement that ELBs are tagged with red-hat-managed=true by mutating all CREATE and UPDATE operations on LoadBalancer-type Services such that they contain the annotation service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: red-hat-managed=true. For a CREATE operation on a Service that's missing the necessary annotation, the JSONPatch embedded within the Patched Response might look like:

{
    "op": "add",
    "path": "/metadata/annotations/service.beta.kubernetes.io~1aws-load-balancer-additional-resource-tags",
    "value": "red-hat-managed=true"
}

MutatingWebhooks are indicated by their name: if your Webhook's Name() function returns a string ending in -mutation, then resources.go will generate a MutatingWebhookConfiguration (instead of a ValidatingWebhookConfiguration) when building the SelectorSyncSet and PKO package. Beyond that, this repo does not descriminate between MutatingWebhooks and ValidatingWebhooks, and you may assume any documentation in this repo applies to both Webhook types unless otherwise noted.

Is The Request Valid and Authorized

The key difference between "valid" and "authorized" is that the former is asking if the incoming request is well-formed whereas the latter is asking if the user making the request is allowed to do so. Each webhook may have a different idea of what a "valid" request looks like, but some common feature may be if the request has a username set.

Validate and Authorized serve as two entry points for the webserver: First it will ask if the request is valid and then it will ask if it is authorized. Thus, we can speak more about fulfulling the interface requirements for these two methods.

Building a Response

To create a Response object (to reply to the incoming AdmissionRequest), one should use sigs.k8s.io/controller-runtime/pkg/webhook/admission (often imported as admissionctl), which provides several helper functions:

  • Allowed(message string)
  • Denied(message string) Response
  • Errored(error int32, message string) Response
  • Patched(message string, patches ...jsonpatch.JsonPatchOperation) Response (mutating webhooks only)

Use these functions once access has been determined, or in the event of some fundamental problem. A common use for Errored is when Validate fails. Refer to Sending Responses for methods related to sending these Response objects back over the HTTP connection.

It is important to retain the UID from the incoming request with the outgoing response. A common pattern for this is:

  var ret admissionctl.Response
  ret = admissionctl.Allowed("Request is allowed")
  ret.UID = request.AdmissionRequest.UID
  return ret

Mutating webhooks, however, should use admissionctl.Complete() instead of manually setting the UID when issuing Patched decisions. For example:

  var ret admissionctl.Response
  ret = admissionctl.Patched("Request is only allowed if mutated", jsonPatchOp)
  ret.Complete(request)
  return ret

Sending Responses

Once a response is built, it must be sent back to the HTTP client. This is done by returning the admissionctl.Response in the Authorized method. This structure can be built up with the helpers mentioned above.

Writing Unit Tests

Unit tests are important to ensure that webhooks behave as expected, and to help with that process, there is a testutils helper package designed to unify several processes. The package exports:

  • CanCanNot
  • CreateFakeRequestJSON
  • CreateHTTPRequest
  • SendHTTPRequest

The first function, CanCanNot, is very simple and designed to make test failure messages gramatically correct for. The three other functions are much more important to the testing process.

The three helper functions are intended to provide for more integration style tests than true unit tests, as they assist in turning a specific set of test criteria a JSON representation and sending via net/http/httptest to the webhook's Authorized. When using testutils.SendHTTPRequest, the response is a Response object that can be used in the test suite to access the result of the webhook.

Local Live Testing

Build and test your changes against your own cluster. Here is a recorded demo of this process.

Create a Repository

Make sure you have a repository to host the image you are going to build. It must be public so your cluster is able to download the image. For subsequent steps, you will need the name of the registry, the organization, and the repository. For example, in the image URI quay.io/my-user/managed-cluster-validating-webhooks:latest:

  • quay.io is the registry
  • my-user is the organization
  • managed-cluster-validating-webhooks is the repository
  • latest is the image tag

Build and Push the Image

Use make build-base to build and tag the image; and make push-base to push it. In order to use your personal repository, you can override any of the components of this URI by setting the following variables:

  • IMG_REGISTRY overrides the registry (default: quay.io)
  • IMG_ORG overrides the organization (default: app-sre)
  • BASE_IMG overrides the repository name (default: managed-cluster-validating-webhooks)
  • IMAGETAG overrides the image tag. (By default this is the current commit hash of your local clone of the git repository; but make build-base will also always tag latest)

For example, to build, tag, and push quay.io/my-user/managed-cluster-validating-webhooks:latest, you can run:

make IMG_ORG=my-user build-base push-base

Prevent Overwriting (Hive-Managed Clusters)

If you are using an OSD cluster managed by hive, the resources in the openshift-validation-webhook project are controlled by SelectorSyncSets. By default, hive will periodically refresh these resources, reverting any changes you make according to the instructions below. To avoid this, pause hive syncing for your cluster.

Pare Down Your Daemonset (Optional)

Note: Editing the daemonset requires elevated privileges.

By default, the image will run on all master nodes. For testing purposes, you may find it easier to run on only one node. To facilitate this:

  1. Pick a master node to use. Any one will do. For example:
    $ oc get node | grep master | head -1
    ip-10-0-147-186.ec2.internal   Ready    master         3h44m   v1.19.0+f173eb4
    
  2. Give the node a unique label. For example:
    $ oc label node ip-10-0-147-186.ec2.internal mcvw-test=true
    node/ip-10-0-147-186.ec2.internal labeled
    
  3. Edit the validation-webhook daemonset's .spec.template.spec.affinity.nodeAffinity, replacing the matchExpressions entry for node-role.kubernetes.io/master to match your label instead. For example, to match the label from step 2, run:
    oc edit -n openshift-validation-webhook daemonset validation-webhook
    
    and replace
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: In
                    values:
                    - ""
    
    with
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: mcvw-test   # <== your label's key
                    operator: In
                    values:
                    - "true"         # <== your label's value
    

Once these changes are saved, you should see validation-webhook-* pods cycle. When they have settled, you can confirm that only one pod is running and that it is running on your desired node. For example:

$ oc describe pod -n openshift-validation-webhook | grep ^Node:
Node:         ip-10-0-147-186.ec2.internal/10.0.147.186

Deploy the Image

Note: Editing the daemonset requires elevated privileges.

Edit the validation-webhook daemonset's .spec.template.spec.containers[0].image, replacing it with the URI of the image you built and pushed above. For example, if you created quay.io/my-user/managed-cluster-validating-webhooks:latest, replace

        image: quay.io/app-sre/managed-cluster-validating-webhooks@sha256:f33f879a9e8b0dc5b0481f75ba5eb8422a8a9b06acf70330794eb564ee31e9e5

with

        image: quay.io/my-user/managed-cluster-validating-webhooks:latest

(You can specify your custom image by tag or by digest; the latter is only necessary if quay is down.)

Once these changes are saved, you should see validation-webhook-* pods cycle. When they have settled, you can confirm that the pod(s) are running with your image. For example:

$ oc describe pod -n openshift-validation-webhook | grep '^ *Image:'
    Image:         quay.io/my-user/managed-cluster-validating-webhooks:latest

Create the ValidatingWebhookConfiguration

Once the image is pulled to your cluster, you will need to create the ValidatingWebhookConfiguration. You should notice that the build/selectorysyncset.yaml file will have a new section containing your webhook's ValidatingWebhookConfiguration. It should look something similar to this.

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  annotations:
    service.beta.openshift.io/inject-cabundle: "true"
  creationTimestamp: null
  name: sre-new-webhook
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: new-webhook
      namespace: openshift-validation-webhook
      path: /networkpolicies-validation
  failurePolicy: Ignore
  matchPolicy: Equivalent
  name: new-webhook-validation.managed.openshift.io
  rules:
  - apiGroups:
    - networking.k8s.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    resources:
    - networkpolicies
    scope: Namespaced
  sideEffects: None
  timeoutSeconds: 2

Save this part of the selectorsyncset.yaml as a its own yaml file and apply it to your cluster.

oc apply -f my_webhook.yaml

Now you can view that your webhook is registered and running with:

oc get validatingwebhookconfigurations -A

Update Other Resources

If your changes resulted in a delta to selectorsyncset.yaml, you must manually edit the corresponding resources to apply those changes.

TODO: More details on this.

Test Your Changes

Now that your cluster is running your modified code, you can do whatever is necessary to validate your changes.

End to End Testing

End to End testing is managed by the osde2e repo

Disabling Webhooks

List the webhooks (if you don't know them already):

# go run build/resources.go -syncsetfile tmp -showhooks
clusterlogging-validation
hiveownership-validation
imagecontentpolicies-validation
namespace-validation
pod-validation
prometheusrule-validation
regular-user-validation
regular-user-validation-osd
scc-validation
techpreviewnoupgrade-validation

At this point, we have:

  • clusterlogging-validation
  • hiveownership-validation
  • imagecontentpolicies-validation
  • namespace-validation
  • pod-validation
  • prometheusrule-validation
  • regular-user-validation
  • regular-user-validation-osd
  • scc-validation
  • techpreviewnoupgrade-validation
  • ingress-config-validation

In the Makefile, the SELECTOR_SYNC_SET_HOOK_EXCLUDES variable is used to control which are excluded. By default, it is set to debug-hook, in case one should come to appear at some time in the future.

Temporarily disable the identity-validation and namespace-validation hooks, set that same variable in the Makefile:

SELECTOR_SYNC_SET_HOOK_EXCLUDES ?= debug-hook,identity-validation,namespace-validation

Then at the shell run:

# make syncset
docker run \
    -v /Users/youruser/git/github.com/openshift/managed-cluster-validating-webhooks:/Users/youruser/git/github.com/openshift/managed-cluster-validating-webhooks \
    -w /Users/youruser/git/github.com/openshift/managed-cluster-validating-webhooks \
    --rm \
    golang:1.14 \
      go run \
        build/resources.go \
        -exclude identity-validation,namespace-validation \
        -outfile build/selectorsyncset.yaml
# truncated ...

Commit the Makefile and resulting build/selectorsyncset.yaml and deploy it with the normal workflows.

Removing a Webhook

To delete a webhook one must delete the associated files and re-run make. Rerunning make will rebuild the binary, container image, and build/selectorsyncset.yaml file. The files are the add_ files as well as the entire package. To remove the Namespace webhook:

# rm -fr pkg/webhooks/add_namespace_hook.go pkg/webhooks/namespace
pkg/webhooks/add_namespace_hook.go
pkg/webhooks/namespace/namespace_test.go
pkg/webhooks/namespace/namespace.go
pkg/webhooks/namespace
# make

Commit all changes and deploy as normal.

Once the code changes are complete, remove the undesired ValidatingWebhookConfiguration object(s) manually from the cluster.