Deployment Validation Operator

Description

The Deployment Validation Operator (DVO) checks deployments and other resources against a curated collection of best practices.

These best practices focus mainly on ensuring that the applications are fault-tolerant.

DVO only monitors Kubernetes resources and never modifies them. Instead, it reports failed validations via Prometheus, so users of the operator can create alerts based on the results. All metrics are gauges that report 1 if the best-practice check has failed. Each metric always carries three labels: name, namespace and kind.
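As a rough illustration of how these gauges can be consumed, the sketch below filters Prometheus text-format output for series whose value is 1. The metric name `example_validation` and the sample series are hypothetical, chosen only for illustration; the real metric names depend on the checks DVO runs. It also assumes the scraped lines carry no trailing timestamp.

```shell
#!/bin/sh
# Print every series whose gauge value is 1 (a failed validation).
# Reads Prometheus text exposition format on stdin.
# Assumption: no optional timestamp follows the value, so the value
# is the last whitespace-separated field on the line.
failed_validations() {
    # Skip comment lines (# HELP / # TYPE) and keep series equal to 1.
    awk '!/^#/ && $NF == 1 { print }'
}

# Hypothetical sample; the metric name and label values are made up.
sample='# TYPE example_validation gauge
example_validation{name="web",namespace="shop",kind="Deployment"} 1
example_validation{name="api",namespace="shop",kind="Deployment"} 0'

printf '%s\n' "$sample" | failed_validations
```

Only the `web` series is printed here, because it is the only one reporting 1. An alert rule would typically express the same condition (value equal to 1) in the monitoring system itself.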

This operator doesn't define any CRDs at the moment. It was bootstrapped with operator-sdk, which makes it possible to add a CRD in the future if required.

Deployment

The manifests to deploy DVO take a permissive approach to permissions. This makes it easier to support monitoring new object kinds without having to change RBAC rules. As a result, elevated permissions are required to deploy DVO through the standard manifests. There is also a manifest to deploy DVO through OLM from OperatorHub, which alleviates the need for elevated permissions.

Manual installation (without OLM)

There are manifests to install the operator under the deploy/openshift directory. A typical installation would go as follows:

  • Create the deployment-validation-operator namespace/project
    • If deploying to a namespace other than deployment-validation-operator, there are commented lines you must change in deploy/openshift/cluster-role-binding.yaml and deploy/openshift/role-binding.yaml first
  • Create the service, service account, configmap, roles and role bindings
  • Create the operator deployment
oc new-project deployment-validation-operator
for manifest in service-account.yaml \
                service.yaml \
                role.yaml \
                cluster-role.yaml \
                role-binding.yaml \
                cluster-role-binding.yaml \
                configmap.yaml \
                operator.yaml
do
    oc create -f "deploy/openshift/$manifest"
done
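
Because `oc create` is run once per manifest, a missing file would leave the installation half-applied. A minimal preflight sketch, assuming the same eight file names as in the loop above, could check that every manifest exists before anything is created. The helper name `check_manifests` is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Return non-zero if any expected manifest is missing from the directory
# given as the first argument. Missing files are reported on stderr.
check_manifests() {
    dir=$1
    missing=0
    for manifest in service-account.yaml service.yaml role.yaml \
                    cluster-role.yaml role-binding.yaml \
                    cluster-role-binding.yaml configmap.yaml operator.yaml
    do
        if [ ! -f "$dir/$manifest" ]; then
            echo "missing: $dir/$manifest" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (run from the repository root):
#   check_manifests deploy/openshift && echo "all manifests present"
```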

Installation via OLM

There is a manifest to deploy DVO via OLM artifacts. This assumes that OLM is already running in the cluster. To deploy via OLM:

  • Generate the deployment YAML from the OpenShift template
  • Deploy the generated YAML
oc process --local NAMESPACE_IGNORE_PATTERN='openshift.*|kube-.+' -f deploy/openshift/deployment-validation-operator-olm.yaml | oc create -f -
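
Judging by its name, NAMESPACE_IGNORE_PATTERN is a regular expression for namespaces that DVO should skip. The sketch below shows which namespace names the pattern from the command above would select. It assumes the pattern must match the whole name (hence `grep -x`); how DVO actually applies the pattern is not stated here and may differ. The helper name `is_ignored` and the sample namespaces are hypothetical.

```shell
#!/bin/sh
# Succeed if the namespace name matches the ignore pattern in full.
# Assumption: whole-name matching (grep -x); DVO's own semantics may differ.
is_ignored() {
    printf '%s\n' "$1" | grep -Eqx 'openshift.*|kube-.+'
}

# Sample namespace names, chosen only for illustration.
for ns in openshift-monitoring kube-system default my-app; do
    if is_ignored "$ns"; then
        echo "$ns: ignored"
    else
        echo "$ns: validated"
    fi
done
```

Under that assumption, `openshift-monitoring` and `kube-system` are ignored, while `default` and `my-app` are validated. Note that `kube-.+` requires at least one character after the hyphen.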

If DVO is deployed to a namespace other than the one where OLM is deployed, which is usually the case, then a network policy may be required to allow OLM to see the artifacts in the DVO namespace. For example, if OLM is deployed in the namespace operator-lifecycle-manager then the network policy would be deployed like this:

oc process --local NAMESPACE='operator-lifecycle-manager' -f deploy/openshift/network-policies.yaml | oc create -f -

Install Grafana dashboard

There are manifests to install a simple Grafana dashboard under the deploy/observability directory.

A typical installation to the default namespace deployment-validation-operator goes as follows:

oc process -f deploy/observability/template.yaml | oc create -f -

Or, if you want to deploy deployment-validation-operator components to a custom namespace:

oc process --local NAMESPACE="custom-dvo-namespace" -f deploy/observability/template.yaml | oc create -f -

Allow scraping from outside DVO namespace

The metrics generated by DVO can be scraped by anything that understands Prometheus metrics. A network policy may be needed to allow the DVO metrics to be collected by a service running in a namespace other than the one where DVO is deployed. For example, if a service in some-namespace wants to scrape the metrics from DVO, then a network policy would need to be created like this:

oc process --local NAMESPACE='some-namespace' -f deploy/openshift/network-policies.yaml | oc create -f -

Tests

You can run the unit tests with:

make test

We use openshift boilerplate to manage our make targets. See this doc for further information.

Roadmap

  • e2e tests