Automated integration testing in your Kubernetes clusters.

Primary language: Go. License: GNU General Public License v3.0 (GPL-3.0).

Konfirm: Automated In-Cluster Testing

Konfirm automates the execution of tests in a Kubernetes cluster, enabling automatic validation of in-cluster services and simple integration testing for application teams.

How It Works

In Konfirm, tests are managed by the top-level TestSuite custom resource. A TestSuite is executed as soon as it is created and can optionally be re-triggered by a cron schedule, by an observed Helm release, or manually by removing all associated TestRuns (see spec.when below for more details).

Each TestSuite execution is managed by a TestRun resource, which in turn creates each of the suite's Test resources. Test resources execute tests by creating Pods and managing them across evictions and other preemptions until completion. If the Pod succeeds (exits with code zero), the Test succeeds. If the Pod fails (exits with a non-zero code), the Test fails. Messages written to the containers' termination messages are retained by the Test resource and then by the TestRun resource. By default, failed Pods are retained to allow their logs to be inspected (although Kubernetes' typical garbage collection applies and log retention is not permanent). Finally, a TestRun passes only if all associated Tests pass, and a TestSuite is passing or failing based on its most recent TestRun.
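
The pass/fail rules above can be sketched in a few lines of Go. This is an illustration only, not Konfirm's implementation: the TestResult type and the RunPassed and SuitePassing functions are hypothetical names, and treating a suite with no runs as "not passing" is an assumption.

```go
package main

import "fmt"

// TestResult is a hypothetical, simplified stand-in for the outcome of one Test.
type TestResult struct {
	Description string
	ExitCode    int    // exit code of the Test's Pod
	Message     string // termination message retained from the container, if any
}

// Passed reports whether the Test succeeded, i.e. its Pod exited with code zero.
func (t TestResult) Passed() bool { return t.ExitCode == 0 }

// RunPassed reports whether a TestRun passes: every associated Test must pass.
func RunPassed(tests []TestResult) bool {
	for _, t := range tests {
		if !t.Passed() {
			return false
		}
	}
	return true
}

// SuitePassing derives a TestSuite's status from its runs, ordered oldest to
// newest. Only the most recent run counts.
func SuitePassing(runs [][]TestResult) bool {
	if len(runs) == 0 {
		return false // assumption: a suite with no runs is not yet passing
	}
	return RunPassed(runs[len(runs)-1])
}

func main() {
	older := []TestResult{{"sidecar injection is enabled by default", 1, "no sidecar found"}}
	newer := []TestResult{{"sidecar injection is enabled by default", 0, ""}}

	fmt.Println(RunPassed(older))                           // false
	fmt.Println(SuitePassing([][]TestResult{older, newer})) // true
}
```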

Konfirm exposes TestSuites' current pass/fail status (and other valuable runtime data) as Prometheus metrics, allowing admins to use existing in-cluster tooling to observe and alert on test results. For example, a TestSuite watching the istiod Helm release can automatically validate the Istio control plane's behavior after each upgrade and immediately alert admins to any problems.

The TestSuite Resource

apiVersion: konfirm.teamraft.com/v1beta1
kind: TestSuite
metadata:
  namespace: a-namespace
  name: my-test
spec:
  when:
    schedule: 0 7 * * 1-5
    helmRelease: istiod
  runAs: a-user
  historyLimit: 3
  template:
    setUp:
      helm:
        secret: my-test-setup
    spec:
      retentionPolicy: OnFailure
      tests:
        - description: sidecar injection is enabled by default
          template:
            spec:
              containers:
                - name: main
                  image: konfirm/check-istio
                  args:
                    - has-sidecar

TestSuite Spec

spec.when

The when property defines when a TestSuite will be triggered. It supports two types of triggers: schedule and helmRelease.

spec.when.schedule

The schedule property of when causes the TestSuite to be triggered on the defined schedule. If the schedule is valid, the NextRun property in the TestSuite's status will be set. Otherwise, the InvalidSchedule condition will be set. To prevent runaway resource utilization, schedules are based on the completion time of the most recent TestRun, meaning overlapping runs of the same TestSuite will not be executed. However, if a scheduled run is missed, it will be executed as soon as the Konfirm manager observes the missed run.

spec.when.helmRelease

The helmRelease property of when causes the TestSuite to be triggered on any new release, upgrade, or downgrade of the specified Helm release. Releases in other namespaces can be watched using [RELEASE].[NAMESPACE] notation. However, cross-namespace observability must be explicitly enabled with a HelmPolicy.
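
To make the [RELEASE].[NAMESPACE] notation concrete, here is a minimal Go sketch of how such a reference could be resolved. ReleaseRef and ParseReleaseRef are hypothetical names, and splitting on the last dot is an assumption (Kubernetes namespace names cannot contain dots). Konfirm's actual parsing may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ReleaseRef is a hypothetical resolved reference to a Helm release.
type ReleaseRef struct {
	Name      string
	Namespace string
}

// ParseReleaseRef resolves "release" or "release.namespace". A bare release
// name is assumed to live in the TestSuite's own namespace.
func ParseReleaseRef(ref, suiteNamespace string) (ReleaseRef, error) {
	if ref == "" {
		return ReleaseRef{}, errors.New("empty release reference")
	}
	i := strings.LastIndex(ref, ".")
	if i < 0 {
		return ReleaseRef{Name: ref, Namespace: suiteNamespace}, nil
	}
	name, ns := ref[:i], ref[i+1:]
	if name == "" || ns == "" {
		return ReleaseRef{}, fmt.Errorf("invalid release reference %q", ref)
	}
	return ReleaseRef{Name: name, Namespace: ns}, nil
}

func main() {
	local, _ := ParseReleaseRef("istiod", "a-namespace")
	fmt.Printf("%+v\n", local) // {Name:istiod Namespace:a-namespace}

	remote, _ := ParseReleaseRef("istiod.istio-system", "a-namespace")
	fmt.Printf("%+v\n", remote) // {Name:istiod Namespace:istio-system}
}
```

A reference that resolves to a namespace other than the TestSuite's own would additionally need a HelmPolicy permitting it; that check is outside this sketch.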

spec.runAs

runAs specifies a UserRef that Konfirm will use for impersonation during setup, teardown, and pod creation/deletion. runAs utilizes UserRefs to explicitly separate the management of impersonation, allowing cluster admins to appropriately restrict what testers can do.

spec.historyLimit

historyLimit specifies the maximum number of TestRuns to retain. The default is 3, and the minimum is 1.

spec.template.setup

setup defines steps that must be completed before a TestRun is executed.

spec.template.setup.helm

helm references a Secret with chart, username, password, and values keys. chart is a URL to a Helm chart, and username and password are the optional credentials used to retrieve the chart. values is the YAML-encoded values used by the release. If helm is defined, a successful Helm release is required before a TestRun is created. Teardown is performed via a Helm uninstall and must be successfully completed before a new TestRun can be executed.
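
The shape of the referenced Secret can be sketched as a small Go validation routine. This is illustrative only: HelmSetup and ParseSetupSecret are hypothetical names, the chart URL below is a placeholder, and treating values as optional is an assumption. The only requirements taken from the text above are that chart is a URL and that username and password are optional.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// HelmSetup is a hypothetical decoded form of the setup Secret's data.
type HelmSetup struct {
	Chart    *url.URL
	Username string // optional
	Password string // optional
	Values   string // YAML-encoded values; assumed optional here
}

// ParseSetupSecret validates a Secret's data map (key to raw bytes, as
// Kubernetes exposes it) and extracts the four documented keys.
func ParseSetupSecret(data map[string][]byte) (HelmSetup, error) {
	raw, ok := data["chart"]
	if !ok || len(raw) == 0 {
		return HelmSetup{}, errors.New("setup secret is missing the chart key")
	}
	chart, err := url.Parse(string(raw))
	if err != nil {
		return HelmSetup{}, fmt.Errorf("invalid chart URL: %w", err)
	}
	if chart.Scheme == "" || chart.Host == "" {
		return HelmSetup{}, fmt.Errorf("chart %q is not an absolute URL", raw)
	}
	return HelmSetup{
		Chart:    chart,
		Username: string(data["username"]), // missing keys yield empty strings
		Password: string(data["password"]),
		Values:   string(data["values"]),
	}, nil
}

func main() {
	setup, err := ParseSetupSecret(map[string][]byte{
		"chart":  []byte("https://charts.example.com/my-test-setup-1.0.0.tgz"),
		"values": []byte("replicas: 1\n"),
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(setup.Chart.Host, setup.Username == "") // charts.example.com true
}
```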

spec.template.spec

The spec property of template defines how a TestSuite is executed.

spec.template.spec.retentionPolicy

retentionPolicy defines when Test resources are retained. The default is OnFailure, but Always and Never are also supported.
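
The three retentionPolicy values map naturally onto a small decision function. The Go sketch below is an assumption-laden illustration (the RetentionPolicy type and the Retain function are hypothetical names, not Konfirm's API); it treats an unset policy as the documented default, OnFailure.

```go
package main

import "fmt"

// RetentionPolicy mirrors the three documented values of retentionPolicy.
type RetentionPolicy string

const (
	OnFailure RetentionPolicy = "OnFailure"
	Always    RetentionPolicy = "Always"
	Never     RetentionPolicy = "Never"
)

// Retain reports whether a finished Test should be kept, given the policy and
// whether the Test passed. An empty policy falls back to OnFailure.
func Retain(policy RetentionPolicy, passed bool) (bool, error) {
	switch policy {
	case "", OnFailure:
		return !passed, nil
	case Always:
		return true, nil
	case Never:
		return false, nil
	default:
		return false, fmt.Errorf("unknown retention policy %q", policy)
	}
}

func main() {
	keep, _ := Retain(OnFailure, false)
	fmt.Println(keep) // true: a failed Test is retained under OnFailure

	keep, _ = Retain(OnFailure, true)
	fmt.Println(keep) // false: a passing Test is not retained under OnFailure
}
```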

spec.template.spec.tests

tests defines the individual Tests to be executed.

spec.template.spec.tests.description

description is a brief, user-friendly description of the test.

spec.template.spec.tests.template

In addition to an extremely repetitive path, template is a PodTemplateSpec that defines the Pod a Test will create.

Security

Konfirm was designed for security from the very beginning so that it is safe to run in mission-critical production environments. Its security features include:

  • All mutating Pod operations are performed using impersonation that is managed separately from other Konfirm resources, allowing cluster admins to limit what tests can do either cluster-wide or per namespace.
  • By default, Helm releases are only observable to tests in the same namespace. A HelmPolicy must be defined to permit observability from other namespaces.
  • All Secret operations—which are required to observe Helm releases—are restricted to read-only operations on metadata.