CloudKnit is an open-source progressive delivery platform
for managing cloud environments. It enables organizations to Define entire environments in a declarative way, Provision them, Detect and Reconcile Drift, and Teardown environments when no longer needed. It also comes with dashboards to help visualize environments and observe them.
CloudKnit is based on a concept called Environment as Code. Some people have started calling it Declarative Pipelines.
Note: We are not a big fan of using Pipeline and Declarative together as Pipeline to us means a sequence of steps which conflicts with what Declarative means.
Environment as Code (EaC) is an abstraction over cloud-native tools that provides a declarative way of defining an entire environments. It has a Control Plane that manages the state of the environment, including resource dependencies, and drift detection and reconciliation.
Diagram 1: Where does CloudKnit fit in with existing toolsThere are tools today that allow us to manage cloud environments but as the environments become more complex and teams look for advanced use cases, existing tools can fall short. This causes some teams to build and maintain in-house solutions.
We want to make it easy for DevOps and Platform Engineering teams to manage complex environments. Existing cloud-native tools like Terraform, Pulumi, Helm, ArgoCD, etc. on their own can be great at managing individual components within an environment or automating simple environments. However an environment like the one below (Diagram 2) with Infrastructure resources and cloud-native applications requires a different solution. Currently you have two main approaches:
Monolith IaC & Application deployments work well when environments are simple, but as things get complex, it becomes a nightmare to maintain. It creates tight coupling and causes issues like slow provisioning.
Creating loosely-coupled components like networking, eks, rds, backend-apps, frontend-apps, etc. makes the individual components easier to manage, but the pipelines that run those components become complex. Pipeline code needs to manage the logic to run the various components in the correct order, handle failures and tear down unused resources.
Pipeline code is imperative, and users have to write logic on “HOW” to get an entire environment. This causes a maintenance nightmare. We have seen teams write hundreds of lines of unmaintainable pipeline code.
Diagram 2: Example EnvironmentThere are other challenges that teams face as their environments become more complex.
- Environment Replication is a pain
- Not easy to Visualize/Understand Environments
- Drift Detection for the entire environment is difficult
- Not straightforward to Promote changes across environments
Environment management with CloudKnit is divided into 4 stages:
This stage allows you to define an entire environment. We currently support easy to use YAML format for the environment definition.
See example below:
Environment Definition
apiVersion: stable.cloudknit.io/v1
kind: Environment
metadata:
name: zmart-payment-prod-blue
namespace: zmart-config
spec:
teamName: payment
envName: prod-blue
components:
- name: networking
type: terraform
autoApprove: true
module:
source: git@github.com:terraform-aws-modules/terraform-aws-vpc.git
variablesFile:
path: "prod-blue/vars/networking.tfvars"
outputs:
- name: vpc_id
- name: platform-eks
type: terraform
dependsOn: [networking]
module:
source: git@github.com:terraform-aws-modules/terraform-aws-eks.git
variables:
- name: vpc_id
valueFrom: networking.vpc_id
variablesFile:
path: "prod-blue/vars/platform-eks.tfvars"
- name: website
type: helm #Native support for helm chart coming soon
dependsOn: [platform-eks]
source:
repo: git@github.com:helm/examples.git
path: charts/hello-world
variables:
- name: environment
value: prod-blue
CloudKnit Control Plane running in Kubernetes uses the Environment definition & runs various Components (Terraform, Helm Charts, etc.) in the right order. It also provides Visibility & Workflow while Provisioning the environment.
Like Kubernetes does drift detection for k8s apps & reconciles them to match the desired state in source control, CloudKnit does drift detection for the entire environment (infra + apps) & reconciles them.
Note: In case of Infrastructure as Code CloudKnit provides an ability to see the plan & get manual approval before running the IaC to make sure it doesn't destroy any resources you don't want to, especially in Production environments.
You might want to teardown environments when they are not used to save costs. CloudKnit provides a single line change using flag teardown
in the Environment YAML. Once teardown
flag is set to true and definition is pushed to Source Control, CloudKnit picks up the change and tears the environment down by destroying individual components in the correct order.
CloudKnit also provides visibility into your environments and an optimal GitOps workflow with useful information on the UI like estimated costs/status etc. Check diagram 4 below for an example environment in CloudKnit UI.
Diagram 4: Environment VisibilityWe hope that by open-sourcing CloudKnit early, we can form a close-knit open-source community around it to make managing cloud environments easy.
For a deeper dive into CloudKnit, see the architectural overview, our documentation, and the GitHub repo.
Components: A logical grouping of 1 or more Infrastructure Resources or Applications that get provisioned together. For example, Networking is an Infrastructure Component with various Infrastructure resources like Virtual Private Cloud(VPC), Subnets, Internet Gateways, Route Tables, etc.
Environment: A logical grouping of all the Components needed to run business applications. The grouping includes components like networking, eks, database, k8s apps, etc.