Continuous Deployment Target - Operator

Automate the configuration and lifecycle of Azure self-hosted pipeline agents and enable self-service for adding egress targets, without delegating full network policy permissions to the namespace administrator. Event-driven autoscaling is enabled automatically through KEDA and its Azure Pipelines integration.

Operator Design

Describing the problem

For us as namespace administrators (cluster users), CRUD operations on network policy objects are not authorized by security design; only the cluster administrators can change them. To enable end-to-end automation, we need the ability to add target IPs ourselves, through a Custom Resource, to a specified set of allowed egress ports. The ports are specified by the cluster administrators in a centralized configuration. An Operator should automatically create or update a network policy containing the IPs defined in the Custom Resource. The Operator should also configure and manage the lifecycle of the self-hosted pipeline agents, be able to inject proxy configurations and CA certificates through Kubernetes secrets, and simplify the enablement of event-driven autoscaling.
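The core idea, combining user-supplied target IPs with administrator-controlled ports into a single egress rule, can be sketched as follows. This is only an illustration under stated assumptions: `Port` and `EgressRule` are simplified local stand-ins for the Kubernetes network policy types, and turning a bare IP into a single-host CIDR (/32 or /128) is an assumption about how the Operator might map IPs, not a description of its actual code.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Port and EgressRule are simplified, hypothetical stand-ins for the
// Kubernetes NetworkPolicyPort and NetworkPolicyEgressRule types.
type Port struct {
	Protocol string
	Number   int
}

type EgressRule struct {
	CIDRs []string // one ipBlock.cidr per target
	Ports []Port   // ports allowed by the cluster administrators
}

// toCIDR turns a bare IP into a single-host CIDR. Values that are already
// in CIDR notation are normalized to their network address.
func toCIDR(s string) (string, error) {
	if p, err := netip.ParsePrefix(s); err == nil {
		return p.Masked().String(), nil
	}
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", fmt.Errorf("invalid target %q: %w", s, err)
	}
	return netip.PrefixFrom(addr, addr.BitLen()).String(), nil
}

// buildEgressRule combines the IPs from the custom resource with the
// centrally configured ports. Any invalid IP rejects the whole rule.
func buildEgressRule(ips []string, allowed []Port) (EgressRule, error) {
	rule := EgressRule{Ports: allowed}
	for _, ip := range ips {
		cidr, err := toCIDR(ip)
		if err != nil {
			return EgressRule{}, err
		}
		rule.CIDRs = append(rule.CIDRs, cidr)
	}
	return rule, nil
}

func main() {
	rule, err := buildEgressRule(
		[]string{"192.0.2.10", "2001:db8::1"},
		[]Port{{"TCP", 443}},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(rule.CIDRs, rule.Ports)
}
```

Keeping the port list as a separate input mirrors the split of responsibilities described above: namespace administrators only influence the IP list, never the ports.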

Designing the API and a CRD

K8s Network Policy NetworkPolicyPeer API spec

NetworkPolicyPeer describes a peer to allow traffic to/from. Only certain combinations of fields are allowed.

   .  egress.to.ipBlock (IPBlock)

    IPBlock defines policy on a particular IPBlock. If this field is set then neither of the other fields can be.

    IPBlock describes a particular CIDR (Ex. "","2001:db9::/64") that is allowed to the pods matched by a NetworkPolicySpec's podSelector. The except entry describes CIDRs that should not be included within this rule.

        . egress.to.ipBlock.cidr (string), required

        CIDR is a string representing the IP Block. Valid examples are "" or "2001:db9::/64".

        . egress.to.ipBlock.except ([]string)

        Except is a slice of CIDRs that should not be included within an IP Block. Valid examples are "" or "2001:db9::/64". Except values will be rejected if they are outside the CIDR range.

IPBlock type

type IPBlock struct {
	// CIDR is a string representing the IP Block
	// Valid examples are "" or "2001:db9::/64"
	CIDR string `json:"cidr" protobuf:"bytes,1,name=cidr"`
	// Except is a slice of CIDRs that should not be included within an IP Block
	// Valid examples are "" or "2001:db9::/64"
	// Except values will be rejected if they are outside the CIDR range
	// +optional
	Except []string `json:"except,omitempty" protobuf:"bytes,2,rep,name=except"`
}
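The rule that except values must fall inside the CIDR range can be checked with the standard library alone. The sketch below is a rough approximation rather than the Kubernetes API server's own validation code; `validateIPBlock` is a hypothetical helper, and requiring each except entry to be strictly narrower than the CIDR is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"net/netip"
)

// validateIPBlock checks that cidr parses as a CIDR and that every entry in
// except is a strictly narrower range that lies inside it.
func validateIPBlock(cidr string, except []string) error {
	outer, err := netip.ParsePrefix(cidr)
	if err != nil {
		return fmt.Errorf("invalid cidr %q: %w", cidr, err)
	}
	outer = outer.Masked()
	for _, e := range except {
		inner, err := netip.ParsePrefix(e)
		if err != nil {
			return fmt.Errorf("invalid except %q: %w", e, err)
		}
		// inner lies inside outer when it has a longer prefix and its
		// network address is contained in outer. Contains also returns
		// false when the address families differ.
		if inner.Bits() <= outer.Bits() || !outer.Contains(inner.Masked().Addr()) {
			return fmt.Errorf("except %q is outside cidr %q", e, cidr)
		}
	}
	return nil
}

func main() {
	// Addresses below come from the documentation ranges and are examples only.
	fmt.Println(validateIPBlock("192.0.2.0/24", []string{"192.0.2.128/25"}))
	fmt.Println(validateIPBlock("192.0.2.0/24", []string{"198.51.100.0/25"}))
}
```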

CDTarget types

// CDTargetSpec defines the desired state of CDTarget
type CDTargetSpec struct {
	// IP is a slice of string that contains all the CDTarget IPs
	IP []string `json:"ip,omitempty"`
	// specify the pod selector key value pair
	AdditionalSelector map[string]string `json:"additionalSelector"`
	// pipeline agent image
	AgentImage string `json:"agentImage,omitempty"`
	// +optional
	AgentResources corev1.ResourceRequirements `json:"agentResources,omitempty"`
	// image pull secrets
	ImagePullSecrets []corev1.LocalObjectReference `json:"imagePullSecrets,omitempty"`
	// +optional
	MinReplicaCount *int32 `json:"minReplicaCount,omitempty"`
	// +optional
	MaxReplicaCount *int32 `json:"maxReplicaCount,omitempty"`
	// Inject additional environment variables to the deployment
	Env []corev1.EnvVar `json:"env,omitempty"`
	// reference to secret that contains the proxy settings
	ProxyRef string `json:"proxyRef,omitempty"`
	// reference to secret that contains the PAT
	TokenRef string `json:"tokenRef"`
	// reference to secret that contains the CA certificates
	CACertRef string `json:"caCertRef,omitempty"`
	// Config configures the Azure DevOps pool settings of the agent
	// by using additional environment variables.
	Config AgentConfig `json:"config,omitempty"`
	// set to add or override the default metadata for the
	// scaled object trigger metadata
	TriggerMeta map[string]string `json:"triggerMeta,omitempty"`
}

// CDTargetStatus defines the observed state of CDTarget
type CDTargetStatus struct {
	// Conditions lists the most recent status condition updates
	Conditions []metav1.Condition `json:"conditions"`
}

// control the pool and agent work directory
type AgentConfig struct {
	URL       string `json:"url"`
	PoolName  string `json:"poolName"`
	AgentName string `json:"agentName,omitempty"`
	WorkDir   string `json:"workDir,omitempty"`
	// Allow specifying MTU value for networks used by container jobs
	// useful for docker-in-docker scenarios in k8s cluster
	MTUValue string `json:"mtuValue,omitempty"`
}
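The TriggerMeta field is described as adding to or overriding the default trigger metadata of the scaled object. A minimal sketch of that merge is shown below. `mergeTriggerMeta` is a hypothetical helper, and the default keys and values in `main` (`poolName`, `organizationURLFromEnv`, `personalAccessTokenFromEnv`, `AZP_URL`) are assumptions used for illustration; the defaults the Operator actually sets may differ.

```go
package main

import "fmt"

// mergeTriggerMeta returns a new map that holds the defaults overlaid with
// the user-supplied overrides. Neither input map is modified.
func mergeTriggerMeta(defaults, overrides map[string]string) map[string]string {
	merged := make(map[string]string, len(defaults)+len(overrides))
	for k, v := range defaults {
		merged[k] = v
	}
	for k, v := range overrides {
		merged[k] = v // an override wins over a default with the same key
	}
	return merged
}

func main() {
	// Assumed defaults, for illustration only.
	defaults := map[string]string{
		"poolName":                   "pool-name",
		"organizationURLFromEnv":     "AZP_URL",
		"personalAccessTokenFromEnv": "AZP_TOKEN",
	}
	// Value taken from the triggerMeta field of a custom resource.
	overrides := map[string]string{"demands": "maven,docker"}
	fmt.Println(mergeTriggerMeta(defaults, overrides))
}
```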

Custom Resource schema

apiVersion: cnad.gofound.nl/v1alpha1
kind: CDTarget
metadata:
  name: <<cdtarget-sample>>
  namespace: <<test>>
spec:
  agentImage: ghcr.io/bartvanbenthem/azagent-keda-22:latest
  imagePullSecrets:
  - name: <<cdtarget-regcred>>
  minReplicaCount: 1
  maxReplicaCount: 3
  agentResources:
    requests:
      cpu: 100m
    limits:
      cpu: 200m
  config:
    url: <<https://dev.azure.com/ORGANIZATION>>
    poolName: <<pool-name>>
  tokenRef: <<cdtarget-token>>
  proxyRef: <<cdtarget-proxy>>
  caCertRef: <<cdtarget-ca>>
  dnsPolicy: <<None>>
    - <<>>
  triggerMeta:
    demands: "maven,docker"
  env:
  - name: <<>>
    value: example-env
  additionalSelector:
    app: cdtarget-agent
  ip:
  - <<>>
  - <<>>

Required Resources & Permissions

What other resources are required:


Target reconciliation loop design

func Reconcile:
    // Get the Operator's CRD, if it doesn't exist then return
    // an error so the user knows to create it:
    operatorCrd, error = getMyCRD()
    if error != nil {
        return error
    }
    // Get the related resources for the Operator (networkpolicy)
    // If they don't exist, create them:
    resources, error = getRelatedResources()
    if error == ResourcesNotFound {
        resources = createRelatedResources(operatorCrd.Spec)
    }
    // Check that the related resources relevant values match
    // what is set in the Operator's CRD. If they don't match,
    // update the resource with the specified values:
    if resources.Spec != operatorCrd.Spec {
        updateRelatedResources(resources, operatorCrd.Spec)
    }
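The reconciliation loop outlined above can be exercised without a cluster. The following sketch replaces the Kubernetes API with in-memory maps, so every name in it (`cluster`, `Spec`, `reconcile`, `errNotFound`) is hypothetical; the real controller would use the controller-runtime client instead. It is meant to show the get, create-if-missing, update-if-different flow and its idempotence, not the Operator's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

var errNotFound = errors.New("not found")

// Spec stands in for the values the Operator copies from the custom
// resource into the related NetworkPolicy.
type Spec struct {
	IPs []string
}

// cluster is an in-memory stand-in for the Kubernetes API.
type cluster struct {
	customResources map[string]Spec
	policies        map[string]Spec
}

// reconcile drives the policy for name toward the state in the custom
// resource and reports the action taken: "created", "updated" or "unchanged".
func reconcile(c *cluster, name string) (string, error) {
	// Get the custom resource; return an error if it does not exist.
	desired, ok := c.customResources[name]
	if !ok {
		return "", fmt.Errorf("custom resource %q: %w", name, errNotFound)
	}
	// Create the related resource when it is missing.
	current, ok := c.policies[name]
	if !ok {
		c.policies[name] = desired
		return "created", nil
	}
	// Update the related resource when it has drifted from the desired state.
	if !reflect.DeepEqual(current, desired) {
		c.policies[name] = desired
		return "updated", nil
	}
	return "unchanged", nil
}

func main() {
	c := &cluster{
		customResources: map[string]Spec{"cdtarget-agent": {IPs: []string{"192.0.2.10"}}},
		policies:        map[string]Spec{},
	}
	for i := 0; i < 2; i++ {
		action, err := reconcile(c, "cdtarget-agent")
		fmt.Println(action, err)
	}
}
```

Running the loop twice with an unchanged custom resource should leave the policy untouched on the second pass, which is the property that makes repeated reconciliation safe.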

Handling upgrades and downgrades


Failure reporting

Logs, Events + status updates

  - lastTransitionTime: 2022-01-01T00:00:00Z
    message: reconciling message
    reason: event
    status: "False"/"True"
    type: ReconcileSuccess
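A status condition like the one above is usually upserted by type, with lastTransitionTime changing only when the status value flips. The sketch below illustrates that convention with a local `Condition` type standing in for `metav1.Condition`; `setCondition` is a hypothetical helper written for this example, similar in spirit to the helper the apimachinery library offers for this purpose, and is not taken from the Operator's code.

```go
package main

import (
	"fmt"
	"time"
)

// Condition mirrors the fields shown in the status example. It is a
// simplified stand-in for metav1.Condition.
type Condition struct {
	Type               string
	Status             string // "True" or "False"
	Reason             string
	Message            string
	LastTransitionTime time.Time
}

// setCondition inserts c or updates the existing condition of the same type.
// LastTransitionTime only moves when Status actually changes.
func setCondition(conds []Condition, c Condition, now time.Time) []Condition {
	for i := range conds {
		if conds[i].Type != c.Type {
			continue
		}
		if conds[i].Status != c.Status {
			conds[i].Status = c.Status
			conds[i].LastTransitionTime = now
		}
		conds[i].Reason = c.Reason
		conds[i].Message = c.Message
		return conds
	}
	c.LastTransitionTime = now
	return append(conds, c)
}

func main() {
	start := time.Date(2022, 1, 1, 0, 0, 0, 0, time.UTC)
	var conds []Condition
	conds = setCondition(conds, Condition{
		Type: "ReconcileSuccess", Status: "True",
		Reason: "event", Message: "reconciling message",
	}, start)
	// A later failure flips the status and therefore moves the timestamp.
	conds = setCondition(conds, Condition{
		Type: "ReconcileSuccess", Status: "False",
		Reason: "event", Message: "reconcile failed",
	}, start.Add(time.Hour))
	fmt.Printf("%+v\n", conds)
}
```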


Install KEDA

# Deploying using the deployment YAML files
kubectl apply --server-side -f \

Scaffolding parameters

operator-sdk init --domain gofound.nl --repo github.com/bartvanbenthem/cdtarget-operator
operator-sdk create api --group cnad --version v1alpha1 --kind CDTarget --resource --controller
# always run make after changing *_types.go and *_controller.go
go mod tidy
make generate
make manifests

Build Operator image

# docker and github repo username
export USERNAME='bartvanbenthem'
# image and bundle version
export VERSION=1.8.0
# operator repo and name
export OPERATOR_NAME='cdtarget-operator'

source ../00-ENV/env.sh # personal setup to inject PAT
# login to ghcr.io registry
echo "$CR_PAT" | docker login ghcr.io -u "$USERNAME" --password-stdin
# Build the operator image
make docker-build docker-push IMG=ghcr.io/$USERNAME/$OPERATOR_NAME:v$VERSION

Manual Operator Deployment

# test and deploy the operator

Test custom resource

# test cdtarget CR 
kubectl create ns test
# prestage the PAT (token) Secret for successful Azure authentication
kubectl -n test create secret generic cdtarget-token --from-literal=AZP_TOKEN=$PAT
# apply cdtarget resource
# for scaling to more than 1 replica, don't set the agentName field in the CR
kubectl -n test apply -f config/samples/cnad_cdtarget_sample.yaml
kubectl -n test describe cdtarget cdtarget-agent
# test CDTarget created objects
kubectl -n test describe secret cdtarget-token
kubectl -n test get configmaps
kubectl -n test get networkpolicies
kubectl -n test describe networkpolicies cdtarget-agent
kubectl -n test describe scaledobject cdtarget-agent-keda
kubectl -n test describe deployment cdtarget-agent

Create pull secret

# create regcred secret
kubectl -n test create secret docker-registry cdtarget-regcred \
          --docker-server='https://ghcr.io' \
          --docker-username='bartvanbenthem' \
          --docker-password=$CR_PAT \

Create & Update Proxy config

# update secret containing proxy settings
kubectl -n test create secret generic cdtarget-proxy --dry-run=client -o yaml \
                  --from-literal=PROXY_USER='' \
                  --from-literal=PROXY_PW='' \
                  --from-literal=PROXY_URL='' \
                  --from-literal=HTTP_PROXY='' \
                  --from-literal=HTTPS_PROXY='' \
                  --from-literal=FTP_PROXY='' \
                  --from-literal=NO_PROXY='' | kubectl apply -f -
kubectl -n test scale deployment cdtarget-agent-keda --replicas=0  

Update allowed ports

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: cdtarget-ports
  namespace: cdtarget-operator
data:
  ports: |

kubectl -n test delete networkpolicies.networking.k8s.io cdtarget-agent-keda

Update Personal Access Token

# update CDTarget PAT
kubectl -n test create secret generic cdtarget-token --dry-run=client -o yaml \
                  --from-literal=AZP_TOKEN=$PAT | kubectl apply -f -
kubectl -n test scale deployment cdtarget-agent-keda --replicas=0  

Inject CA Certificates from file

  • Best practice is to prestage the CA certificate as a Kubernetes secret
  • The custom resource then references the prestaged secret
# inject CA Certificates to CDTarget agents
# in /usr/local/share/ca-certificates/
# trust store: /etc/ssl/certs/ca-certificates.crt
kubectl -n test create secret generic cdtarget-ca --dry-run=client -o yaml \
                --from-file="config/samples/CERTIFICATE.crt" | kubectl apply -f -
kubectl -n test scale deployment cdtarget-agent-keda --replicas=0  

Manually Remove Operator, CRD and CR

# cleanup test deployment
kubectl -n test delete -f config/samples/cnad_cdtarget_sample.yaml
kubectl delete ns test
# cleanup operator deployment
make undeploy

Operator lifecycle manager

(instead of manual deployment)

Operator lifecycle manager Installation

# install OLM (if not already present)
operator-sdk olm install
operator-sdk olm status

Operator lifecycle manager Deployment

# Build the OLM bundle
make bundle IMG=ghcr.io/$USERNAME/$OPERATOR_NAME:v$VERSION   
make bundle-build bundle-push BUNDLE_IMG=ghcr.io/$USERNAME/$OPERATOR_NAME-bundle:v$VERSION
# Deploy OLM bundle
kubectl create ns 'cdtarget-operator'
operator-sdk run bundle ghcr.io/$USERNAME/$OPERATOR_NAME-bundle:v$VERSION --namespace='cdtarget-operator'

Remove CR, CRD & Operator Bundle

# cleanup test deployment
kubectl -n test delete -f config/samples/cnad_cdtarget_sample.yaml
kubectl delete ns test
# cleanup OLM bundle & OLM installation
operator-sdk cleanup operator --delete-all --namespace='cdtarget-operator'
kubectl delete ns 'cdtarget-operator'

Uninstall Operator Lifecycle Manager

# uninstall OLM
operator-sdk olm uninstall