This experimental project introduces a new Kubernetes controller, named `LinkerdConfig`, that knows how to reconcile Linkerd's configuration. The `LinkerdConfig` controller can be used to automatically:

- propagate new configuration to annotated proxies in the data plane
- revert any unsolicited manual changes made to the `linkerd-config` configmap
This project is tested with Linkerd edge-19.12.1 and edge-19.12.3, running on a Kind cluster with cert-manager installed.

In this scenario, we will auto-upgrade the proxy version of all opt-in pods in the data plane with the following steps:

- Install Linkerd edge-19.12.1
- Deploy the `LinkerdConfig` controller
- Install the `edge-19.12.3` `LinkerdConfig` custom resource
- Let the `LinkerdConfig` controller reconcile the `linkerd-config` configmap with the `edge-19.12.3` custom resource
- Let the `LinkerdConfig` controller restart all opt-in emojivoto pods
Set up a Kind cluster, named `linkerd`:

```
make kind-cluster
```

(The name of the cluster can be overridden using the `KIND_CLUSTER` variable.)
Install cert-manager to manage the CA bundle of the controller's webhooks:

```
make cert-manager
```

(This is optional if you already have your own cert-manager.)
Install Linkerd:

```
linkerd install | kubectl apply -f -
linkerd check

linkerd version
Client version: edge-19.12.1
Server version: edge-19.12.1
```
Label the `kube-system` namespace so that the `LinkerdConfig` controller will ignore all the system pods during reconciliation:

```
kubectl label ns kube-system config.linkerd.io/admission-webhooks=disabled
```
Use the following command to retrieve the mTLS trust anchor generated by Linkerd:

```
kubectl -n linkerd get cm linkerd-config -ojsonpath={.data.global} | jq -r .identityContext.trustAnchorsPem
```

Save the certificate in the `spec.global.identityContext.trustAnchorsPEM` field of the `config/samples/edge_19.12.3.yaml` file. (See Future Work for better ways to do this.)
Deploy the `linkerdconfigs` custom resource definition:

```
make install

kubectl get crd linkerdconfigs.config.linkerd.io
NAME                               CREATED AT
linkerdconfigs.config.linkerd.io   2020-01-02T03:48:03Z
```
Build and deploy the `LinkerdConfig` controller:

```
make controller

kubectl -n linkerd get po linkerd-config-controller-manager-5b54566647-cqz9h
NAME                                                 READY   STATUS    RESTARTS   AGE
linkerd-config-controller-manager-5b54566647-cqz9h   3/3     Running   0          91s
```
Install and inject the emojivoto application:

```
make emojivoto

# confirm the proxy version is at edge-19.12.1.
# this will be auto-upgraded later.
kubectl -n emojivoto get po -ojsonpath='{range .items[*]}{.spec.containers[1].image}{"\n"}{end}'
gcr.io/linkerd-io/proxy:edge-19.12.1
gcr.io/linkerd-io/proxy:edge-19.12.1
gcr.io/linkerd-io/proxy:edge-19.12.1
gcr.io/linkerd-io/proxy:edge-19.12.1
```
Note that the `Deployment`s' pod templates are labeled with the `config.linkerd.io/reconcile: auto` label:

```
kubectl -n emojivoto get po -ocustom-columns="reconciliation mode:.metadata.labels['config\.linkerd\.io\/reconcile']"
reconciliation mode
auto
auto
auto
auto
```
Install the `edge-19.12.3` custom resource:

```
kubectl apply -f config/samples/edge_19.12.3.yaml

kubectl get linkerdconfig edge-19.12.3
NAME           CONTROL PLANE NAMESPACE   CONFIGMAP
edge-19.12.3   linkerd                   linkerd-config
```
Notice that the `linkerd-config` configmap's data has been updated to match the defaults defined in the `edge-19.12.3` custom resource:

```
kubectl -n linkerd describe cm linkerd-config | less
...
Data
====
global:
----
{"linkerdNamespace":"linkerd","cniEnabled":false,"version":"edge-19.12.3",...
```
In addition, it also has an `ownerReference` pointing to the custom resource:

```
kubectl -n linkerd get cm linkerd-config -oyaml | less
...
  ownerReferences:
  - apiVersion: config.linkerd.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: LinkerdConfig
    name: edge-19.12.3
    uid: 89235148-79e0-4120-b10a-c682f3b96db9
```
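This is the standard controller-runtime ownership pattern. As a minimal sketch of how such a reconciler could claim the configmap (the helper name and import path are illustrative assumptions, not this project's actual code):

```go
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	// Illustrative import path for the generated LinkerdConfig API types.
	configv1alpha1 "example.com/linkerd-config-controller/api/v1alpha1"
)

// adoptConfigMap is a hypothetical helper that makes a LinkerdConfig custom
// resource the controlling owner of the linkerd-config configmap.
func adoptConfigMap(ctx context.Context, c client.Client, scheme *runtime.Scheme, owner *configv1alpha1.LinkerdConfig) error {
	var cm corev1.ConfigMap
	key := types.NamespacedName{Namespace: "linkerd", Name: "linkerd-config"}
	if err := c.Get(ctx, key, &cm); err != nil {
		return err
	}
	// SetControllerReference adds the ownerReference shown above, with
	// controller and blockOwnerDeletion both set to true.
	if err := controllerutil.SetControllerReference(owner, &cm, scheme); err != nil {
		return err
	}
	return c.Update(ctx, &cm)
}
```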
Take a look at your emojivoto application. All the pods should have auto-restarted:

```
# confirm that the proxy version is updated to edge-19.12.3
kubectl -n emojivoto get po -ojsonpath='{range .items[*]}{.spec.containers[1].image}{"\n"}{end}'
gcr.io/linkerd-io/proxy:edge-19.12.3
gcr.io/linkerd-io/proxy:edge-19.12.3
gcr.io/linkerd-io/proxy:edge-19.12.3
gcr.io/linkerd-io/proxy:edge-19.12.3
```
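One plausible way to implement such a restart (a sketch under assumptions, not necessarily what this controller does) is to delete the opted-in pods and let their `Deployment`s recreate them, at which point the proxy-injector injects the new proxy version from the reconciled `linkerd-config`:

```go
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// restartOptInPods is a hypothetical sketch: delete every pod carrying the
// config.linkerd.io/reconcile: auto label so that its owning ReplicaSet
// recreates it with a freshly injected proxy.
func restartOptInPods(ctx context.Context, c client.Client) error {
	var pods corev1.PodList
	if err := c.List(ctx, &pods,
		client.MatchingLabels{"config.linkerd.io/reconcile": "auto"},
	); err != nil {
		return err
	}
	for i := range pods.Items {
		if err := c.Delete(ctx, &pods.Items[i]); err != nil {
			return err
		}
	}
	return nil
}
```

The actual controller also skips namespaces labeled `config.linkerd.io/admission-webhooks: disabled` and restricts itself to `Running` pods, as described in the implementation highlights below.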
The following are some implementation highlights:

- The controller watches the `linkerd-config` configmap. It makes the `edge-19.12.3` custom resource the owner of this configmap
- When the `edge-19.12.3` custom resource is created or updated, the controller:
  - overrides the configmap data with the defaults defined in the `edge-19.12.3` custom resource
  - restarts all injected pods that are labeled with the `config.linkerd.io/reconcile: auto` label. Note that these pods must not reside in namespaces which have the `config.linkerd.io/admission-webhooks: disabled` label
- Any changes to the `linkerd-config` configmap will also trigger the same reconciliation process
- The controller sets up a `FieldIndexer` on the pods' `Phase` field so that the `Client` can quickly and efficiently query for `Running` pods (see the sketch after this list)
- A `Predicate` is used to respond to only "resource version changed" events of the custom resources
- Important events are published to the K8s event bus using the `Recorder`
- A mutating webhook is used to provide defaults to required fields in the custom resource
- A validating webhook is used to validate required user-managed fields (e.g., mTLS trust anchors)
- The controller and all its namespaced resources are installed in the `linkerd` namespace
- The controller is injected with the Linkerd proxy
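To make the `FieldIndexer` and `Predicate` points concrete, here is a minimal, hypothetical sketch in the kubebuilder/controller-runtime style, using a recent controller-runtime API (signatures have changed across versions); the reconciler type name, import path, and wiring are assumptions rather than this project's actual code:

```go
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/predicate"

	// Illustrative import path for the generated LinkerdConfig API types.
	configv1alpha1 "example.com/linkerd-config-controller/api/v1alpha1"
)

type LinkerdConfigReconciler struct {
	client.Client
}

func (r *LinkerdConfigReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Reconciliation logic elided: override the configmap data and restart
	// the opted-in pods, as described in the highlights above.
	return ctrl.Result{}, nil
}

func (r *LinkerdConfigReconciler) SetupWithManager(mgr ctrl.Manager) error {
	// Index pods by status.phase so the cached client can answer
	// "list all Running pods" without scanning every pod.
	if err := mgr.GetFieldIndexer().IndexField(context.Background(), &corev1.Pod{}, "status.phase",
		func(obj client.Object) []string {
			return []string{string(obj.(*corev1.Pod).Status.Phase)}
		},
	); err != nil {
		return err
	}

	return ctrl.NewControllerManagedBy(mgr).
		For(&configv1alpha1.LinkerdConfig{}).
		// The linkerd-config configmap carries an ownerReference to the
		// custom resource, so its changes trigger the same reconciliation.
		Owns(&corev1.ConfigMap{}).
		// React only to events where the resource version actually changed.
		WithEventFilter(predicate.ResourceVersionChangedPredicate{}).
		Complete(r)
}

// listRunningOptInPods combines the field index with a label selector to
// retrieve only Running pods that opted in to automatic reconciliation.
func (r *LinkerdConfigReconciler) listRunningOptInPods(ctx context.Context) (*corev1.PodList, error) {
	var pods corev1.PodList
	err := r.List(ctx, &pods,
		client.MatchingFields{"status.phase": string(corev1.PodRunning)},
		client.MatchingLabels{"config.linkerd.io/reconcile": "auto"},
	)
	return &pods, err
}
```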
To deploy the CRDs to K8s:

```
make install
```

To run the controller in standalone mode, outside of K8s:

```
make run ENABLE_WEBHOOKS=false
```

To build the controller Docker image, and load it into a Kind cluster:

```
make controller
```

To remove the controller's `Deployment` and other resources (e.g., RBAC):

```
make clean
```
The following is a list of future work:

- Since the `linkerd-config` configmap has an `ownerReference` pointing to the `edge-19.12.3` custom resource, the deletion of the custom resource will trigger a cascading delete of the configmap. The configmap's lifecycle should be managed independently to avoid breaking the control plane. Alternatively, we can use the custom resource to manage the `linkerd-config` configmap, implying that the configmap should be removed from the Linkerd installation process
- Currently, the controller relies on `time.Sleep()` to delay the restarting of the pods, so that the proxy-injector has time to pick up the updates to the `linkerd-config` configmap. A better way to handle this is to update the proxy-injector to watch the configmap for changes
- Instead of manually retrieving the mTLS trust anchor from an existing configmap so that it can be reused in the custom resource, the controller should automatically check for and reuse any existing mTLS trust anchor. Alternatively, we can let the custom resource generate and fully manage the `linkerd-config` configmap
- The desired behaviour of reconciling the configuration with multiple `linkerdconfigs` custom resources is yet to be determined. We can either use the most recent custom resource, or one that is labeled as `active`, or perform some form of merge among all the resources
- Reuse the data structures defined in the Linkerd `config.proto` file in the custom resource definition. Currently, the controller has to convert the custom resource definition into the protobuf format that Linkerd can consume