This Kubernetes controller watches ConfigMaps and Secrets referenced by Deployments and StatefulSets and triggers restarts as soon as configuration or secret values change.
The k8s-manifests folder contains the necessary configuration files. Adjust them to your taste and apply in the given order:
kubectl apply -f k8s-manifests/rbac.yaml -f k8s-manifests/deployment.yaml
Optionally, create the metrics Service:
kubectl apply -f k8s-manifests/metrics-service.yaml
Automatic restart functionality is enabled on a per-Deployment (or per-StatefulSet) basis.
The only thing you need to do is set the com.xing.deployment-restart annotation on the desired Deployment (or StatefulSet) to enabled:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  annotations:
    com.xing.deployment-restart: enabled
# the rest of the deployment manifest
The controller monitors deployment manifests and automatically starts or stops watching the relevant ConfigMaps and Secrets. It also stops restarting a deployment as soon as the annotation is removed or changed to anything other than enabled.
Kubernetes exhibits several constraints that shaped the implementation of the controller a great deal.
Even though Kubernetes assigns a version to every resource deployed to the cluster, it is impossible to figure out which version of a particular ConfigMap or Secret was used when a Pod was started. Because of that, the controller maintains its own dataset of configuration object versions applied to Deployments and StatefulSets. Kubernetes resource versions are incremented when any part of the resource definition changes, not just the configuration data. To avoid unnecessary restarts, the controller calculates a checksum of the configuration data for every ConfigMap and Secret and uses it instead of the resource version to detect changes.
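The idea of a data-only checksum can be sketched as follows. This is a minimal illustration, not the controller's actual code: the function name `dataChecksum`, the choice of FNV-1a (64 bit) and the NUL separators are assumptions, and Secret data is simplified to strings.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// dataChecksum (hypothetical name) hashes only the key/value data of a
// ConfigMap or Secret. Labels, annotations and other metadata are ignored,
// so changing them does not change the checksum.
func dataChecksum(data map[string]string) string {
	// Sort the keys so that Go's random map iteration order
	// does not influence the result.
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a() // assumption: any stable 64-bit hash would do
	for _, k := range keys {
		// NUL separators keep {"ab": "c"} distinct from {"a": "bc"}.
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(data[k]))
		h.Write([]byte{0})
	}
	return fmt.Sprintf("%016x", h.Sum64())
}

func main() {
	a := map[string]string{"log_level": "info", "timeout": "30s"}
	b := map[string]string{"timeout": "30s", "log_level": "info"}

	// Same data in a different insertion order yields the same checksum.
	fmt.Println(dataChecksum(a) == dataChecksum(b)) // prints true
}
```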
Another constraint is related to resource discovery. Kubernetes Informers are asynchronous and do not guarantee that the client gets the most recent cluster changes immediately. The order in which the client gets the resource definitions is also not fixed. This makes it necessary for the controller to always operate on a potentially incomplete view of the cluster state. For example, the controller can become aware of a new Deployment resource recently created in the cluster, but not yet have any information on a ConfigMap referenced in the Deployment manifest.
Every deployment configured for automatic restarts eventually gets updated with a checksums annotation. The annotation contains names and checksums of all the configs referenced by the deployment. The controller maintains this information and uses it to detect config changes.
The annotation itself is a JSON object:
metadata:
  annotations:
    com.xing.deployment-restart.applied-config-checksums: '{"configmap/namespace-one/config-one":"189832cc316e7594","secret/namespace-one/secret-two":"6e79832c18c31594"}'
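Since the annotation value is a flat JSON object of strings, it can be decoded into a plain map. The sketch below assumes a hypothetical helper `parseChecksums`; only the annotation name and the sample value come from this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const checksumsAnnotation = "com.xing.deployment-restart.applied-config-checksums"

// parseChecksums (hypothetical helper) decodes the checksums annotation into
// a map of "kind/namespace/name" to checksum. A missing or empty annotation
// yields an empty map, which matches a deployment that has not been
// annotated yet.
func parseChecksums(annotations map[string]string) (map[string]string, error) {
	checksums := map[string]string{}
	raw, ok := annotations[checksumsAnnotation]
	if !ok || raw == "" {
		return checksums, nil
	}
	if err := json.Unmarshal([]byte(raw), &checksums); err != nil {
		return nil, fmt.Errorf("parsing %s: %w", checksumsAnnotation, err)
	}
	return checksums, nil
}

func main() {
	annotations := map[string]string{
		checksumsAnnotation: `{"configmap/namespace-one/config-one":"189832cc316e7594","secret/namespace-one/secret-two":"6e79832c18c31594"}`,
	}
	checksums, err := parseChecksums(annotations)
	if err != nil {
		panic(err)
	}
	fmt.Println(checksums["configmap/namespace-one/config-one"]) // prints 189832cc316e7594
}
```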
The controller watches all ConfigMap, Secret, Deployment and StatefulSet resources in the cluster and builds a catalog of deployments and related configs in memory as resources become known to it.
Catalog entries for config resources can represent either actual resources in the cluster and have checksums associated with them, or they can be dummies merely stating the fact that some of the known deployment resources reference a particular config name.
Deployment entries in the catalog always represent real deployment objects in the cluster.
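One way to picture the catalog is a pair of maps in which a config entry can exist before the actual resource has been observed. The following is only a sketch under that assumption; all type, field and method names are hypothetical.

```go
package main

import "fmt"

// configEntry is either a real config (known == true, checksum set) or a
// dummy that only records that some deployment references this name.
type configEntry struct {
	checksum     string
	known        bool
	referencedBy map[string]bool // deployment names
}

type catalog struct {
	configs     map[string]*configEntry
	deployments map[string][]string // deployment name -> referenced configs
}

func newCatalog() *catalog {
	return &catalog{
		configs:     map[string]*configEntry{},
		deployments: map[string][]string{},
	}
}

// config returns the entry for name, creating a dummy if none exists yet.
func (c *catalog) config(name string) *configEntry {
	e, ok := c.configs[name]
	if !ok {
		e = &configEntry{referencedBy: map[string]bool{}}
		c.configs[name] = e
	}
	return e
}

// setDeployment records a deployment and the configs it references,
// dropping references it no longer has.
func (c *catalog) setDeployment(name string, refs []string) {
	for _, old := range c.deployments[name] {
		delete(c.config(old).referencedBy, name)
	}
	c.deployments[name] = refs
	for _, ref := range refs {
		c.config(ref).referencedBy[name] = true
	}
}

// setConfig turns a dummy into a real entry or updates the checksum.
func (c *catalog) setConfig(name, checksum string) {
	e := c.config(name)
	e.checksum = checksum
	e.known = true
}

func main() {
	c := newCatalog()
	// The deployment becomes known before the ConfigMap it references.
	c.setDeployment("deployment/ns/web", []string{"configmap/ns/app"})
	fmt.Println(c.configs["configmap/ns/app"].known) // prints false

	c.setConfig("configmap/ns/app", "189832cc316e7594")
	e := c.configs["configmap/ns/app"]
	fmt.Println(e.known, len(e.referencedBy)) // prints true 1
}
```

Keeping dummies in the same map as real entries means the order in which resources arrive does not matter: whichever side shows up first creates the entry, and the other side fills it in.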
When it receives information about another config or deployment, the controller can instantiate a change object. This happens in several situations:
- New config is added to the catalog.
- New deployment is added to the catalog.
- Known deployment references a config that it was not referencing before.
- Known deployment does not reference a config that it was referencing before.
- Known deployment has its checksums annotation changed.
- Known config has its checksum changed.
Every change is identified by the name of the resource it was initiated by and has a timestamp and a counter associated with it.
Instantiated changes are held in the queue, with no action taken on them, for the amount of time determined by the RESTART_GRACE_PERIOD setting. If the resource is changed again during the grace period, the change counter gets incremented.
After the grace period is exhausted, the change gets processed:
- Using the catalog, the controller discovers deployments that are potentially affected by the change. A deployment change can only affect the deployment itself, while a config change affects all the deployments that reference the config.
- Every potentially affected deployment has its checksums annotation compared to the current state of the catalog. Based on the comparison, the controller decides whether the annotation should be updated and whether the deployment needs to be restarted.
- If an annotation update or a restart is necessary, the controller patches the deployment. It always issues a single patch request that updates the annotation and, if necessary, restarts the deployment at the same time. A restart is triggered by setting the com.xing.deployment-restart.timestamp annotation in spec.template.metadata.annotations of the deployment.
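The single patch described in the last step could look roughly like the body built below. This is a sketch only: the helper `buildPatch`, the merge-style patch layout and the RFC 3339 timestamp format are assumptions; the two annotation names are taken from this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

const (
	checksumsAnnotation = "com.xing.deployment-restart.applied-config-checksums"
	timestampAnnotation = "com.xing.deployment-restart.timestamp"
)

type metadata struct {
	Annotations map[string]string `json:"annotations"`
}

type podTemplate struct {
	Metadata metadata `json:"metadata"`
}

type spec struct {
	Template podTemplate `json:"template"`
}

type patch struct {
	Metadata metadata `json:"metadata"`
	Spec     *spec    `json:"spec,omitempty"` // only present when restarting
}

// buildPatch (hypothetical helper) always updates the checksums annotation.
// When restartTimestamp is non-empty it also sets the timestamp annotation
// on the pod template, which changes the template and so rolls the pods.
func buildPatch(checksums map[string]string, restartTimestamp string) ([]byte, error) {
	encoded, err := json.Marshal(checksums)
	if err != nil {
		return nil, err
	}
	p := patch{Metadata: metadata{Annotations: map[string]string{
		checksumsAnnotation: string(encoded),
	}}}
	if restartTimestamp != "" {
		p.Spec = &spec{Template: podTemplate{Metadata: metadata{
			Annotations: map[string]string{timestampAnnotation: restartTimestamp},
		}}}
	}
	return json.Marshal(p)
}

func main() {
	checksums := map[string]string{"configmap/namespace-one/config-one": "189832cc316e7594"}
	// Assumption: the timestamp format is not specified in this document.
	body, err := buildPatch(checksums, time.Now().UTC().Format(time.RFC3339))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```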
There are two situations when a deployment restart is triggered by a config update:
- The deployment already has a checksum of the updated config in the checksums annotation, and the new checksum of the config is different. This is the "normal" situation, in which a config that the deployment has been referencing for some time gets updated.
- The deployment does not have a checksum of the updated config in the checksums annotation, but the corresponding change counter is greater than one. This is the situation in which a config was added to the deployment and then updated again before its checksum was saved in the deployment annotation.
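The two conditions above reduce to a small predicate. The function name and parameters below are hypothetical; the logic follows the two cases as described.

```go
package main

import "fmt"

// needsRestart (hypothetical name) decides whether a config update should
// restart a deployment. applied holds the checksums from the deployment's
// checksums annotation; changeCounter is the counter of the pending change.
func needsRestart(applied map[string]string, config, newChecksum string, changeCounter int) bool {
	old, recorded := applied[config]
	if recorded {
		// Case 1: the config was already applied and its data changed.
		return old != newChecksum
	}
	// Case 2: no checksum recorded yet, but the config changed more than
	// once during the grace period.
	return changeCounter > 1
}

func main() {
	applied := map[string]string{"configmap/ns/app": "189832cc316e7594"}

	fmt.Println(needsRestart(applied, "configmap/ns/app", "6e79832c18c31594", 1)) // prints true
	fmt.Println(needsRestart(applied, "configmap/ns/app", "189832cc316e7594", 1)) // prints false
	fmt.Println(needsRestart(applied, "configmap/ns/new", "6e79832c18c31594", 1)) // prints false
	fmt.Println(needsRestart(applied, "configmap/ns/new", "6e79832c18c31594", 2)) // prints true
}
```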
- Combined with other automation tools, the controller can cause deployments to be restarted more often than they need to be. For example, if a deployment pipeline takes longer than 5 seconds (the default grace period) to update two ConfigMaps that are referenced by the same deployment, the deployment will be restarted twice. This can be mitigated by either changing the pipeline or increasing the restart grace period (RESTART_GRACE_PERIOD).
- When forcefully terminated, the controller might miss some restarts. Consider this situation: a new deployment is added to the cluster. Soon after that, a config referenced by that deployment is updated, and while that change is on hold for the grace period, the controller gets forcefully terminated. The fact that a config change was observed is not stored anywhere, so another instance of the controller will just mark the config as already applied to the deployment. The chances of this happening are rather low, since the controller needs to be killed exactly during the grace period of a deployment change followed by a config change.
The controller exposes several metrics at the 0.0.0.0:10254/metrics endpoint in Prometheus format. These metrics can be used to monitor the controller status and observe the actions that it takes.
Metric | Type | Description |
---|---|---|
deployment_restart_controller_resource_versions_total | counter | The number of distinct resource versions observed. |
deployment_restart_controller_configs_total | gauge | The number of tracked configs. |
deployment_restart_controller_deployments_total | gauge | The number of tracked deployments. |
deployment_restart_controller_deployment_annotation_updates_total | counter | The number of deployment annotation updates. |
deployment_restart_controller_deployment_restarts_total | counter | The number of deployment restarts triggered. |
deployment_restart_controller_changes_processed_total | counter | The number of resource changes processed. |
deployment_restart_controller_changes_waiting_total | gauge | The number of changes waiting in the queue. |
Usage:
kubernetes-deployment-restart-controller [OPTIONS]
Application Options:
-c, --restart-check-period= Time interval to check for pending restarts in milliseconds (default: 500) [$RESTART_CHECK_PERIOD]
-r, --restart-grace-period= Time interval to compact restarts in seconds (default: 5) [$RESTART_GRACE_PERIOD]
-v, --verbose= Be verbose [$VERBOSE]
--version Print version information and exit
Help Options:
-h, --help Show this help message
This project uses Go modules, introduced in Go 1.11. Please put the project somewhere outside of your GOPATH so that Go recognizes this automatically.
All build and install steps are managed in the Makefile. make test will fetch external dependencies, compile the code and run the tests. If all goes well, hack along and submit a pull request. You might need to run go mod tidy after updating dependencies.
Releases are a two-step process, beginning with a manual step:
- Create a release commit:
  - Increase the version number in kubernetes-deployment-restart-controller.go/VERSION
  - Adjust the CHANGELOG
- Run make release, which will create an image, retrieve the version from the binary, create a git tag and push both your commit and the tag
The Travis CI run will then realize that the current tag refers to the current master commit and will tag the built docker image accordingly.