The Bottlerocket update operator is a Kubernetes operator that coordinates Bottlerocket updates on hosts in a cluster.
To run the update operator in a Kubernetes cluster, the following resources and configuration are required (the suggested deployment is defined in update-operator.yaml):
- Update operator's container image
  Holds the operator's binaries and supporting environment (CA certificates).
- Controller deployment
  Schedules a stop-restart-tolerant controller process on available nodes.
- Agent daemon set
  Schedules the agent on Bottlerocket hosts.
- Bottlerocket namespace
  Groups Bottlerocket-related resources and roles.
- Service account for the agent
  Used for authenticating the agent process on Kubernetes APIs.
- Cluster privileged credentials with read-write access to nodes for the agent
  Grants the agent's service account permissions to update annotations for its node.
- Service account for the controller
  Used for authenticating the controller process on Kubernetes APIs.
- Cluster privileged credentials with access to pods and nodes for the controller
  Grants the controller's service account permissions to update annotations and manage pods that are scheduled on nodes (to cordon and drain) before and after updating.
Cluster administrators can deploy the update operator with the suggested configuration defined in update-operator.yaml, which includes the above resources and the container images published by Bottlerocket.
With kubectl configured for the desired cluster, the suggested deployment can be made with:
kubectl apply -f ./update-operator.yaml
Once the deployment's resources are in place, there is one more step needed to schedule and place the required pods on Bottlerocket nodes.
By default, in the suggested deployment, each workload resource constrains scheduling of the update operator by limiting pods to Bottlerocket nodes based on their labels.
These labels are not applied to nodes automatically and need to be set on each node using kubectl.
The agent relies on each node's updater components, and its pods are scheduled based on the updater interface that each node supports.
The node indicates its updater interface version in a label called bottlerocket.aws/updater-interface-version.
Each agent deployment targets a single interface version and is scheduled onto nodes using this label.
For the 1.0.0 updater-interface-version, this label looks like:
bottlerocket.aws/updater-interface-version=1.0.0
kubectl can be used to set this label on a node in the cluster:
kubectl label node $NODE_NAME bottlerocket.aws/updater-interface-version=1.0.0
If all nodes in the cluster are running Bottlerocket and have the same updater-interface-version, all can be labeled at the same time:
kubectl label node $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') bottlerocket.aws/updater-interface-version=1.0.0
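To make the label-based constraint concrete, here is a minimal Go sketch of the check that a node selector effectively performs: a node is eligible for an agent deployment only if it carries the updater interface version label with the expected value. The helper name matchesInterfaceVersion and the sample label sets are hypothetical and are not part of the operator; in a real cluster the Kubernetes scheduler does this matching from the workload's scheduling constraints.

```go
package main

import "fmt"

// Label key as given in the text above.
const updaterInterfaceVersionLabel = "bottlerocket.aws/updater-interface-version"

// matchesInterfaceVersion reports whether a node's labels carry the given
// updater interface version. Hypothetical helper, for illustration only.
func matchesInterfaceVersion(nodeLabels map[string]string, version string) bool {
	got, ok := nodeLabels[updaterInterfaceVersionLabel]
	return ok && got == version
}

func main() {
	// Sample label sets (assumed values, not read from a cluster).
	labeled := map[string]string{updaterInterfaceVersionLabel: "1.0.0"}
	unlabeled := map[string]string{"kubernetes.io/os": "linux"}

	fmt.Println("labeled node eligible:", matchesInterfaceVersion(labeled, "1.0.0"))
	fmt.Println("unlabeled node eligible:", matchesInterfaceVersion(unlabeled, "1.0.0"))
}
```

A node without the label never matches, which is why agent pods are not placed until the label has been applied.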
Each workload resource may have additional constraints or scheduling affinities based on each node's labels in addition to the bottlerocket.aws/updater-interface-version label scheduling constraint.
Customized deployments may use the suggested deployment or the example development deployment as a starting point, with customized container images specified if needed.
The update operator system is deployed as a replica set (for the controller) and a daemon set (for the agent).
Each runs its respective process, configured as either a -controller or an -agent:
- bottlerocket-update-operator -controller
  The coordinating process responsible for handling updates of Bottlerocket nodes cooperatively with the cluster's workloads.
- bottlerocket-update-operator -agent
  The on-host process responsible for publishing update metadata and executing update activities.
The update operator controller and agent processes communicate by updating the node's annotations as the node steps through an update.
The node's annotations are used to communicate an intent, which acts as a goal or target that is set by the controller.
The controller uses internal policy checks to manage which intent should be communicated to an agent.
This allows the controller to fully own and coordinate each step taken by agents throughout its cluster.
No agent process will otherwise take any disruptive or intrusive action without being directed by the controller to do so (in fact the agent is limited to periodic metadata updates only).
To handle and respond to intents, the agent and controller processes subscribe to Kubernetes' node resource update events.
These events are emitted whenever an update is made to the subscribed resource, including heartbeats, other node status changes (pods, container image listing), and metadata changes (labels and annotations).
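The annotation-driven handshake described above can be sketched in Go as follows. This is only an illustration under stated assumptions: the annotation keys, the action name perform-update, and the helper functions are all hypothetical placeholders, and the operator's real annotation names and state handling may differ. A plain map stands in for the node's annotations, and each call to pendingAction stands in for the agent reacting to a node update event.

```go
package main

import "fmt"

// Hypothetical annotation keys used only for this sketch; the operator's
// real annotation names may differ.
const (
	wantedKey = "bottlerocket.aws/example-wanted"
	activeKey = "bottlerocket.aws/example-active"
)

// setIntent is the controller side: it records the desired action in the
// node's annotations.
func setIntent(annotations map[string]string, action string) {
	annotations[wantedKey] = action
}

// pendingAction is the agent side: it returns the wanted action only when
// one is set and it differs from the action the agent already took up.
func pendingAction(annotations map[string]string) (string, bool) {
	wanted, ok := annotations[wantedKey]
	if !ok || wanted == "" || wanted == annotations[activeKey] {
		return "", false
	}
	return wanted, true
}

// markActive records that the agent has taken up the given action.
func markActive(annotations map[string]string, action string) {
	annotations[activeKey] = action
}

func main() {
	node := map[string]string{} // stands in for a node's annotations

	if _, ok := pendingAction(node); !ok {
		fmt.Println("no intent set: agent stays idle")
	}

	setIntent(node, "perform-update") // hypothetical action name
	if action, ok := pendingAction(node); ok {
		fmt.Println("agent acts on intent:", action)
		markActive(node, action)
	}

	_, ok := pendingAction(node)
	fmt.Println("intent still pending:", ok)
}
```

The point of the sketch is the ownership split: only the controller writes the wanted value, so an agent that sees no new intent has nothing disruptive to do.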
The update operator's state can be closely monitored through the labels and annotations on node resources.
The state and pending activity are updated as progress is being made.
The following command requires kubectl to be configured for the development cluster to be monitored and jq to be available on $PATH.
kubectl get nodes -o json \
| jq -C -S '.items | map(.metadata|{(.name): (.annotations*.labels|to_entries|map(select(.key|startswith("bottlerocket.aws")))|from_entries)}) | add'
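For readers less familiar with jq, the filter above can be read as: for each node, merge its annotations with its labels (labels win on a key collision), then keep only the keys that start with bottlerocket.aws. A rough Go equivalent of that per-node step is sketched below; the function name operatorMetadata and the sample data are hypothetical, and the example annotation key is a placeholder rather than one the operator necessarily sets.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Key prefix used by the jq filter above.
const keyPrefix = "bottlerocket.aws"

// operatorMetadata merges a node's annotations and labels, keeping only keys
// that start with keyPrefix. Labels are applied second, so they override
// annotations with the same key, mirroring jq's `.annotations * .labels`.
func operatorMetadata(annotations, labels map[string]string) map[string]string {
	out := make(map[string]string)
	for _, source := range []map[string]string{annotations, labels} {
		for key, value := range source {
			if strings.HasPrefix(key, keyPrefix) {
				out[key] = value
			}
		}
	}
	return out
}

func main() {
	// Sample data (assumed values, not read from a cluster).
	annotations := map[string]string{
		"bottlerocket.aws/example-state": "idle", // placeholder key
		"node.alpha.kubernetes.io/ttl":   "0",
	}
	labels := map[string]string{
		"bottlerocket.aws/updater-interface-version": "1.0.0",
		"kubernetes.io/os":                           "linux",
	}

	result := operatorMetadata(annotations, labels)

	// Print in sorted key order, similar to jq's -S flag.
	keys := make([]string, 0, len(result))
	for key := range result {
		keys = append(keys, key)
	}
	sort.Strings(keys)
	for _, key := range keys {
		fmt.Printf("%s=%s\n", key, result[key])
	}
}
```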
There is a get-nodes-status Makefile target provided for monitoring nodes during development.
Note: the same dependencies and assumptions for the above command apply here.
# get the current status:
make get-nodes-status
# or periodically (handy for watching closely):
watch -c -- make get-nodes-status
- pod replication & healthy count is not taken into consideration (https://github.com/bottlerocket-os/bottlerocket/issues/502)
- nodes update without pause between each node (https://github.com/bottlerocket-os/bottlerocket/issues/503)
- single node cluster degrades into unschedulable on update (https://github.com/bottlerocket-os/bottlerocket/issues/501)
- node labels are not automatically applied to allow scheduling (https://github.com/bottlerocket-os/bottlerocket/issues/504)
Working on the update operator requires a fully configured & working Kubernetes cluster.
For the sake of development workflow, we suggest using a cluster that is containerized or virtualized.
There are helpful tools available to manage these: kind for containerized clusters and minikube for locally virtualized clusters.
The dev/ directory contains several resources that may be used for development and debugging purposes:
- dashboard.yaml - development dashboard deployment (using insecure settings, not a suitable production deployment)
- deployment.yaml - template for Kubernetes resources that schedule a controller's ReplicaSet and agent's DaemonSet
- kind-cluster.yml - kind cluster definition that may be used to stand up a local development cluster
Much of the development workflow can be driven by the Makefile in the root of the repository.
Each of the Makefile's targets uses the tools and environments as they are configured on the host; for example, kubectl will be used as configured on the host.
If kubectl is configured with access to production, please take steps to configure kubectl to target a development cluster.
Build targets
- build - build executable using go toolchain in $PATH
- test - run go test for the operator using go toolchain in $PATH
- container - build a container image for use in Kubernetes resources
- container-test - run update operator's unit tests in a container
- check - run checks for container image
- dist - create a distribution archive of the container image
- clean - remove cached build artifacts from workspace
Development targets
- dashboard - create or update Kubernetes-dashboard (not suitable for use in production)
- deploy-dev - create or update the operator's Kubernetes resources
- rollout - reload and restart the operator's pods
kind development targets
- kind-cluster - create a local kind cluster
- kind-load - build and load container image for use in a kind cluster
- kind-rollout - reload container image & config, then restart pods