This repo is an amateur, exploratory setup of an etcd cluster in Kubernetes.
DO NOT USE THIS FOR PRODUCTION; this is just me playing around.
Note: This assumes you're using Azure Kubernetes Service (AKS).
$SUBSCRIPTION_NAME="<Name of the subscription where your AKS cluster lives here>"
$AKS_RG_NAME="<Name of the resource group where your AKS cluster lives here>"
$AKS_NAME="<Name of your AKS cluster here>"
az login
az account set -s $SUBSCRIPTION_NAME
az aks get-credentials --admin -n $AKS_NAME -g $AKS_RG_NAME
Note: Everything below works exactly the same for etcd v2 and etcd v3; just swap out all -2 for -3.
cd into the appropriate folder in this repo.
kubectl create namespace etcd-2
kubectl config set-context --current --namespace etcd-2
cd etcd-2.3.8
kubectl apply -f stateful-set.yaml
This option entails creating a separate pod in the cluster, in which you'll install etcdctl to talk to etcd from within K8s.
# Run a temporary pod to test out etcd
kubectl run -it --rm shell --image=ubuntu:20.04 bash
# Install etcdctl
apt update && apt install -y etcd-client
POD_IP=<COPY POD IP FROM ABOVE>
etcdctl --endpoints=http://etcd.etcd-2.svc.cluster.local:4001 ls
etcdctl --endpoints=http://etcd.etcd-2.svc.cluster.local:4001 set mykey-local myvalue-local
etcdctl --endpoints=http://etcd.etcd-2.svc.cluster.local:4001 get mykey-local
etcdctl --endpoints=http://etcd.etcd-2.svc.cluster.local:4001 ls
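The in-cluster endpoint is easy to mistype, so here is a minimal sketch of a helper that builds it. It assumes the default cluster domain (cluster.local) and uses the service name, namespace, and port from this walkthrough; the function name etcd_endpoint is made up for illustration.

```shell
#!/bin/sh
# Build the in-cluster client URL for a Kubernetes Service.
# Service DNS names follow <service>.<namespace>.svc.cluster.local
# (ASSUMPTION: the cluster uses the default domain, cluster.local).
etcd_endpoint() {
  # $1 = service name, $2 = namespace, $3 = client port
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$1" "$2" "$3"
}

# Values from this walkthrough: service "etcd", namespace "etcd-2", port 4001.
ENDPOINT="$(etcd_endpoint etcd etcd-2 4001)"
echo "$ENDPOINT"
```

You could then run etcdctl --endpoints="$ENDPOINT" ls from the temporary pod, and pass etcd-3 as the namespace for the v3 setup.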
This option entails using K8s' port forwarding to connect to one of the etcd replicas and talk to it directly. Unfortunately, this requires tweaking the etcd config to recognize localhost as an address it should respond to, and thus is a much less desirable option.
Note: For this to work, you need to add http://127.0.0.1:4001 to ETCD_ADVERTISE_CLIENT_URLS in stateful-set.yaml as follows:
- name: ETCD_ADVERTISE_CLIENT_URLS
  value: "http://$(HOST_IP):2379,http://$(HOST_IP):4001,http://etcd.etcd-2.svc.cluster.local:4001,http://127.0.0.1:4001"
kubectl port-forward pod/etcd-0 4001
# Install etcdctl on your host if necessary via apt install etcd-client
etcdctl --endpoints=http://localhost:4001 ls
etcdctl --endpoints=http://localhost:4001 set mykey-remote myvalue-remote
etcdctl --endpoints=http://localhost:4001 get mykey-remote
etcdctl --endpoints=http://localhost:4001 ls
Obviously there's the more production-grade option of exposing the service as a load balancer. You can do that by modifying the service as follows:
- Remove the spec.clusterIP: None entry (need to confirm this)
- Add spec.type: LoadBalancer
- (Azure only) Add metadata.annotations.service.beta.kubernetes.io/azure-dns-label-name: your-unique-dns-label per the AKS docs on using a load balancer
- Add auth (don't be another casualty of exposed assets on the internet) and a bunch of other stuff to make it real, like a custom domain name, etc.
At this point, we're setting up etcd using static discovery, as observed from the env: section of stateful-set.yaml.
This means that you can't use kubectl scale sts etcd --replicas=SOMEVALUE.
- If SOMEVALUE is greater than the starting number (and the list of entries in the env var ETCD_INITIAL_CLUSTER), the new pods will error out, indicating couldn't find local name "etcd-SOMEVALUE" in the initial cluster configuration. You can still recover from this by scaling back to the original number of replicas.
- If SOMEVALUE is less than the starting number but still enough to achieve quorum, everything will continue to work, though the primary replica will endlessly log errors that it can't communicate with the missing replicas. You can't recover from this by scaling back to the original number of replicas, as the new ones that Kubernetes creates will fail registration with the error member XXX has already been bootstrapped. At this point you need to delete all the pods or the stateful set.
- If SOMEVALUE is less than what's needed for quorum for the starting number (and the list of entries in the env var ETCD_INITIAL_CLUSTER), the remaining pods cycle through new elections, unable to get quorum. Read commands will work but writes won't. You can't recover from this by scaling back to the original number of replicas, as the new ones that Kubernetes creates will fail registration with the error member XXX has already been bootstrapped. At this point you need to delete all the pods or the stateful set.
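To make the three cases in the list above concrete, here is a rough sketch that classifies a scale request. It relies on the standard majority rule (quorum is floor(N/2) + 1 for N members); the function names quorum and scale_outcome are invented for this example, and the messages only paraphrase the behaviour described above.

```shell
#!/bin/sh
# Majority quorum for a cluster of N members: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Classify a scale request.
# $1 = original replica count (entries in ETCD_INITIAL_CLUSTER)
# $2 = requested replica count (SOMEVALUE)
scale_outcome() {
  if [ "$2" -gt "$1" ]; then
    echo "new pods fail: not in initial cluster"
  elif [ "$2" -eq "$1" ]; then
    echo "no change"
  elif [ "$2" -ge "$(quorum "$1")" ]; then
    echo "quorum kept, but scaling back up fails"
  else
    echo "quorum lost: reads only"
  fi
}

scale_outcome 3 2   # 2 of 3 members still form a majority
scale_outcome 3 1   # 1 of 3 members cannot form a majority
```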
It's entirely possible that some tweaks to the initial setup, or something else, could support scaling, but that's TBD.
For now, scaling requires manually updating the stateful-set.yaml file with the following changes:
- Update spec.replicas accordingly
- Update the spec.template.spec.containers[0].env entry named ETCD_INITIAL_CLUSTER to add/remove entries so the list runs through index replicas - 1.
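Editing the member list by hand gets error-prone as the replica count grows, so here is a small sketch that prints a value for a given count. The URL pattern is an assumption: it supposes members are named etcd-0 through etcd-(N-1) and that peers listen on etcd's default peer port 2380 at <pod>.<service>. Check the existing ETCD_INITIAL_CLUSTER value in stateful-set.yaml and adjust the pattern to match before using the output.

```shell
#!/bin/sh
# Print a comma-separated ETCD_INITIAL_CLUSTER value for N replicas.
# ASSUMPTION: members are etcd-0 .. etcd-(N-1) and peer URLs look like
# http://<pod>.<service>:2380; the real pattern lives in stateful-set.yaml.
initial_cluster() {
  replicas="$1"   # desired spec.replicas
  service="$2"    # name of the headless service
  i=0
  out=""
  while [ "$i" -lt "$replicas" ]; do
    entry="etcd-$i=http://etcd-$i.$service:2380"
    if [ -z "$out" ]; then
      out="$entry"
    else
      out="$out,$entry"
    fi
    i=$((i + 1))
  done
  echo "$out"
}

initial_cluster 3 etcd
```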
Then you can apply the update as follows:
IMPORTANT: You need to delete the existing stateful set first; otherwise, given Kubernetes' rollout policy, you might get conflicts between the replicas with the old config and the new replicas with the new config. TODO: the fix here is to generate a unique ETCD_INITIAL_CLUSTER_TOKEN (maybe?)
kubectl delete sts etcd
kubectl apply -f stateful-set.yaml
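As a starting point for the TODO above, this is one possible way to produce a token that changes between redeploys. It is untested against the actual conflict described here; the etcd-cluster- prefix, the timestamp-plus-PID suffix, and the function name new_cluster_token are all illustrative assumptions.

```shell
#!/bin/sh
# Produce a token that changes between runs by combining the current Unix
# time with the shell's PID.
# ASSUMPTION: the "etcd-cluster-" prefix and suffix format are illustrative.
new_cluster_token() {
  echo "etcd-cluster-$(date +%s)-$$"
}

TOKEN="$(new_cluster_token)"
# This value would still have to be written into the env section of
# stateful-set.yaml before running kubectl apply.
echo "ETCD_INITIAL_CLUSTER_TOKEN=$TOKEN"
```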
Note: Btw, you can also check out the sad first stab at this using k8s deployments instead of stateful sets, in which the instances weren't clustered :P For that, check out the NonClustered tag
- Try out the discovery service to see if I can get scaling working
- Add auth?