Helpful Resources:

Repository and blog post with a Terraform config that creates a JupyterHub cluster on AWS

Issues I’m having:

  • Labelling and tainting worker groups (not node groups)
  • Deleting VPC subnets when terraform destroy is run. Open issue here
    • If you try to delete the subnets manually in the console, you also need to delete the load balancer before the subnets can be deleted
  • Enabling cluster autoscaling through Terraform. The blog post above includes an autoscaling config, but I had trouble getting it to work.

Common errors:

  1. terraform-aws-modules/terraform-aws-eks#1234

  2. If you are having issues with https, try having

proxy:
  https:
    enabled: false

in your config.yaml.
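
For context, here is roughly where that block sits in a minimal config.yaml for the 0.9.0 chart used below. The secretToken value is a placeholder; generate your own (e.g. with openssl rand -hex 32).

# Minimal config.yaml sketch for the 0.9.0 chart
proxy:
  secretToken: "<32-byte-hex-string>"
  https:
    enabled: false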

Support for managed node group taints was added to the EKS Terraform module within the last few weeks, but managed node groups can’t scale to 0, so a worker group config similar to the one in this repo might work better.

Forked version of the blog post repo with updated code: https://github.com/adhil0/terraform-deploy. Check aws-examples/minimal-deployment-tutorial/ for the code.


How to manually build an autoscaling EKS Cluster for JupyterHub:

Autoscaler Steps for your Reference

eksctl create cluster \
--name <my-cluster> \
--version <1.18> \
--region <us-west-2> \
--nodegroup-name <linux-nodes> \
--nodes <3> \
--nodes-min <1> \
--nodes-max <4> \
--with-oidc \
--ssh-access \
--ssh-public-key <name-of-ec2-keypair> \
--managed \
--asg-access
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
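
If you'd rather keep this in version control than retype the flags, the eksctl create cluster command above translates roughly to a ClusterConfig file like the one below (an untested sketch; the label is optional and just mirrors the manual kubectl label step further down). Create it with eksctl create cluster -f cluster.yaml.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: my-cluster
  region: us-west-2
  version: "1.18"

iam:
  withOIDC: true

managedNodeGroups:
  - name: linux-nodes
    desiredCapacity: 3
    minSize: 1
    maxSize: 4
    ssh:
      allow: true
      publicKeyName: name-of-ec2-keypair
    iam:
      withAddonPolicies:
        autoScaler: true          # rough equivalent of --asg-access
    labels:
      hub.jupyter.org/node-purpose: user   # saves a manual kubectl label later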

The policy name can be found on the IAM role for the node group (eksctl-autoscaling-test-cluster-n-NodeInstanceRole-...). For me it was called AmazonEKSClusterAutoscalerPolicy.
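
If no such policy exists in your account yet, the policy document from the AWS Cluster Autoscaler guide looks roughly like this (double-check against the current docs); create it with aws iam create-policy and pass its ARN to the command below.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*"
    }
  ]
}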

eksctl create iamserviceaccount \
  --cluster=<my-cluster> \
  --namespace=kube-system \
  --name=cluster-autoscaler \
  --attach-policy-arn=arn:aws:iam::<AWS_ACCOUNT_ID>:policy/<AmazonEKSClusterAutoscalerPolicy> \
  --override-existing-serviceaccounts \
  --approve
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
kubectl annotate serviceaccount cluster-autoscaler \
  -n kube-system \
  eks.amazonaws.com/role-arn=arn:aws:iam::<AWS_ACCOUNT_ID>:role/<AmazonEKSClusterAutoscalerRole>
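
For reference, that annotate command just adds the role ARN to the service account's metadata, so the result looks roughly like this:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<AWS_ACCOUNT_ID>:role/<AmazonEKSClusterAutoscalerRole>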

Step 6 may not be necessary. Eksctl might take care of this on your behalf.

kubectl patch deployment cluster-autoscaler \
  -n kube-system \
  -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"}}}}}'

For step 7 (the kubectl edit below), use step 4 here

kubectl -n kube-system edit deployment.apps/cluster-autoscaler
kubectl set image deployment cluster-autoscaler \
  -n kube-system \
  cluster-autoscaler=k8s.gcr.io/autoscaling/cluster-autoscaler:v<1.19.n>
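
The step-7 edit mostly comes down to the container command in the cluster-autoscaler deployment: fill in your cluster name in --node-group-auto-discovery and add the two extra flags the AWS guide recommends. A rough sketch of the relevant fragment after editing (the other flags come from the autodiscover manifest):

# Fragment of the cluster-autoscaler Deployment's pod template after the edit
spec:
  containers:
    - name: cluster-autoscaler
      command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        # use your cluster name here, e.g. the one passed to eksctl above
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<my-cluster>
        # the two extra flags the AWS docs suggest adding
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false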

Step 9 verifies that the autoscaler is working:

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler
Add labels and taints to the nodes that will run user pods:

kubectl label nodes <your-node-name> hub.jupyter.org/node-purpose=user
kubectl taint nodes <your-node-name> hub.jupyter.org/dedicated=user:NoSchedule
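
With those labels and taints in place, user pods can be pinned to the user nodes through the chart's scheduling options. A minimal (untested) addition to config.yaml; the chart's user pods already tolerate the hub.jupyter.org/dedicated=user:NoSchedule taint by default.

# In config.yaml: schedule user pods onto nodes labelled hub.jupyter.org/node-purpose=user.
# "require" makes the label mandatory; the chart's default is "prefer".
scheduling:
  userPods:
    nodeAffinity:
      matchNodePurpose: require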

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
# Suggested values: advanced users of Kubernetes and Helm should feel
# free to use different values.
RELEASE=jhub
NAMESPACE=jhub

helm upgrade --cleanup-on-fail \
  --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --create-namespace \
  --version=0.9.0 \
  --values config.yaml
# find the external IP of the proxy-public service to reach JupyterHub
kubectl get service --namespace jhub
To tear everything down when you're done:

eksctl delete cluster --name <my-cluster>