Helpful Resources:

Repository and blog post with a Terraform config that creates a JupyterHub cluster on AWS

Issues I’m having:

  • Labelling and tainting worker groups (not node groups)
  • Deleting VPC subnets when terraform destroy is run. Open issue here
    • If you try to delete the subnets manually in the console, you also need to delete the load balancer before the subnets can be deleted
  • Enabling cluster autoscaling through Terraform. The blog post above includes an autoscaling config, but I had trouble getting it to work.

Common errors:

  1. terraform-aws-modules/terraform-aws-eks#1234

  2. If you are having issues with https, try having

proxy:
  https:
    enabled: false

in your config.yaml.
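
For context, here is roughly where that block sits in a minimal config.yaml for the 0.9.0 chart used below. The secretToken value is a placeholder; generate your own (e.g. with openssl rand -hex 32).

# Minimal config.yaml sketch for the 0.9.0 chart
proxy:
  secretToken: "<32-byte-hex-string>"
  https:
    enabled: false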

Support for managed node group taints was added to the EKS Terraform module within the last few weeks, but managed node groups can’t scale to 0, so a worker group config similar to the one in this repo might work better.

Forked version of the blog post repo with updated code: https://github.com/adhil0/terraform-deploy. Check aws-examples/minimal-deployment-tutorial/ for the code.


How to manually build an autoscaling EKS Cluster for JupyterHub:

Autoscaler Steps for your Reference

eksctl create cluster \
--name <my-cluster> \
--version <1.18> \
--region <us-west-2> \
--nodegroup-name <linux-nodes> \
--nodes <3> \
--nodes-min <1> \
--nodes-max <4> \
--with-oidc \
--ssh-access \
--ssh-public-key <name-of-ec2-keypair> \
--managed \
--asg-access
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=system:anonymous
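
If you'd rather keep this in version control than retype the flags, the eksctl create cluster command above translates roughly to a ClusterConfig file like the one below (an untested sketch; the label is optional and just mirrors the manual kubectl label step further down). Create it with eksctl create cluster -f cluster.yaml.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: my-cluster
  region: us-west-2
  version: "1.18"

iam:
  withOIDC: true

managedNodeGroups:
  - name: linux-nodes
    desiredCapacity: 3
    minSize: 1
    maxSize: 4
    ssh:
      allow: true
      publicKeyName: name-of-ec2-keypair
    iam:
      withAddonPolicies:
        autoScaler: true          # rough equivalent of --asg-access
    labels:
      hub.jupyter.org/node-purpose: user   # saves a manual kubectl label later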

The policy name can be found on the IAM role for the node group (eksctl-autoscaling-test-cluster-n-NodeInstanceRole-...). For me it was called AmazonEKSClusterAutoscalerPolicy.
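
If no such policy exists in your account yet, the policy document from the AWS Cluster Autoscaler guide looks roughly like this (double-check against the current docs); create it with aws iam create-policy and pass its ARN to the command below.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": "*"
    }
  ]
}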

eksctl create iamserviceaccount \
  --cluster=<my-cluster> \
  --namespace=kube-system \
  --name=cluster-autoscaler \
  --attach-policy-arn=arn:aws:iam::<AWS_ACCOUNT_ID>:policy/<AmazonEKSClusterAutoscalerPolicy> \
  --override-existing-serviceaccounts \
  --approve
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
kubectl annotate serviceaccount cluster-autoscaler \
  -n kube-system \
  eks.amazonaws.com/role-arn=arn:aws:iam::<AWS_ACCOUNT_ID>:role/<AmazonEKSClusterAutoscalerRole>
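
For reference, that annotate command just adds the role ARN to the service account's metadata, so the result looks roughly like this:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<AWS_ACCOUNT_ID>:role/<AmazonEKSClusterAutoscalerRole>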

Step 6 may not be necessary. Eksctl might take care of this on your behalf.

kubectl patch deployment cluster-autoscaler \
  -n kube-system \
  -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"}}}}}'

For step 7 (the kubectl edit below), use step 4 here

kubectl -n kube-system edit deployment.apps/cluster-autoscaler
kubectl set image deployment cluster-autoscaler \
  -n kube-system \
  cluster-autoscaler=k8s.gcr.io/autoscaling/cluster-autoscaler:v<1.19.n>
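
The step-7 edit mostly comes down to the container command in the cluster-autoscaler deployment: fill in your cluster name in --node-group-auto-discovery and add the two extra flags the AWS guide recommends. A rough sketch of the relevant fragment after editing (the other flags come from the autodiscover manifest):

# Fragment of the cluster-autoscaler Deployment's pod template after the edit
spec:
  containers:
    - name: cluster-autoscaler
      command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        # use your cluster name here, e.g. the one passed to eksctl above
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<my-cluster>
        # the two extra flags the AWS docs suggest adding
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false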

Step 9 verifies that the autoscaler is working:

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler
Add labels and taints to the nodes that will run user pods:

kubectl label nodes <your-node-name> hub.jupyter.org/node-purpose=user
kubectl taint nodes <your-node-name> hub.jupyter.org/dedicated=user:NoSchedule
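
With those labels and taints in place, user pods can be pinned to the user nodes through the chart's scheduling options. A minimal (untested) addition to config.yaml; the chart's user pods already tolerate the hub.jupyter.org/dedicated=user:NoSchedule taint by default.

# In config.yaml: schedule user pods onto nodes labelled hub.jupyter.org/node-purpose=user.
# "require" makes the label mandatory; the chart's default is "prefer".
scheduling:
  userPods:
    nodeAffinity:
      matchNodePurpose: require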

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
# Suggested values: advanced users of Kubernetes and Helm should feel
# free to use different values.
RELEASE=jhub
NAMESPACE=jhub

helm upgrade --cleanup-on-fail \
  --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --create-namespace \
  --version=0.9.0 \
  --values config.yaml
# find the external IP of the proxy-public service to reach JupyterHub
kubectl get service --namespace jhub
To tear everything down when you're done:

eksctl delete cluster --name <my-cluster>