voxhub

Testing deployment of JupyterHub

See: zero-to-jupyterhub.readthedocs.io

File with variables

Create this file, which will keep variables used in the setup.
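
For example (a minimal sketch; the later steps simply append variables to this file):

echo "#!/usr/bin/env bash" > 01_vars.sh
echo "# Variables used in this setup" >> 01_vars.sh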

Later, we will source the variables into our current shell.

source 01_vars.sh

Creating a Kubernetes Cluster at Google

See setting-up-kubernetes-on-google-cloud.

Open console.cloud.google.com

Create a project, and save the Google project name, ID, and number as variables.

echo "# The Google Cloud Project name" >> 01_vars.sh
echo "G_PROJECT_NAME=Name" >> 01_vars.sh
echo "# The Google Cloud  Project ID" >> 01_vars.sh
echo "G_PROJECT_ID=Name-123456" >> 01_vars.sh
echo "# The Google Cloud  Project Number" >> 01_vars.sh
echo "G_PROJECT_NR=123456789012" >> 01_vars.sh

Source the variables to your current shell.

source 01_vars.sh
echo $G_PROJECT_NAME

Enable the Kubernetes Engine API.

container.googleapis.com/overview
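
The API can also be enabled from the command line (a sketch; depending on your gcloud version, the console link above achieves the same thing):

gcloud services enable container.googleapis.com --project $G_PROJECT_ID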

Install kubectl

gcloud components install kubectl

Spin up servers

First get a list of machine-types.

Pricing is here

Zones are explained here

Region europe-west1 is in St. Ghislain, Belgium

gcloud compute regions list
gcloud compute zones list

gcloud compute machine-types list | head
gcloud compute machine-types list | grep europe
gcloud compute machine-types list | grep europe-west1
gcloud compute machine-types list | grep europe-west1 | grep n1-standard-2



Save variables for the Kubernetes engine. You need to start with a minimum of 1 node, but with at least 7.5 GB of memory.
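
To double-check that the chosen machine type has enough memory (an optional sketch):

gcloud compute machine-types describe n1-standard-2 --zone europe-west1-b | grep memoryMb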

echo "# The Google Cloud Kubernetes Name" >> 01_vars.sh
echo "G_KUBE_NAME=cluster-1" >> 01_vars.sh
echo "# The Google Cloud Kubernetes Region" >> 01_vars.sh
echo "G_KUBE_REGION=europe-west1" >> 01_vars.sh
echo "# The Google Cloud Kubernetes Zone" >> 01_vars.sh
echo "G_KUBE_ZONE=europe-west1-b" >> 01_vars.sh
echo "# The Google Cloud Kubernetes cluster-version" >> 01_vars.sh
echo "G_KUBE_CLUSTERVERSION=1.8.7-gke.1" >> 01_vars.sh
echo "# The Google Cloud Kubernetes machine-type" >> 01_vars.sh
echo "G_KUBE_MACHINETYPE=n1-standard-2" >> 01_vars.sh
echo "# The Google Cloud Kubernetes number of nodes" >> 01_vars.sh
echo "G_KUBE_NUMNODES=1" >> 01_vars.sh

Source the variables to your current shell.

source 01_vars.sh
echo $G_KUBE_NAME
echo $G_KUBE_ZONE
echo $G_KUBE_CLUSTERVERSION 
echo $G_KUBE_MACHINETYPE
echo $G_KUBE_NUMNODES

See configurations

gcloud config configurations list
gcloud config configurations describe default

Create a service account. See serviceaccounts.

open https://console.cloud.google.com/iam-admin/serviceaccounts/project?project=${G_PROJECT_ID}
# After downloading the JSON key, move it to ~/.ssh
ls -la $HOME/Downloads/${G_PROJECT_NAME}-*.json
mv $HOME/Downloads/${G_PROJECT_NAME}-*.json $HOME/.ssh/
ls -la $HOME/.ssh/${G_PROJECT_NAME}-*.json

Create configuration

# List
gcloud config configurations list
# Create empty
gcloud config configurations create $G_PROJECT_NAME
gcloud config configurations list
# Specify
gcloud auth activate-service-account --key-file $HOME/.ssh/${G_PROJECT_NAME}-*.json
gcloud config set compute/region ${G_KUBE_REGION}
gcloud config set compute/zone ${G_KUBE_ZONE}
# Get setup
gcloud config configurations describe $G_PROJECT_NAME

Create Cluster

gcloud config set project $G_PROJECT_ID

gcloud container clusters create $G_KUBE_NAME \
    --zone=$G_KUBE_ZONE \
    --cluster-version=$G_KUBE_CLUSTERVERSION \
    --machine-type=$G_KUBE_MACHINETYPE \
    --num-nodes=$G_KUBE_NUMNODES

Or with cheaper preemptible VMs

gcloud config set container/use_v1_api_client false
gcloud config set project $G_PROJECT_ID

gcloud beta container clusters create $G_KUBE_NAME \
    --zone=$G_KUBE_ZONE \
    --cluster-version=$G_KUBE_CLUSTERVERSION \
    --machine-type=$G_KUBE_MACHINETYPE \
    --num-nodes=$G_KUBE_NUMNODES \
    --preemptible

gcloud container clusters list

Inspect with kubectl

Also see working-with-multiple-projects-on-gke

Get kubectl ready by fetching the GKE credentials for the project.

gcloud container clusters get-credentials $G_KUBE_NAME --zone ${G_KUBE_ZONE} --project $G_PROJECT_ID
kubectl config get-contexts
kubectl config current-context

kubectl get nodes
kubectl get services

See accessing-the-api

# Get email of current service user
G_KUBE_SERVICE_USER=$(gcloud config get-value account)
echo $G_KUBE_SERVICE_USER
echo "# The Google Cloud Kubernetes Service user " >> 01_vars.sh
echo "G_KUBE_SERVICE_USER=$G_KUBE_SERVICE_USER" >> 01_vars.sh

First look at the role permissions in the Google console.

open https://console.cloud.google.com/iam-admin/roles/project?project=${G_PROJECT_ID}

Filter by 'Kubernetes'. Look for the 'Kubernetes Engine' roles, open them, and search for "container.clusterRoleBindings.create". The role "Kubernetes Engine Admin" has this permission.

Then add this "Kubernetes Engine Admin" role to $G_KUBE_SERVICE_USER:

echo $G_KUBE_SERVICE_USER
open https://console.cloud.google.com/iam-admin/iam/project?project=${G_PROJECT_ID}
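
Alternatively, the role can be granted from the command line (a sketch, assuming the account email is stored in $G_KUBE_SERVICE_USER and that it is a service account):

gcloud projects add-iam-policy-binding $G_PROJECT_ID \
    --member=serviceAccount:$G_KUBE_SERVICE_USER \
    --role=roles/container.admin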

Give your account super-user permissions, allowing you to perform all the actions needed to set up JupyterHub.

kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole=cluster-admin \
    --user=$G_KUBE_SERVICE_USER

Setup Helm

See setup-helm

Helm, the package manager for Kubernetes, is a useful tool to install, upgrade and manage applications on a Kubernetes cluster. We will be using Helm to install and manage JupyterHub on our cluster.

curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > install-helm.bash
bash install-helm.bash --version v2.6.2

Set up a ServiceAccount for use by Tiller, the server side component of helm.

kubectl --namespace kube-system create serviceaccount tiller
# Give the ServiceAccount full RBAC permissions to manage the cluster.
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
# Set up Helm on the cluster. This command only needs to run once per Kubernetes cluster.
helm init --service-account tiller

Verify helm

helm version

Secure Helm

Ensure that tiller is secure from access inside the cluster:

kubectl --namespace=kube-system patch deployment tiller-deploy --type=json --patch='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/tiller", "--listen=localhost:44134"]}]'
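
You can verify that the patch was applied (a quick check sketch):

kubectl --namespace=kube-system get deployment tiller-deploy -o jsonpath='{.spec.template.spec.containers[0].command}'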

Install JupyterHub!

See setup-jupyterhub

Create a random hex string to use as a security token.

RANDHEX=`openssl rand -hex 32`
echo $RANDHEX

Create config.yaml

echo "proxy:" > config.yaml
echo "  secretToken: '$RANDHEX'" >> config.yaml
cat config.yaml

Let’s add the JupyterHub helm repository to your helm, so you can install JupyterHub from it.
This makes it easy to refer to the JupyterHub chart without having to use a long URL each time.

helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update

Now you can install the chart!
The install command needs to be run from the directory that contains the config.yaml file; it will spin up JupyterHub.

First, save variables for Helm.

G_KUBE_CURCONT=`kubectl config current-context`
echo $G_KUBE_CURCONT
echo "# The Google Cloud Kubernetes current context " >> 01_vars.sh
echo "G_KUBE_CURCONT=$G_KUBE_CURCONT" >> 01_vars.sh

G_KUBE_NAMESPACE=$G_PROJECT_NAME
echo $G_KUBE_NAMESPACE
echo "# The Google Cloud Kubernetes namespace " >> 01_vars.sh
echo "G_KUBE_NAMESPACE=$G_KUBE_NAMESPACE" >> 01_vars.sh

# The release name must NOT contain underscores "_"
echo "# The Helm version " >> 01_vars.sh
echo "H_VERSION=v0.6" >> 01_vars.sh
echo "# The Helm release name " >> 01_vars.sh
echo "H_RELEASE=jup-01" >> 01_vars.sh

source 01_vars.sh

Create namespace

# See first
kubectl get namespaces
# This should be empty
kubectl config view | grep namespace:
# Set the default namespace for this context (helm install will create the namespace itself)
kubectl config set-context $G_KUBE_CURCONT --namespace=$G_KUBE_NAMESPACE
kubectl config view | grep namespace:

Install

helm install jupyterhub/jupyterhub --version=$H_VERSION --name=$H_RELEASE --namespace=$G_KUBE_NAMESPACE -f config.yaml

Verify

You can check whether the hub and proxy are ready by doing:

kubectl --namespace=$G_KUBE_NAMESPACE get pod

# Debugging
HUB=`kubectl --namespace=$G_KUBE_NAMESPACE get pod | grep "hub-" | cut -d " " -f1`
kubectl describe pods $HUB
PROXY=`kubectl --namespace=$G_KUBE_NAMESPACE get pod | grep "proxy-" | cut -d " " -f1`
kubectl describe pods $PROXY
PREPULL1=`kubectl --namespace=$G_KUBE_NAMESPACE get pod | grep "pre-" | head -n 1 | tail -n 1 | cut -d " " -f1`
kubectl describe pods $PREPULL1
PREPULL2=`kubectl --namespace=$G_KUBE_NAMESPACE get pod | grep "pre-" | head -n 2 | tail -n 1 | cut -d " " -f1`
kubectl describe pods $PREPULL2

You can find the public IP of the JupyterHub by doing:

kubectl --namespace=$G_KUBE_NAMESPACE get svc
kubectl --namespace=$G_KUBE_NAMESPACE get svc proxy-public
kubectl --namespace=$G_KUBE_NAMESPACE describe svc proxy-public

Get the external IP and open

EXTIP=`kubectl --namespace=$G_KUBE_NAMESPACE get svc proxy-public | grep "LoadBalancer" | awk '{print $4}'`
open http://$EXTIP

JupyterHub is running with a default dummy authenticator so entering any username and password combination will let you enter the hub.

If it is NOT running, check the pods with kubectl describe pods <pod>. Are there any memory errors?
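
A quick way to look for scheduling or memory problems (a sketch, reusing the $HUB variable from above):

kubectl --namespace=$G_KUBE_NAMESPACE get events --sort-by='.lastTimestamp' | tail -n 20
kubectl --namespace=$G_KUBE_NAMESPACE logs $HUB | grep -i error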

Turning Off JupyterHub and Computational Resources

See turn-off

helm delete $H_RELEASE --purge
kubectl delete namespace $G_KUBE_NAMESPACE

# Delete clusters
gcloud container clusters list
gcloud container clusters delete $G_KUBE_NAME --zone=$G_KUBE_ZONE

# Check
gcloud container clusters list
kubectl get nodes
kubectl get services

Double check to make sure all the resources are now deleted, since anything you have not deleted will cost you money!
You can check the web console (make sure you are in the right project and account) to verify that everything has been deleted.

At a minimum, check the following under the Hamburger (left top corner) menu:

  • Compute Engine -> Disks
  • Container Engine -> Container Clusters
  • Container Registry -> Images
  • Networking -> Network Services -> Load Balancing

Extend jupyterhub

extending-jupyterhub

Modify DNS A record (Name -> IP address)

You have bought "cooldomain.dk", and want the subdomain "hub.cooldomain.dk" to point to the google container cluster.

You need to set up an A record.

At your domain registrar (for example https://web.gratisdns.dk )

Make an A record: A (Name -> IP address)

  • Host: hub.cooldomain.dk
  • IP (ipv4): $EXTIP
  • TTL: 7200

Wait 24-48 hours.
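
You can check whether the record has propagated (a sketch, assuming dig is available):

dig +short hub.cooldomain.dk
# Should print the same address as $EXTIP once the A record is live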

Enable HTTPS

See security

Via Let's Encrypt.

Save variables

echo "# The webpage hostname " >> 01_vars.sh
echo "H_HOST=hub.cooldomain.dk" >> 01_vars.sh
echo "# The webpage hostname " >> 01_vars.sh
echo "H_CONTACTMAIL=mymail@gmail.com" >> 01_vars.sh

source 01_vars.sh

First check the current state of HTTPS

open https://$H_HOST

Write to config

echo "  https:" >> config.yaml
echo "    hosts:" >> config.yaml
echo "      - $H_HOST" >> config.yaml
echo "    letsencrypt:" >> config.yaml
echo "      contactEmail: $H_CONTACTMAIL" >> config.yaml
cat config.yaml

Run a helm upgrade:

# See revision
helm list
helm upgrade $H_RELEASE jupyterhub/jupyterhub --version=$H_VERSION --namespace=$G_KUBE_NAMESPACE -f config.yaml
# Revision should have changed
helm list

# Check pods
kubectl --namespace=$G_KUBE_NAMESPACE get pod

Wait for about a minute, now your hub should be HTTPS enabled!

open https://$H_HOST

Set the default target for the proxy by specifying https.
The chart's default values can be seen in "values.yaml",
downloaded from jupyterhub-v0.6.tgz at https://jupyterhub.github.io/helm-chart/

Or more info here jupyterhub/configurable-http-proxy
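
You can also inspect the chart's defaults directly with helm instead of downloading the tarball (a sketch using helm 2's inspect command):

helm inspect values jupyterhub/jupyterhub --version=$H_VERSION | less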

We are changing from http to https.

echo "  chp:" >> config.yaml
echo "    cmd:" >> config.yaml
echo '      - configurable-http-proxy' >> config.yaml
echo '      - --ip=0.0.0.0' >> config.yaml
echo '      - --port=8000' >> config.yaml
echo '      - --api-ip=0.0.0.0' >> config.yaml
echo '      - --api-port=8001' >> config.yaml
echo '      - --default-target=https://$(HUB_SERVICE_HOST):$(HUB_SERVICE_PORT)' >> config.yaml
echo '      - --error-target=https://$(HUB_SERVICE_HOST):$(HUB_SERVICE_PORT)' >> config.yaml
echo '      - --log-level=debug' >> config.yaml

cat config.yaml

Run a helm upgrade:

# See revision
helm list
helm upgrade $H_RELEASE jupyterhub/jupyterhub --version=$H_VERSION --namespace=$G_KUBE_NAMESPACE -f config.yaml
# Revision should have changed
helm list

# Check pods
kubectl --namespace=$G_KUBE_NAMESPACE get pod

Try

open http://$H_HOST

Delete the Kubernetes Dashboard

See here: delete-the-kubernetes-dashboard

# First inspect
kubectl --namespace=kube-system get deployment
kubectl --namespace=kube-system get deployment kubernetes-dashboard
# Delete
kubectl --namespace=kube-system delete deployment kubernetes-dashboard
kubectl --namespace=kube-system get deployment

Authentication with bitbucket

See authentication

For bitbucket, see this oauth-on-bitbucket-cloud

At bitbucket

  • Visit https://bitbucket.org/account
  • Click OAuth
  • Click Add consumer
  • Name: Some random name
  • Description: Testing
  • Callback URL: https://${H_HOST}/hub/oauth_callback (replace $H_HOST with your domain)
  • Don't change the tick in "This is a private consumer"
  • Permissions: Just ask for: Account-->Email
  • Click save
  • Click the name, and record "Key" --> H_AUTHCLIENTID_BIT and "Secret" --> H_AUTHCLIENTSECRET_BIT

Save variables

echo "# The auth type " >> 01_vars.sh
echo "H_AUTHTYPE=bitbucket" >> 01_vars.sh
echo "# The auth clientId " >> 01_vars.sh
echo "H_AUTHCLIENTID_BIT=y0urg1thubc1ient1d" >> 01_vars.sh
echo "# The auth clientSecret " >> 01_vars.sh
echo "H_AUTHCLIENTSECRET_BIT=an0ther1ongs3cretstr1ng" >> 01_vars.sh

source 01_vars.sh

See

Write config

echo "" >> config.yaml
echo "hub:" >> config.yaml
echo "  extraEnv:" >> config.yaml
echo "    OAUTH2_AUTHORIZE_URL: https://bitbucket.org/site/oauth2/authorize" >> config.yaml
echo "    OAUTH2_TOKEN_URL: https://bitbucket.org/site/oauth2/access_token" >> config.yaml
echo "auth:" >> config.yaml
echo "  type: custom" >> config.yaml
echo "  custom:" >> config.yaml
echo "    className: oauthenticator.generic.GenericOAuthenticator" >> config.yaml
echo "    config:" >> config.yaml
echo "      client_id: '$H_AUTHCLIENTID_BIT'" >> config.yaml
echo "      client_secret: '$H_AUTHCLIENTSECRET_BIT'" >> config.yaml
echo "      token_url: https://bitbucket.org/site/oauth2/access_token" >> config.yaml
echo "      userdata_url: https://api.bitbucket.org/2.0/user" >> config.yaml
echo "      userdata_method: GET" >> config.yaml
echo "      userdata_params: {'state': 'state'}" >> config.yaml
echo "      username_key: username" >> config.yaml

cat config.yaml

Run a helm upgrade:

# See revision
helm list
helm upgrade $H_RELEASE jupyterhub/jupyterhub --version=$H_VERSION --namespace=$G_KUBE_NAMESPACE -f config.yaml
# Revision should have changed
helm list

# Check pods
kubectl --namespace=$G_KUBE_NAMESPACE get pod

Try

open http://$H_HOST

You should be prompted to log in with Bitbucket.

Afterwards, a "500 : Internal Server Error" is displayed.

Let us check the logs

HUB=`kubectl --namespace=$G_KUBE_NAMESPACE get pod | grep "hub-" | cut -d " " -f1`
kubectl describe pods $HUB
kubectl logs $HUB

The logs show problems???

Authentication with github

We delete the hub and auth sections added for Bitbucket from config.yaml, and configure GitHub instead.

See github developers: https://github.com/settings/developers

Save variables

echo "# The auth type " >> 01_vars.sh
echo "H_AUTHTYPE=github" >> 01_vars.sh
echo "# The auth clientId " >> 01_vars.sh
echo "H_AUTHCLIENTID_GIT=y0urg1thubc1ient1d" >> 01_vars.sh
echo "# The auth clientSecret " >> 01_vars.sh
echo "H_AUTHCLIENTSECRET_GIT=an0ther1ongs3cretstr1ng" >> 01_vars.sh

source 01_vars.sh
echo "" >> config.yaml
echo "auth:" >> config.yaml
echo "  type: github" >> config.yaml
echo "  github:" >> config.yaml
echo "    clientId: '$H_AUTHCLIENTID_GIT'" >> config.yaml
echo "    clientSecret: '$H_AUTHCLIENTSECRET_GIT'" >> config.yaml
echo "    callbackUrl: 'https://${H_HOST}/hub/oauth_callback'" >> config.yaml

cat config.yaml

Run a helm upgrade:

# See revision
helm list
helm upgrade $H_RELEASE jupyterhub/jupyterhub --version=$H_VERSION --namespace=$G_KUBE_NAMESPACE -f config.yaml
# Revision should have changed
helm list

# Check pods
kubectl --namespace=$G_KUBE_NAMESPACE get pod

Try

open http://$H_HOST

This works :)

Whitelist users and add admin

See

Add or remove users from the Hub
Users can be added to and removed from the Hub via either the admin panel or the REST API.
When a user is added, the user will be automatically added to the whitelist and database.
Restarting the Hub will not require manually updating the whitelist in your config file, as the users will be loaded from the database.

After starting the Hub once, it is not sufficient to remove a user from the whitelist in your config file.
You must also remove the user from the Hub's database, either by deleting the user from JupyterHub's admin page,
or you can clear the jupyterhub.sqlite database and start fresh.
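
As an illustration of the REST API route, a user could be added like this (a sketch only; H_APITOKEN is a hypothetical variable holding an admin API token, which this guide does not set up):

# Add a user via the JupyterHub REST API (H_APITOKEN is hypothetical)
curl -X POST \
    -H "Authorization: token $H_APITOKEN" \
    https://$H_HOST/hub/api/users/newuser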

First declare a bash array with users to whitelist and admin.

echo "# The array of users to whitelist " >> 01_vars.sh
echo 'H_WHITE=(user1 user2 user3 user4)' >> 01_vars.sh
echo "# The array of users to admin " >> 01_vars.sh
echo 'H_ADMIN=(user1 user4)' >> 01_vars.sh
source 01_vars.sh

Try looping over the array in bash

for x in ${H_WHITE[@]}; do echo "User: $x"; done
for x in ${H_ADMIN[@]}; do echo "Admin: $x"; done

Then write to config

echo "" >> config.yaml
echo "  whitelist:" >> config.yaml
echo "    users:" >> config.yaml
for x in ${H_WHITE[@]}; do echo "      - $x" >> config.yaml; done

echo "  admin:" >> config.yaml
echo "    users:" >> config.yaml
for x in ${H_ADMIN[@]}; do echo "      - $x" >> config.yaml; done

cat config.yaml 

Run a helm upgrade:

# See revision
helm list
helm upgrade $H_RELEASE jupyterhub/jupyterhub --version=$H_VERSION --namespace=$G_KUBE_NAMESPACE -f config.yaml
# Revision should have changed
helm list

# Check pods
kubectl --namespace=$G_KUBE_NAMESPACE get pod

# Try
open http://$H_HOST

This works. Admin users can now go to "Control Panel" --> "Admin" and delete users there.

Another method: first delete the whitelist and admin section from config.yaml.
See

This works as well.

hub:
  extraConfig: |
    c.Authenticator.whitelist = {'mal', 'zoe', 'inara', 'kaylee'}
    c.Authenticator.admin_users = {'mal'}

Modify Docker image

See

Let us take "Jupyter Notebook Data Science Stack"

As of 2018-02-19, let us stick to the tag defined in the manual: c7fb6660d096

Please don't change the tag before you have verified a first successful installation.
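
If you have Docker installed locally, you can verify that the tag exists before upgrading (an optional sketch):

docker pull jupyter/datascience-notebook:c7fb6660d096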

Save variables

echo "# The image name " >> 01_vars.sh
echo "H_IMAGENAME=jupyter/datascience-notebook" >> 01_vars.sh
echo "# The image tag" >> 01_vars.sh
echo "H_IMAGETAG=c7fb6660d096" >> 01_vars.sh

source 01_vars.sh

Write to config

echo "" >> config.yaml
echo "singleuser:" >> config.yaml
echo "  image:" >> config.yaml
echo "    name: $H_IMAGENAME" >> config.yaml
echo "    tag: $H_IMAGETAG" >> config.yaml
cat config.yaml

Note, this will take several minutes due to the prepuller.
See pre-pulling-basics

Note: we add a timeout to the helm upgrade, which is 5x as long as normal, 1500 s = 25 min.

Run a helm upgrade:

# See revision
helm list
helm upgrade $H_RELEASE jupyterhub/jupyterhub --timeout=1500 --version=$H_VERSION --namespace=$G_KUBE_NAMESPACE -f config.yaml
# Revision should have changed
helm list

# Check pods
kubectl --namespace=$G_KUBE_NAMESPACE get pod

# Try
open http://$H_HOST

You may need to log in, go to Control Panel --> Stop My Server, and start a new server.

Try making a new notebook and running pip freeze to see the installed packages.

! pip freeze

Try to change kernel to Julia or R.

Wuhuu. This works.