Kubernetes is a powerful suite of tools for orchestrating containerized apps. Let's dive in with a small starter project that will familiarize us with some important Kubernetes resources and give us a feel for a typical workflow.
You will start with a small system made up of 3 components:
- an in-memory queue
- a publisher
- a subscriber
Following along with the instructions below, you will deploy all 3 of these components to a local Kubernetes cluster, and then can go further by implementing some stretch goals at the end of this document.
NOTE: this workshop complements the presentations from the LiveRamp Presents: Totally Getting Containers meetup. Hopefully I will be able to post the recording of the talks as well, but for now, here are the slides: Totally Getting Kubernetes and Totally Getting Docker.
This workshop assumes minimal knowledge of Kubernetes. Even if you're brand new to Kube, some light googling throughout this project should be sufficient. The only 3 resources that we will encounter are Pods, Deployments, and Services.
You should have a cluster running locally. Docker for Mac is a good choice. After downloading it and enabling Kubernetes, run
$ kubectl version
to ensure setup is complete.
It's also recommended to have an easy way to send HTTP requests. This workshop assumes HTTPie throughout the examples, but something like Postman will work just as well.
NOTE: HTTPie syntax allows for a localhost shorthand: `http :3000` is equivalent to `http http://localhost:3000`. This workshop uses that shorthand syntax frequently!
This repo comes with some boilerplate code that comprises an un-containerized app. Each component is a small HTTP server written in JavaScript. Feel free to translate this code into a language you are more familiar with, but note that you will only need to make very minimal changes to it throughout the workshop.
Try to complete each step on your own, in order. There are git branches that you can check out to view solutions, in case you want to double check your work or if you get stuck.
You will see the 3 main directories: `publisher`, `queue`, and `subscriber`. Take some time to read through the servers found in the `main.js` files within each directory. When you are comfortable with the code, open a new terminal tab for each directory and run
$ node main.js
With the 3 servers running locally, `POST` a message to `http://localhost:3000`:
$ http POST :3000 name=Mac position='Sheriff of Paddys'
and then navigate to `http://localhost:3002` to read it, either from the command line or in the browser:
$ http :3002
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 45
Date: Sun, 14 Oct 2018 23:12:23 GMT
{"name":"Mac","position":"Sheriff of Paddys"}
You can `POST` however many messages you want via the publisher, and then use the subscriber to read them LIFO style. For debugging purposes, you can also see the current contents of the queue at any time by navigating to `http://localhost:3001/queue`:
$ http :3001/queue
Once you feel comfortable with how the app works, proceed to step 2.
The first step to deploying an app to Kubernetes is to build it into an image for Kubernetes pods to run. We are going to create Docker images for publisher, queue, and subscriber by writing a Dockerfile for each app, and naming the images `workshop/publisher`, `workshop/queue`, and `workshop/subscriber`. Try this on your own, or for some more guidance, follow along with the steps below.
Task:
- Create a file called `Dockerfile` at the root of each of the 3 components' directories.
- In the Dockerfiles, use `node` as a base image, copy in the necessary files, run `npm install` when necessary, and have the container run the `main.js` file with node. You will use the directives `FROM`, `COPY`, `RUN`, and `CMD`. If you haven't written a Dockerfile before, then this guide will be really helpful.
- Use a `.dockerignore` file to ignore the `node_modules/` directory. This keeps the build context small.
- Build the images, e.g.:
$ docker build ./publisher -t workshop/publisher
You can list your images to verify successful creation:
$ docker image ls
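If you get stuck, a Dockerfile along these lines should satisfy the task. This is a sketch, not the branch solution: the node tag is arbitrary, the exact files to copy depend on the component, and the `package*.json` lines only apply if the component has a package.json:

```dockerfile
# Sketch of a publisher/Dockerfile; exact file names may differ per component.
FROM node:10
WORKDIR /app
# Copy the npm manifests first so the `npm install` layer can be cached
# (only needed if the component actually has a package.json).
COPY package*.json ./
RUN npm install
# Copy the server code and run it.
COPY main.js .
CMD ["node", "main.js"]
```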
To see one possible solution, check out the branch `solution-images`:
$ git checkout solution-images
or view this commit
Let's try to run our publisher app in a Kubernetes pod by creating a deployment named `publisher`.
$ kubectl run publisher --image=workshop/publisher
deployment.apps "publisher" created
Seems like it worked! But let's check the pods . . .
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
publisher-5f577cbbdc-v9xxb 0/1 ErrImagePull 0 8s
The status of `ErrImagePull` indicates that something went wrong. To see the events leading up to the error, use the `describe` command:
$ kubectl describe pod <YOUR_POD_NAME>
At the bottom of the output, you will see a list of events. One will be something like Failed to pull image "workshop/publisher": rpc error: code = Unknown desc = Error response from daemon: pull access denied for workshop/publisher, repository does not exist or may require 'docker login'
Kubernetes thinks that we want to pull our `workshop/publisher` image from a remote image registry, like Docker Hub. We need to indicate to Kubernetes that it should not try pulling this image, and should instead use the one we built locally.
$ kubectl run publisher --image=workshop/publisher --image-pull-policy=Never
Error from server (AlreadyExists): deployments.extensions "publisher" already exists
Here is our second error: kubectl will return an error when trying to create a resource that already exists. Let's fix this:
$ kubectl delete deployment publisher
deployment.extensions "publisher" deleted
$ kubectl run publisher --image=workshop/publisher --image-pull-policy=Never
deployment.apps "publisher" created
Verify that it all worked correctly:
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
publisher-668597fcc6-kw777 1/1 Running 0 1m
Success! Go ahead and run `kubectl delete pod <POD_NAME>`, and watch as the pod terminates while a new one immediately replaces it! Now that our deployment is in place, we can delete this pod over and over, and a new one (with a slightly different name) will constantly pop up to take its place. Feel the power!
Feel free to `describe` this pod as well and take a look at the Events. That's what a successful pod looks like. Also note: you'll be creating and deleting Kubernetes resources all the time. Nothing is sacred.
Deploy the other components (queue, subscriber) to Kubernetes the same way that we ran the publisher. After the deployments are created, check the statuses of the pods and make sure they are `Running`. If something goes wrong, like a `CrashLoopBackOff` status, it might be a problem with your Dockerfile. Make use of
$ kubectl logs <POD_NAME>
to get error messages from your container, and troubleshoot.
Again, you can list pods with
$ kubectl get pod
and pick any random pod from the list. Then run
$ kubectl delete pod <POD_NAME>
You'll see that the selected pod will terminate, but another one (with a slightly different name) will immediately pop up to replace it. You can delete pods all day, and new ones will constantly spin up to replace them.
For fun, you can open two terminals; in one, run `kubectl get pod -w` (`-w` for "watch"), and in the other, start deleting pods. You will see a feed in the first terminal of pods terminating, initializing, running, and so on.
At this point, all 3 of our components are running in Kubernetes. But we can't actually hit any of our services, since none of the pods are exposed to the outside world. In fact, even if we could hit, say, our publisher service, it wouldn't do us any good. The `main.js` file in publisher will attempt to enqueue a message at `http://localhost:3001`, which works great when we run all of our services on our laptop, but breaks when they run in isolated pods in a Kubernetes cluster (where each pod has its own separate notion of localhost).
If you are familiar with the 12-factor app philosophy, you'll know that it is best practice for an app to get its configuration from its environment. We no longer want our apps to hardcode `localhost` endpoints -- instead, we want to pass these endpoints in as environment variables, so the code can remain more flexible and robust. Kubernetes makes this very easy for us.
- Use the builtin `env` command to list the environment variables on the publisher pod:
$ kubectl exec <PUBLISHER_POD_NAME> env
- Next, expose the queue deployment to the rest of the cluster. Kubernetes allows pods to communicate with each other via a `ClusterIP` service, which is a cluster-wide IP address that any pod can reach. There are multiple types of services (we'll see `NodePort` services in the next step), but the `kubectl expose` command defaults to creating a `ClusterIP` service.
$ kubectl expose deployment queue --port=3001
`kubectl get svc` will show that this new service has been created successfully.
- Now, delete the current `publisher` pod so a new one spins up. When it is ready, check its environment variables. Do you notice the new values with which Kubernetes automatically populated the publisher's environment? Delete the `subscriber` pod and check its environment variables as well, to see that they get populated into each pod in the cluster. From now on, every pod in this cluster can expect to have these predictably formatted (if a little verbose . . .) environment variables populated by Kubernetes.
- Update the code in both publisher and subscriber to retrieve the queue's endpoint from the environment, as opposed to hardcoding it, using the `QUEUE_SERVICE_HOST` and `QUEUE_SERVICE_PORT` variables. In JavaScript, you can access environment variables via `process.env`.
- Build the Docker images with the updated code, create the deployments (don't forget to delete the old ones first!), and ensure that all pods are running successfully.
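The environment-variable change in the publisher (and similarly in the subscriber) might look something like this sketch, where everything except the `QUEUE_SERVICE_*` variables read from `process.env` is a hypothetical name:

```javascript
// Sketch: read the queue's endpoint from the environment instead of
// hardcoding localhost. QUEUE_SERVICE_HOST and QUEUE_SERVICE_PORT are
// injected by Kubernetes once the `queue` ClusterIP service exists;
// the fallbacks keep the app runnable outside the cluster.
const queueHost = process.env.QUEUE_SERVICE_HOST || 'localhost';
const queuePort = process.env.QUEUE_SERVICE_PORT || 3001;
const queueUrl = `http://${queueHost}:${queuePort}`;

console.log(`using queue at ${queueUrl}`);
```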
To see one possible solution, check out the branch `solution-internal-service`:
$ git checkout solution-internal-service
or view this commit
Now, our publisher and subscriber are able to discover the queue. But we still don't have a way of accessing the publisher and subscriber themselves from outside of our cluster, e.g. from our laptop's localhost. This will be necessary if we want to enqueue/dequeue messages without `exec`ing onto our pods each time.
To expose an app externally, we must create a `NodePort` service.
$ kubectl expose deployment publisher --type=NodePort --port=3000
This command causes Kubernetes to automatically allocate a random port on our machine, and then route all requests from that port on our laptop to the `publisher` pods at the `--port` we provide.
$ kubectl get svc publisher
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
publisher NodePort 10.103.107.240 <none> 3000:30233/TCP 2m
You will see that the publisher service was automatically assigned a CLUSTER-IP, and that it maps the ports `3000:30233`, the right-hand number having been randomly assigned by Kubernetes. This means that all requests to our laptop's `http://localhost:30233` will now be routed to the publisher app's port 3000. Run the following command, substituting the randomly assigned port you see mapped in the publisher service:
$ http POST :30233 name=Charlie position=Janitor
and ensure that it is successful.
Now go ahead and expose the subscriber with a NodePort service as well. Attempt to read a message.
If you successfully retrieved a message from a queue, that means you've officially deployed a distributed system to Kubernetes. Congratulations!
These stretch goals are just suggestions. If you think of anything cool to build out this system, feel free to jam on whatever interests you!
- Try writing the Kubernetes resources by hand, instead of generating them from kubectl. This is what you would do in a production environment.
- Make each component of our app listen on port 3000, instead of 3001 and 3002. You will need to update the existing Kube services to map the correct ports. Note how containers and Kubernetes services make it so we don't have to worry about allocating different localhost ports for our apps anymore!
- Write a script called `bin/build-and-deploy` for each component that will build the Docker image and deploy to our local Kubernetes cluster. Make sure to put it in `.dockerignore`. Then write another script that will build and deploy the entire 3-part system.
- Create a dedicated Kubernetes namespace for these apps, and move the deployments + services to that namespace.
- Create a Kubernetes secret, and have the queue verify that the secret is present in every request before it alters the queue.
- Make the queue a bit more persistent by flushing it to a mounted Volume.
- Use a real message queue (e.g. RabbitMQ) instead of our in-memory queue.
- Anything else that catches your fancy!
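As a starting point for the first stretch goal, here is a rough sketch of what the queue's `ClusterIP` service might look like written by hand. The field values are assumptions based on the `kubectl expose deployment queue --port=3001` command used earlier; in particular, the `run: queue` selector relies on the label that `kubectl run` puts on its pods:

```yaml
# queue-service.yaml -- approximate hand-written equivalent of
# `kubectl expose deployment queue --port=3001`
apiVersion: v1
kind: Service
metadata:
  name: queue
spec:
  type: ClusterIP
  selector:
    # `kubectl run <name>` labels its pods with `run: <name>`
    run: queue
  ports:
    # port the service exposes inside the cluster
    - port: 3001
      # port the queue container listens on
      targetPort: 3001
```

You would apply it with `kubectl apply -f queue-service.yaml`, and write similar files for the deployments and the `NodePort` services.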