/gke-migration-to-containers

This demo provides a basic walkthrough of migrating a stateless application from running on a VM all the way to running it on Kubernetes Engine (GKE).

Primary LanguageShellApache License 2.0Apache-2.0

Migrating to Containers

Containers are quickly becoming an industry standard for deployment of software applications. The business and technological advantages of containerizing workloads are driving many teams towards moving their applications to containers. This demo provides a basic walkthrough of migrating a stateless application from running on a VM to running on Kubernetes Engine (GKE). It demonstrates the lifecycle of an application transitioning from a typical VM/OS-based deployment to three different containerized cloud infrastructure platforms.

Table of Contents

Introduction

There are numerous advantages to using containers to deploy applications. Among these are:

  1. Isolated - Applications have their own libraries; no conflicts will arise from different libraries in other applications.

  2. Limited (limits on CPU/memory) - Applications may not hog resources from other applications.

  3. Portable - The container contains everything it needs and is not tied to an OS or Cloud provider.

  4. Lightweight - The kernel is shared, making it much smaller and faster than a full OS image.

This project demonstrates migrating a simple Python application named Prime-flask to:

  1. A legacy deployment (Debian VM) where Prime-flask is deployed as the only application, much like a traditional application is run in an on-premises datacenter

  2. A containerized version deployed on Container-Optimized OS (COS)

  3. A Kubernetes deployment where Prime-flask is exposed via a load balancer and deployed in Kubernetes Engine

After the deployment you'll run a load test against the final deployment and scale it to accommodate the load.

Architecture

Configuration 1: screenshot

Configuration 2: screenshot

Configuration 3: screenshot

A simple Python Flask web application (Prime-flask) was created for this demonstration which contains two endpoints:

http://<ip>:8080/factorial/ and

http://<ip>:8080/prime/

Examples of use would look like:

curl http://35.227.149.80:8080/prime/10
The sum of all primes less than 10 is 17

curl http://35.227.149.80:8080/factorial/10
The factorial of 10 is 3628800

Also included is a utility to validate a successful deployment.

Prerequisites

Run Demo in a Google Cloud Shell

Click the button below to run the demo in a Google Cloud Shell.

Open in Cloud Shell

All the tools for the demo are installed. When using Cloud Shell execute the following command in order to setup gcloud cli. When executing this command please setup your region and zone.

gcloud init

Get The Code

Tools

In order to use the code in this demo you will need access to the following tools:

Install Cloud SDK

The Google Cloud SDK is used to interact with your GCP resources. Installation instructions for multiple platforms are available online.

Install kubectl CLI

The kubectl CLI is used to interteract with both Kubernetes Engine and kubernetes in general. Installation instructions for multiple platforms are available online.

Install Terraform

Terraform is used to automate the manipulation of cloud infrastructure. Its installation instructions are also available online.

Authenticate gcloud

Prior to running this demo, ensure you have authenticated your gcloud client by running the following command:

gcloud auth application-default login

Deployment

The infrastructure required by this project can be deployed by executing:

make create

This will:

  1. Package the deployable Prime-flask application.
  2. Create the container image and push it to the private Container Registry (GCR) for your project.
  3. Generate an appropriate configuration for Terraform.
  4. Execute Terraform which creates the three deployments.

screenshot

screenshot

screenshot

Validation

Now that the application is deployed, we can validate these three deployments by executing:

make validate

A successful output will look like this:

Validating Debian VM Webapp...
Testing endpoint http://35.227.149.80:8080
Endpoint http://35.227.149.80:8080 is responding.
**** http://35.227.149.80:8080/prime/10
The sum of all primes less than 10 is 17
The factorial of 10 is 3628800

Validating Container OS Webapp...
Testing endpoint http://35.230.123.231:8080
Endpoint http://35.230.123.231:8080 is responding.
**** http://35.230.123.231:8080/prime/10
The sum of all primes less than 10 is 17
The factorial of 10 is 3628800

Validating Kubernetes Webapp...
Testing endpoint http://35.190.89.136
Endpoint http://35.190.89.136 is responding.
**** http://35.190.89.136/prime/10
The sum of all primes less than 10 is 17
The factorial of 10 is 3628800

Of course, the IP addresses will likely differ for your deployment.

Load Testing

In a new console window, execute the following, replacing [IP_ADDRESS] with the IP address and port from your validation output from the previous step. Note that the Kubernetes deployment runs on port 80, while the other two deployments run on port 8080:

ab -c 120 -t 60  http://<IP_ADDRESS>/prime/10000

ApacheBench (ab) will execute 120 concurrent requests against the provided endpoint for 1 minute. The demo application's replica is insufficiently sized to handle this volume of requests.

This can be confirmed by reviewing the output from the ab command. A Failed requests value of greater than 0 means that the server couldn't respond successfully to this load:

screenshot

One way to ensure that your system has capacity to handle this type of traffic is by scaling up. In this case, we would want to scale our service horizontally.

In our Debian and COS architectures, horizontal scaling would include:

  1. Creating a load balancer.
  2. Spinning up additional instances.
  3. Registering them with the load balancer.

This is an involved process and is out of scope for this demonstration.

For the third (Kubernetes) deployment the process is far easier:

kubectl scale --replicas 3 deployment/prime-server

After allowing 30 seconds for the replicas to initialize, re-run the load test:

ab -c 120 -t 60  http://<IP_ADDRESS>/prime/10000

Notice how the Failed requests is now 0. This means that all of the 10,000+ requests were successfully answered by the server:

screenshot

Tear Down

When you are finished with this example you will want to clean up the resources that were created so that you avoid accruing charges:

$ make teardown

It will run terraform destroy which will destroy all of the resources created for this demonstration.

screenshot

More Info

For additional information see: Embarking on a Journey Towards Containerizing Your Workloads

Troubleshooting

Occasionally the APIs take a few moments to complete. Running make validate immediately could potentially appear to fail, but in fact the instances haven't finished initializing. Waiting for a minute or two should resolve the issue.

The setup of this demo does take up to 15 minutes. If there is no error the best thing to do is keep waiting. The execution of make create should not be interrupted.

If you do get an error, it probably makes sense to re-execute the failing script. Occasionally there are network connectivity issues, and retrying will likely work the subsequent time.

This is not an officially supported Google product