Docker. Explained.

This README summarizes what Docker containers are and how they can be used in MLOps, DevOps, and data science.

What are Containers?

A container is a group of processes that run in isolation.

  • All processes must be able to run on the shared kernel.

Each container has its own set of namespaces (isolated view):

  • PID - process IDs.
  • USER - user and group IDs.
  • UTS - hostname and domain name.
  • MNT - mount points.
  • NET - network devices, stacks, ports.
  • IPC - inter-process communications, message queues.

Control groups (cgroups) limit and monitor the resources that a container uses.
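
On a Linux host, the namespaces that a process belongs to are exposed as symbolic links under /proc/self/ns. The following is a minimal sketch, assuming a Linux system with /proc mounted, that lists them for the current process. The function name list_namespaces is hypothetical and exists only for this illustration. Processes that share a namespace report the same identifier, and a process in another container reports a different one.

```python
import os

NS_DIR = "/proc/self/ns"  # assumption: Linux host with /proc mounted


def list_namespaces(ns_dir=NS_DIR):
    """Return {namespace_name: identifier} for the current process.

    Returns an empty dict on systems that do not expose namespace links.
    """
    namespaces = {}
    if not os.path.isdir(ns_dir):
        return namespaces
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target looks like "pid:[<number>]".
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Skip entries that cannot be read in restricted environments.
            continue
    return namespaces


if __name__ == "__main__":
    for name, identifier in list_namespaces().items():
        print(f"{name:10s} {identifier}")
```

Running the script once on the host and once inside a container should show different identifiers for namespaces such as pid, net, and mnt.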

VM (Virtual Machine) vs. Containers

  • Each VM has its own OS.
  • VMs are comparatively heavy and slow to start.
  • Containers share the host kernel.
  • Containers rely on Linux namespaces and do not include a full OS.
  • Containers start quickly because they are lightweight.

Containers do not replace VMs, and the choice is not one or the other. Containers can run on top of a VM and often reuse the infrastructure of an existing VM.

What is Docker?

  • At its core, Docker is tooling to manage containers:
    1. It simplified existing container technology and made it accessible to the masses.
  • Docker enables developers to use containers for their applications:
    1. Package dependencies with containers: build once, run anywhere.

Why are Containers appealing to Users?

  • No more "works on my machine".
  • Lightweight and fast.
  • Better resource utilization.
    1. A host can fit far more containers than VMs. You can run many containers on the same infrastructure without conflicts, which can reduce infrastructure costs.
  • Standard developer-to-operations interface.
  • Ecosystem and tooling.

Hands-on practice

Run first container

Use the Docker CLI to run your first container.

  1. Open a terminal on your local computer and run this command: docker container run -t ubuntu top
    The docker run command first starts a docker pull to download the Ubuntu image onto your host. After it is downloaded, it will start the container.
    top is a Linux utility that prints the processes on a system and orders them by resource consumption. Notice that there is only a single process in this output: it is the top process itself. You don't see other processes from the host in this list because of the PID namespace isolation.

  2. Containers use Linux namespaces to isolate system resources from other containers and from the host. The PID namespace isolates process IDs. If you run top inside the container, it shows only the processes within the container's PID namespace, which is very different from what you would see if you ran top on the host.
    Even though we are using the Ubuntu image, it is important to note that the container does not have its own kernel. It uses the kernel of the host and the Ubuntu image is used only to provide the file system and tools available on an Ubuntu system.
  3. Inspect the container: docker container exec
    This command allows you to enter a running container's namespaces with a new process.
  4. Open a new terminal. If you are using Play-with-Docker, click Add New Instance on the left, then ssh from node2 into node1 by using the IP address listed for node1.
  5. In the new terminal, get the ID of the running container that you just created: docker container ls
  6. Use that container ID to run bash inside that container by using the docker container exec command. Because you are using bash and want to interact with this container from your terminal, use the -it flags to run in interactive mode while allocating a pseudo-terminal:
    $ docker container exec -it b3ad2a23fab3 bash
    root@b3ad2a23fab3:/#
    You just used the docker container exec command to enter the container's namespaces with the bash process. Using docker container exec with bash is a common way to inspect a Docker container.

    Notice the change in the prefix of your terminal, for example, root@b3ad2a23fab3:/. This is an indication that you are running bash inside the container.

    Tip: This is not the same as using ssh to a separate host or a VM. You don't need an ssh server to connect with a bash process. Remember that containers use kernel-level features to achieve isolation and that containers run on top of the kernel. Your container is just a group of processes running in isolation on the same host, and you can use the command docker container exec to enter that isolation with the bash process. After you run the command docker container exec, the group of processes running in isolation (in other words, the container) includes top and bash.

  7. From the same terminal, inspect the running processes: ps -ef.
    You should see only the top process, bash process, and your ps process.
  8. For comparison, exit the container and run ps -ef or top on the host. These commands will work on Linux or Mac. For Windows, you can inspect the running processes by using tasklist.
    root@b3ad2a23fab3:/# exit
    exit
    $ ps -ef
    # Lots of processes!

    PID is just one of the Linux namespaces that isolate system resources for containers. Other Linux namespaces include:

    • MNT: Mount and unmount directories without affecting other namespaces.
    • NET: Containers have their own network stack.
    • IPC: Isolated interprocess communication mechanisms such as message queues.
    • User: Isolated view of users on the system.
    • UTS: Set hostname and domain name per container.

    These namespaces provide the isolation that allows containers to run together securely and without conflict with other containers running on the same system.

  9. Clean up the container running the top process by pressing <ctrl>-c in the terminal where it is running.

Docker Images

What is a Docker Image?

  • TAR file containing a container's filesystem and metadata.

Why do we create Docker Images?

  • For sharing and redistribution. Many containers can be created from a single Image. Images can be downloaded from Docker Hub and then used to create Docker containers.
  • To share Images, we can use a Docker Registry.
    • Push and pull Images from the Registry.
    • Default Registry: Docker Hub:
      • Public and free for public images
      • Many pre-packaged images available.
    • Private Registry:
      • Self-hosted or cloud provider options.

Creating a Docker Image - with Docker build

    Create a Dockerfile:
    • A list of instructions for how to construct the image.
    • Build the image from it: docker build -f Dockerfile . (the trailing . is the build context directory)

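Example Dockerfile based on the Ubuntu image:
$ cat Dockerfile
# Start from the Ubuntu base image.
FROM ubuntu
# Add the application binary to the root of the image filesystem.
ADD myapp /
# Document that the application listens on port 80.
EXPOSE 80
# Run the application when the container starts.
ENTRYPOINT /myapp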

Each instruction in the example above corresponds to a layer in the Docker Image (instructions such as EXPOSE and ENTRYPOINT only add metadata).

Secret Sauce: Docker Image layers

Layer             Kind
Thin R/W layer    Container layer
Image Layer 4     Image layer (R/O)
Image Layer 3     Image layer (R/O)
Image Layer 2     Image layer (R/O)
Image Layer 1     Image layer (R/O)
Ubuntu 15.04      Image layer (R/O)

Each layer is built on top of the previous one. If you change only the last instruction of the Dockerfile, the Docker engine reuses the first three layers from the cache and rebuilds only the last layer.

  • Image layers are read only.
  • Image layers are reused across multiple containers created from the same image.
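
The caching behavior described above can be modeled in a few lines. This is a simplified sketch, not Docker's actual implementation: it assumes that a layer is identified only by its parent layer and its instruction text, whereas the real builder also considers the contents of added files. The names layer_id and build, and the --verbose argument, are hypothetical.

```python
import hashlib


def layer_id(parent_id, instruction):
    """Derive a layer ID from the parent layer ID and the instruction text."""
    data = f"{parent_id}\n{instruction}".encode()
    return hashlib.sha256(data).hexdigest()[:12]


def build(instructions, cache):
    """Simulate a build; return (layer IDs, instructions that were rebuilt)."""
    parent, layers, rebuilt = "", [], []
    for instruction in instructions:
        current = layer_id(parent, instruction)
        if current not in cache:
            # Cache miss: this layer has to be built and is then remembered.
            cache.add(current)
            rebuilt.append(instruction)
        layers.append(current)
        parent = current
    return layers, rebuilt


cache = set()
v1 = ["FROM ubuntu", "ADD myapp /", "EXPOSE 80", "ENTRYPOINT /myapp"]
v2 = v1[:3] + ["ENTRYPOINT /myapp --verbose"]  # only the last line changes

_, first = build(v1, cache)
_, second = build(v2, cache)
print(first)   # all four instructions are built the first time
print(second)  # ['ENTRYPOINT /myapp --verbose']
```

Because each ID depends on its parent, changing an early instruction invalidates every layer after it, which is why frequently changing instructions are usually placed near the end of a Dockerfile.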

Docker Container Layers

  • Union File System
    • Merge image layers into single file system for each container.
  • Copy-on-Write
    • Files that are edited are copied up to the top writable layer.
    • Image layers stay read-only, which is why they can be reused across images and containers.
  • Advantages:
    • More containers per host (save money on infrastructure).
    • Faster start-up/download time - base layers are cached.
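
The union file system and copy-on-write ideas above can be illustrated with Python's collections.ChainMap, which looks keys up through a stack of mappings and sends all writes to the first one. This is only an analogy under simplifying assumptions: files are modeled as dictionary entries, and the names image_layers and new_container are hypothetical.

```python
from collections import ChainMap

# Shared image layers, top-most first (file path -> contents).
# They are treated as read-only: the code below never writes to them.
image_layers = [
    {"/myapp": "binary v1"},                             # layer from ADD
    {"/etc/os-release": "Ubuntu", "/bin/sh": "shell"},   # base layer
]


def new_container(layers):
    """Stack a private, empty writable layer on top of the shared layers."""
    return ChainMap({}, *layers)


c1 = new_container(image_layers)
c2 = new_container(image_layers)

# The write lands in c1's writable layer; the image layer is untouched.
c1["/etc/os-release"] = "edited in c1"

print(c1["/etc/os-release"])  # edited in c1
print(c2["/etc/os-release"])  # Ubuntu
print(c1.maps[0])             # {'/etc/os-release': 'edited in c1'}
```

Both containers reference the same layer objects, so adding a container costs only one small writable layer.
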

Docker Image layers

When you push an updated image, only the changed layer (here, the last layer of the Dockerfile) is uploaded, and the first line of the push output reports it as Pushed. All other layers are cached and already in the Registry.

Container Orchestration

What is Container Orchestration?

  • Cluster Management
  • Scheduling
  • Service Discovery
  • Replication
  • Health Management
  • Declare Desired State
    • Active Reconciliation
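
Declaring a desired state and actively reconciling toward it can be sketched as a comparison between what should run and what is running. This is a minimal illustration, not how any particular orchestrator is implemented. The function names reconcile and apply and the service names are hypothetical.

```python
def reconcile(desired, running):
    """Return the actions needed to move `running` toward `desired`.

    Both arguments map a service name to a replica count.
    """
    actions = []
    for service in sorted(set(desired) | set(running)):
        want = desired.get(service, 0)
        have = running.get(service, 0)
        if want > have:
            actions.append(("start", service, want - have))
        elif want < have:
            actions.append(("stop", service, have - want))
    return actions


def apply(actions, running):
    """Apply the actions to a copy of the running state and return it."""
    state = dict(running)
    for verb, service, count in actions:
        delta = count if verb == "start" else -count
        state[service] = state.get(service, 0) + delta
        if state[service] == 0:
            del state[service]
    return state


desired = {"web": 3, "worker": 2}
running = {"web": 1, "cache": 1}  # two web replicas missing, cache unwanted

actions = reconcile(desired, running)
print(actions)
# [('stop', 'cache', 1), ('start', 'web', 2), ('start', 'worker', 2)]
print(apply(actions, running))
# {'web': 3, 'worker': 2}
```

An orchestrator repeats this kind of comparison continuously, so a crashed container shows up as a difference between the two states and is replaced.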

Container Ecosystem Layers

Layer #   Purpose                                        Examples
Layer 6   Development Workflow, Opinionated Containers   OpenShift, Deis
Layer 5   Orchestration/Scheduling, Service Model        Kubernetes, Marathon
Layer 4   Container Engine                               Docker, Rocket
Layer 3   Operating System                               Ubuntu, Red Hat
Layer 2   Virtual Infrastructure                         VMware, AWS EC2
Layer 1   Physical Infrastructure                        Raw Compute, Network, Storage

Hosted Solutions:

  • IBM Cloud Container Service
  • Amazon ECS
  • Azure Containers
  • Google Container Engine