Docker 2024

Timm Heuss

April 2024

What's Docker


Virtual Machine

block-beta
columns 2
app1["My App 1"] app2["My App 2"]
d1["Dependencies 1"] d2["Dependencies 2"]
ul1["Guest Userland"] ul2["Guest Userland"]
k1["Guest Kernel"] k2["Guest Kernel"]
i1["Guest Infrastructure"] i2["Guest Infrastructure"]
vm["Virtual Machine"]:2
k["Host Kernel"]:2
i["Host Infrastructure"]:2
  • Virtualised hardware
  • OS kernel and drivers

Docker

block-beta
columns 2
app1["My App 1"] app2["My App 2"]
d1["Dependencies 1"] d2["Dependencies 2"]
block:b:2
  Base["Base image"]
  space
end
Docker["Docker Engine"]:2
k["Host Kernel"]:2
i["Host Infrastructure"]:2
  • Containers do not run their own OS kernel; they share the host kernel
  • Layered, caching file system

Docker Vocabulary

flowchart LR
    Dockerfile --> build --> image["Docker Image"] --> run --> container["Docker Container"]
   image --> push --> registry[Docker Registry]
   registry --> pull --> image
   compose[compose.yml] --> services
   services --> service
   compose --> image
   service --> image
   service --> up --> container
   

Golden Rules

  • Reduce the size of what we ship, because a smaller image means
    • less storage needed
    • faster transfers
    • a smaller attack surface
  • Reduce build time
    • only fast builds get executed frequently
  • Make builds reproducible
    • nobody has time to debug builds that differ from run to run

Docker

Building and running images


Typical file names

Dockerfile

FYI: There are things you just have to know 🤨


Base image defines the package manager

FROM ubuntu # -> apt
FROM alpine # -> apk
FROM scratch # -> no package manager 
FROM python # -> You cannot tell without looking into it

Some images use tags to indicate their base image


FROM python:3.13-rc-alpine3.18 # -> alpine -> apk
FROM python:3.13-rc-bookworm # -> ???
  • "bookworm" is a Debian version
  • Debian uses apt
FROM python:3.13-rc-bookworm # -> apt

Installing properly can be tricky

RUN apt update && apt install -y git
  • no pinned version for git
  • keeps the apt cache in the layer
  • potentially installs unwanted packages

👎


Installing properly with apt

RUN   apt-get update && \
      apt-get -y install --no-install-recommends \
      git=1.2.25.1 \
      && rm -rf /var/lib/apt/lists/*
  • pinned version for git
  • removes the cache within the same layer before commit
  • does not install additional software

👍
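
The same rules carry over to Alpine-based images, where apk takes the place of apt. A minimal sketch, assuming an alpine base image; the version is left unpinned here because the exact package version depends on the Alpine release in use:

```dockerfile
# Assumption: plain alpine base image, chosen only for illustration.
FROM alpine

# --no-cache fetches the package index on the fly and does not store it
# in the layer, so no separate cleanup step is needed.
RUN apk add --no-cache git
```

To pin the version as in the apt example, apk accepts the form git=<version>, with the version string taken from the package index of the chosen Alpine release.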


Hands on

🙌


Let's start with this Dockerfile

FROM ubuntu

RUN apt-get update && apt-get install -y git apache2

WORKDIR /var/www/html/

RUN  git clone https://github.com/oscarmorrison/md-page 
RUN  git clone https://github.com/artslob/fallout-svg

COPY . ./

RUN  echo '<script src="md-page/md-page.js"></script><noscript>' > 'index.html' && \
   cat "readme.md" >> 'index.html'

EXPOSE 80

CMD ["apache2ctl", "-D", "FOREGROUND"]

Basic CLI commands

docker build . -t hello
docker images
docker run --publish 80:80 hello
docker ps
docker stop <container> # or: docker kill <container>

Dive into it

dive command on simple Dockerfile example

251MB image

🤔 apt caches, git repos, tons of binaries


COPY . .

COPY . ./

Invalidates cache on every change in the folder.

👎


COPY readme.md ./

Invalidates cache only when readme.md is changed.

👍


Introduce Multi-Stage

FROM ubuntu AS base

FROM base AS build
[...]

FROM base AS runtime
COPY --from=build /workdir/md-page/md-page.js ./md-page/md-page.js
COPY --from=build /workdir/fallout-svg/vault-boy.svg ./fallout-svg/vault-boy.svg
COPY --from=build /workdir/index.html ./
[...]

Distinguish between build and runtime dependencies
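
Putting the pieces together, a complete two-stage version of the starting Dockerfile could look like the sketch below. The contents of both stages are assumptions pieced together from the starting Dockerfile and the COPY --from lines above; they are not the elided original.

```dockerfile
FROM ubuntu AS base

# Build stage: git is only needed here and never reaches the final image.
# ca-certificates is added because --no-install-recommends would otherwise
# leave git without the CA bundle it needs for https clones.
FROM base AS build
RUN apt-get update && \
    apt-get install -y --no-install-recommends git ca-certificates && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /workdir
RUN git clone https://github.com/oscarmorrison/md-page && \
    git clone https://github.com/artslob/fallout-svg
COPY readme.md ./
RUN echo '<script src="md-page/md-page.js"></script><noscript>' > index.html && \
    cat readme.md >> index.html

# Runtime stage: only the web server plus the three files it serves.
FROM base AS runtime
RUN apt-get update && \
    apt-get install -y --no-install-recommends apache2 && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /var/www/html/
COPY --from=build /workdir/md-page/md-page.js ./md-page/md-page.js
COPY --from=build /workdir/fallout-svg/vault-boy.svg ./fallout-svg/vault-boy.svg
COPY --from=build /workdir/index.html ./
EXPOSE 80
CMD ["apache2ctl", "-D", "FOREGROUND"]
```

The cloned repositories and git itself stay behind in the build stage, which is where most of the size reduction comes from.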


We can specify the build target

FROM ubuntu AS base

FROM base AS build
[...]

FROM base AS runtime
[...]
docker build . # builds the last stage ("runtime") by default
docker build . --target build # stops after the stage "build"

Be clever with base images

FROM base AS build

RUN apt-get update && apt-get install -y git

Manual installation, waste of time 👎


FROM alpine/git AS build

With the right base image, we download a preinstalled git 👍


Be even cleverer with base images

FROM base AS build

RUN apt-get update && apt-get install -y apache2

⬇️ Better ⬇️

FROM httpd

WORKDIR /usr/local/apache2/htdocs/

Go crazy with base images

FROM alpine/git AS git1
[...]

FROM alpine/git AS git2
[...]

FROM ubuntu AS build
[...]

FROM httpd
[...]

Same functionality

hello   latest    b7733e03e9ac   10 minutes ago     178MB

178 MB (-30%)


Do we really need Apache?

FROM joseluisq/static-web-server

WORKDIR /public

Optimize Stages and Base Images

| scenario | image size (MB) | build time (s) |
| --- | --- | --- |
| starting point | 258 | 170 |
| multi-stage, same base images | 178 (-30%) | 2 (-98%) |
| multi-stage, more suitable base images | 8 (-97%) | 1 (-99%) |

reduced build times, reduced waiting times, less resource consumption, lower transferring times, reduced attack surface, ...


Reproducibility - Base Images

FROM httpd:2.4.59 AS httpd
FROM alpine/git:2.43.0 AS git

Reproducibility - Git

WORKDIR /workdir/md-page
# --hard also resets the working tree, so the files match the pinned commit
RUN git clone https://github.com/oscarmorrison/md-page . && \
    git reset --hard 36eef73bbbd35124269f5a8fea3b5117cd7a91a3
WORKDIR /workdir/fallout-svg
RUN git clone https://github.com/artslob/fallout-svg . && \
    git reset --hard d1dad0950073bdef8cac463f8a87246f45af0ca0

Multi-Arch - Why do we care?

Reminder: We're not simulating hardware in containers


Let Docker decide the arch

docker pull nginx

Docker pulls nginx with the right CPU architecture for the host, with a fallback to amd64


Overriding the architecture

FROM --platform=linux/amd64 node:18-slim

This is useful, e.g., for Playwright, whose runtime dependencies are not available for all platforms.
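
To check which platform a build actually resolves to, the predefined platform arguments can be printed during the build. A minimal sketch, assuming a BuildKit-based builder and an alpine base image chosen only for illustration:

```dockerfile
FROM alpine

# BUILDPLATFORM and TARGETPLATFORM are set by BuildKit automatically,
# but must be declared with ARG before they can be used inside a stage.
ARG BUILDPLATFORM
ARG TARGETPLATFORM

RUN echo "building on $BUILDPLATFORM for $TARGETPLATFORM"
```

Running docker build --progress=plain --platform linux/arm64 . should then report linux/arm64 as the target platform, regardless of the host architecture.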

Docker Compose

Orchestrating multi-container deployments


Typical file names

compose.yaml (preferred)
compose.yml
docker-compose.override.yml
docker-compose.override.yaml
docker-compose.yaml
docker-compose.yml

Compose our previous example

services:
  service:
    build: .
    ports:
      - "80:80"

Example with two services

version: "3.2"
services:
  app:
    image: 'docker-spring-boot-postgres:latest'
    build: .
    depends_on:
      - db
    environment:
      - SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/compose-postgres
      - SPRING_DATASOURCE_USERNAME=compose-postgres
      - SPRING_DATASOURCE_PASSWORD=compose-postgres
  db:
    image: 'postgres:13.1-alpine'
    environment:
      - POSTGRES_PASSWORD=compose-postgres

Docker Compose in a nutshell

  • Essential for multiple services
    • easy execution
    • easy configuration
    • the right abstraction for developers to describe their application

BTW: Compose is now built-in

old

docker-compose

new 🙌

docker compose

BTW: You don't need version

version: "3.2" # <- optional
services:
  app:
    [...]

The [version property] is [...] for backward compatibility. It is only informative.

https://github.com/compose-spec/compose-spec/blob/master/04-version-and-name.md


Scaling made easy

services:
  service:
    deploy:
      replicas: 5

More on that in my talk: Polyglot | scalable | observable news analysis


You can use YAML fragments 😍

x-build: &build
  x-bake:
    platforms:
      - linux/amd64
      - linux/arm64

services:
  keyword-matcher-go:
    image: ghcr.io/heussd/nats-news-analysis/keyword-matcher-go:latest
    build:
      <<: *build
      context: keyword-matcher-go/.

Bind mounts

services:
  service:
    build: .
    ports:
      - "80:80"
    volumes:
      - ./folder:/usr/local/apache2/htdocs/folder/

Make a folder on your host system available inside the container


Volumes

services:
  service:
    build: .
    ports:
      - "80:80"
    volumes:
      - cache:/usr/local/[...]
volumes:
  cache:

Persist a folder across container executions


RUN-specific mounts


COPY go.mod go.sum ./
RUN go mod download -x
  • COPY just for the package manager
  • Package manager cache is not persisted

👎


RUN --mount=type=cache,target=/go/pkg/mod/ \
    --mount=type=bind,source=go.sum,target=go.sum \
    --mount=type=bind,source=go.mod,target=go.mod \
    go mod download -x
  • No COPY just for the package manager
  • Package manager cache is persisted across builds

👍

Docker bake

Expert Docker building


Typical file names

docker-bake.json
docker-bake.override.json 
docker-bake.hcl (preferred)
docker-bake.override.hcl

Example

variable "HOME" {
  default = null
}

group "default" {
  targets = ["all"]
}

target "all" {
  platforms = [
    "linux/amd64",
    "linux/ppc64le"
  ]
  labels = {
    "org.opencontainers.image.source" = "https://github.com/username/myapp"
    "com.docker.image.source.entrypoint" = "Dockerfile"
  }
}
docker buildx bake # build multiple images with all labels

Secrets

... or just use environment variables, I'm not your boss.


Secrets during build: secret mounts

docker build \
  --secret id=mytoken,src=$HOME/.aws/credentials \
  .

access during the build, inside a RUN step that mounts the secret:

TOKEN=$(cat /run/secrets/mytoken)
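
On the Dockerfile side, the secret is only visible to a RUN instruction that mounts it explicitly. A minimal sketch, assuming BuildKit and the secret id mytoken from the command above; the alpine base image and the final check are illustrative only:

```dockerfile
# Assumption: any base image with a shell works; alpine keeps the sketch small.
FROM alpine

# The secret is mounted at /run/secrets/mytoken for this RUN step only
# and is not written into any image layer.
RUN --mount=type=secret,id=mytoken \
    TOKEN=$(cat /run/secrets/mytoken) && \
    test -n "$TOKEN"
```

Because the mount exists only for the duration of that step, later instructions and the final image do not see the file.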

bake is more convenient here

variable "HOME" {
  default = null
}

target "default" {
  secret = [
    "id=mytoken,src=${HOME}/.aws/credentials"
  ]
}

access during the build, inside a RUN step that mounts the secret:

TOKEN=$(cat /run/secrets/mytoken)

Secrets during runtime

services:
  service:
    secrets:
      - mytoken
secrets:
  mytoken:
    file: ./my_secret.txt

access inside container:

TOKEN=$(cat /run/secrets/mytoken)

People seem to prefer environment variables for secrets

Docker Pitfalls


Meanings of latest

  • latest is just a tag; nothing keeps it up to date automatically
  • latest has no common meaning on Docker Hub
docker pull ubuntu:latest # <- Pulls latest stable LTS
docker pull swaggerapi/swagger-ui:latest # <- Pulls latest nightly

Different Runs

RUN in Dockerfile

FROM ubuntu
RUN whalesay "OMG" # runs code during image build

run on CLI

docker run ubuntu # runs a container

ENTRYPOINT vs. CMD vs. RUN

| Dockerfile statement | build phase | run phase | purpose |
| --- | --- | --- | --- |
| RUN | ✅ | | Execute a command, commit the result into the image |
| ENTRYPOINT | | ✅ | Specify what to do when the container is executed |
| CMD | | ✅ | Augment ENTRYPOINT with additional parameters |
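
The interplay of the three statements can be sketched in a few lines. This is an illustrative example, not taken from the talk; the alpine base image and the file /build-info.txt are assumptions.

```dockerfile
FROM alpine

# RUN executes at build time; its result is committed into the image.
RUN echo "built" > /build-info.txt

# ENTRYPOINT is the fixed part of the container's command line.
ENTRYPOINT ["echo", "Hello"]

# CMD supplies default arguments; arguments given to docker run replace them.
CMD ["world"]
```

Built under a hypothetical tag such as hello-args, docker run hello-args prints "Hello world", while docker run hello-args Docker prints "Hello Docker".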

Arguments vs. Commands

  • ARG values are variables that exist only during the build phase.
  • Runtime arguments can be specified using CMD (Dockerfile) or command (Compose).
  • Docker commands are command-line parameters to the Docker binary (such as docker images).
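
The difference between build-time and runtime variables can be made visible with a small sketch. The names APP_VERSION and APP_VERSION_ENV are hypothetical and chosen only for illustration, as is the alpine base image.

```dockerfile
FROM alpine

# ARG: only available while the image is built; override with --build-arg.
ARG APP_VERSION=dev
RUN echo "building version $APP_VERSION"

# ENV: stored in the image and visible to the running container.
ENV APP_VERSION_ENV=$APP_VERSION

# At runtime APP_VERSION is unset, while APP_VERSION_ENV still holds the value.
CMD ["sh", "-c", "echo ARG=${APP_VERSION:-unset} ENV=$APP_VERSION_ENV"]
```

With the default value, a container started from this image should report the ARG as unset and the ENV value as dev.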

Multi stage targets

FROM git as build # <- This is referenced as "target",
                              # not as a "stage"

WORKDIR /workdir

COPY readme.md ./
docker build . --target build

Docker build vs. Docker compose build

Respected during image build

ENV MSG="Hello world"
RUN echo $MSG

Not respected during image build:

service:
    environment:
        - MSG=Hello World

ARG vs. ENV vs. .env vs. env_file

Hierarchy and Scope of Variables in Docker


The many mounts

| ... Mount | Purpose | Compose | Docker CLI / Dockerfile |
| --- | --- | --- | --- |
| Bind mount | Access external files / folders | volumes | docker run -v |
| Bind mount | ... during a RUN step | | RUN --mount=type=bind |
| Volume mount | Access internal volumes | volumes | docker run -v |
| Secret mount | Access secrets | secrets | docker secret / docker build --secret |
| Cache mount | Cache some paths for RUN steps | | RUN --mount=type=cache |

Devcontainers


Pre-Devcontainer Development

block-beta
columns 4

space space space app
space space space ide["IDE"]
space space space dev["Dev tools"] 
db["Additional containers"] space space interpreter["Code Interpreter"] 

Docker["Docker Engine"]:2 space Win["Host OS / Applications"]
k["Host Kernel"]:4
i["Host Infrastructure"]:4

style ide fill:red
style dev fill:red
style interpreter fill:red

Reproducibility challenges with dev tools and code interpreter.


Devcontainer Development

block-beta
columns 4

space app space space
space dev["Dev tools"] space space
db["Additional containers"] interpreter["Code Interpreter"] space IDE

Docker["Docker Engine"]:2 space Win["Host OS / Applications"]
k["Host Kernel"]:4
i["Host Infrastructure"]:4

IDE --> interpreter

style dev fill:green
style interpreter fill:green

Machine-readable reproducibility for dev tools and code interpreter.


Cloud-based Devcontainer Development

block-beta
columns 5

space app space:3
space dev["Dev tools"] space:3
db["Additional containers"] interpreter["Code Interpreter"] cs["Codespaces / Coder"] space Browser

Docker["Docker Engine"]:2 space:2 Win["Host OS / Applications"]
Cloud:3 space k["Host Kernel"]

style dev fill:green
style interpreter fill:green
style cs fill:blue
style Browser fill:blue

Browser --> cs

A minimal devcontainer defines one of the following

  • prebuilt base image such as mcr.microsoft.com/devcontainers/base:ubuntu
  • local Dockerfile
  • local compose.yml

These cannot be mixed, so fall back to compose.yml if the other two options do not work.


| | Dockerfile | compose.yml | devcontainer.json |
| --- | --- | --- | --- |
| Runtime environment | Base image | Base image, Dockerfile | Base image, Dockerfile, compose.yml |
| Support services / databases | | services | |
| Build dependencies | Install in layer | | |
| Additional dev / convenience tooling | | | Features |
| IDE settings and plugins | | | for supported IDEs |

devcontainer.json

{
  "build": {
    "dockerfile": "../Dockerfile",
    "target": "dev"
  },
  "features": {
    "ghcr.io/shinepukur/devcontainer-features/vale:1": {},
    "ghcr.io/devcontainers/features/git:1": {}
  },
  "customizations": {
    "vscode": {
      "settings": {
        "dotfiles.repository": "https://github.com/heussd/dotfiles",
        "dotfiles.targetPath": "~/.dotfiles",
        "dotfiles.installCommand": ".install.sh"
      },
      "extensions": [
        "streetsidesoftware.code-spell-checker",
        "streetsidesoftware.code-spell-checker-german",
        "DavidAnson.vscode-markdownlint",
        "bierner.markdown-mermaid"
      ]
    }
  },
  "postStartCommand": "node /app/bin/reveal-md.js . --watch"
}

  1. build: Base image / Dockerfile / compose.yml
  2. features: Additional development-only tooling
  3. settings: IDE settings, dotfiles
  4. extensions: IDE addons
  5. postStartCommand: What to do after devcontainer has started

Useful Tools


Dive

dive

Exploring a docker image and layer contents

https://github.com/wagoodman/dive


hadolint

hadolint

Dockerfile linter that also validates inline bash

https://github.com/hadolint/hadolint


ctop

ctop

top-like interface for container metrics

https://github.com/bcicen/ctop


lazydocker

lazydocker

Terminal UI for both docker and docker-compose

https://github.com/jesseduffield/lazydocker

Thanks

https://github.com/heussd/docker-2024