docker/build-push-action

How to build a multi-platform image on different runners

miki725 opened this issue · 18 comments

Troubleshooting

Before submitting a bug report please read the Troubleshooting doc.

✅ went through the doc

Behaviour

I am building the same image on different runners:

    runs-on: ${{ matrix.builder }}

    strategy:
      matrix:
        include:
          - builder: ubuntu-latest
            platform: linux/amd64
          - builder: buildjet-8vcpu-ubuntu-2204-arm
            platform: linux/arm64

The reason for using multiple runners is that QEMU emulation can't compile clang-tidy in time on regular ubuntu-latest runners, and the job times out after 6 hours.

The Docker build step specifies which platform it is building:

      - name: Build and Push Docker Image
        uses: docker/build-push-action@v2
        with:
          ...
          platforms: ${{ matrix.platform }}
          ...

Each runner then builds its own platform and pushes to GitHub Container Registry. However, the final image you pull from GitHub Container Registry is always for a single architecture and does not have a manifest list:

➜ docker buildx imagetools inspect ghcr.io/crashappsec/clang-tidy:14.0.6
Name:      ghcr.io/crashappsec/clang-tidy:14.0.6
MediaType: application/vnd.docker.distribution.manifest.v2+json
Digest:    sha256:bcd9f4a8a798f758d9a908b2541f437473f842346293ec4c95ced40105265d2c

What is the correct way to build a multi-platform image on different runners? I don't see any appropriate flag for that in the README.

Steps to reproduce this issue

  1. Build and push the image from a different runner for each platform.
  2. Check whether the resulting package is multi-platform.

Expected behaviour

The final GitHub package should be multi-platform.
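
In other words, docker buildx imagetools inspect should report a manifest list with one entry per platform, roughly like this (illustrative output only; digests elided):

Name:      ghcr.io/crashappsec/clang-tidy:14.0.6
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest:    sha256:…

Manifests:
  Name:      ghcr.io/crashappsec/clang-tidy:14.0.6@sha256:…
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/amd64

  Name:      ghcr.io/crashappsec/clang-tidy:14.0.6@sha256:…
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm64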

Actual behaviour

In my case, since the ARM build takes longer, its push lands last and the final image is ARM-only.

Configuration

name: docker image

on:
  workflow_dispatch:
    inputs:
      CLANG_TIDY_VERSION:
        description: "clang-tidy version"
        required: true
        default: "14.0.6"
        type: string

jobs:
  release:
    runs-on: ${{ matrix.builder }}

    strategy:
      matrix:
        include:
          - builder: ubuntu-latest
            platform: linux/amd64
          - builder: buildjet-8vcpu-ubuntu-2204-arm
            platform: linux/arm64

    steps:
      - name: Checkout Code
        uses: actions/checkout@v1

      # - name: Set up QEMU
      #   uses: docker/setup-qemu-action@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ github.token }}

      - name: Build and Push Docker Image
        uses: docker/build-push-action@v2
        with:
          file: Dockerfile
          push: true
          platforms: ${{ matrix.platform }}
          build-args: |
            CLANG_TIDY_VERSION=${{ inputs.CLANG_TIDY_VERSION }}
          tags: |
            ghcr.io/${{ github.repository_owner }}/clang-tidy:latest
            ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}

Logs

logs_11.zip

I found an action that can manually push the manifest; however, it would be nice if there were an official way to do that via the build action directly:

https://github.com/Noelware/docker-manifest-action
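
For reference, the manual equivalent with plain docker manifest commands (a sketch only, assuming each runner instead pushed an arch-suffixed tag, e.g. ...-amd64 / ...-arm64) would be roughly:

      - name: Create and push manifest list
        run: |
          # assumes the two per-arch tags were already pushed by the platform-specific runners
          docker manifest create ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }} \
            --amend ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}-amd64 \
            --amend ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}-arm64
          docker manifest push ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}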

acuD1 commented

I have the same issue. Any update on the matter?

I just tried to use a matrix to build my ResourceSpace container in six different pipelines, as it saves a lot of time for each image to finish. But as reported, each finished pipeline overwrites the result of the previously finished one, and I end up with just the riscv64 image. It would be great if the tags section could support some kind of platform/architecture tagging.

Just to demonstrate the benefit of proper matrix support:

With matrix: [screenshot]

Without matrix: [screenshot]

@crazy-max - any thoughts on this?

As ARM builds become more popular, the QEMU-based builds are just way too painfully slow. Having proper matrix support here would be fantastic!

I was running into this problem as well, and after a bit of searching I landed on building and pushing each of my desired architectures and then creating the manifest in a subsequent job of the workflow. I have an ARM device and an amd64 device, so I build on the appropriate architecture, which seems to rule out using matrix builds.

[screenshot]

It's not nearly as clean as it looked when I was using the matrix approach... but it seems to work alright.

jobs:
  build-push-arm7:
    runs-on: [self-hosted, ARM64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/arm/v7
          push: true  # needed so the manifest job below can reference these tags in the registry
          tags: |
            my-registry/my-repo:latest-armv7
            my-registry/my-repo:${{ github.sha }}-armv7

  build-push-arm64:
    runs-on: [self-hosted, ARM64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/arm64
          push: true
          tags: |
            my-registry/my-repo:latest-arm64
            my-registry/my-repo:${{ github.sha }}-arm64

  build-push-x64:
    runs-on: [self-hosted, X64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/amd64
          push: true
          tags: |
            my-registry/my-repo:latest-amd64
            my-registry/my-repo:${{ github.sha }}-amd64

  create-manifests:
    runs-on: [self-hosted]
    needs: [build-push-x64, build-push-arm7, build-push-arm64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Create SHA manifest and push
        run: |
          docker manifest create \
            my-registry/my-repo:${{ github.sha }} \
            --amend my-registry/my-repo:${{ github.sha }}-amd64 \
            --amend my-registry/my-repo:${{ github.sha }}-armv7 \
            --amend my-registry/my-repo:${{ github.sha }}-arm64
          docker manifest push my-registry/my-repo:${{ github.sha }}

      - name: Create latest manifest and push
        run: |
          docker manifest create \
            my-registry/my-repo:latest \
            --amend my-registry/my-repo:latest-amd64 \
            --amend my-registry/my-repo:latest-armv7 \
            --amend my-registry/my-repo:latest-arm64
          docker manifest push my-registry/my-repo:latest

https://dev.to/aws-builders/using-docker-manifest-to-create-multi-arch-images-on-aws-graviton-processors-1320 has a good write-up of Chris' style of fixing this. The one thing that sucks for me is that the docker manifest create steps don't read tags from docker/metadata-action.
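
One way to bridge that gap - sketched here, assuming the per-arch images were pushed with -amd64/-armv7/-arm64 suffixes as in the workflow above - is to loop over the newline-separated tags output of docker/metadata-action:

      - name: Docker metadata
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: my-registry/my-repo

      - name: Create and push manifest lists for every metadata tag
        run: |
          # steps.meta.outputs.tags is newline-separated, so the shell loop splits on it
          for tag in $(echo "${{ steps.meta.outputs.tags }}"); do
            docker manifest create "$tag" \
              --amend "${tag}-amd64" \
              --amend "${tag}-armv7" \
              --amend "${tag}-arm64"
            docker manifest push "$tag"
          done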

PS: you can get commercially hosted ARM runners from https://gitrunners.com. I just started trialling them, and they seem to be pretty decent.

We have actuated customers running split builds on native Arm too. My example uses two separate steps followed by a final one to publish a manifest.

What exactly was the issue with the matrix build? The de-duplication is rather nice.

The two issues I encountered are:

  1. docker manifest create is not yet supported by this action, so it needs manual integration with e.g. docker/metadata-action outputs, which is not complicated, but certainly more so than wiring up existing actions.
  2. docker manifest create doesn't deal well (at all) with images that already have a manifest, like anything with attestations attached. Turning attestations off is not great, but does allow merging the images into a multi-arch one (see the sketch below).
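
A minimal sketch of that workaround, assuming a version of build-push-action recent enough to support the provenance input (tags and registry names are illustrative):

      - name: Build and push (attestations disabled)
        uses: docker/build-push-action@v4
        with:
          context: .
          platforms: linux/arm64
          push: true
          # emit a plain single-platform manifest so `docker manifest create --amend` accepts it
          provenance: false
          tags: my-registry/my-repo:${{ github.sha }}-arm64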

We have published new documentation on distributing builds across multiple runners: https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners

Of course, the best option is to use native nodes and avoid emulation entirely. This can be done when configuring your builder: https://docs.docker.com/build/ci/github-actions/configure-builder/. See https://github.com/docker/packaging/blob/2c95ad0ca93ea91a01755b01e9a979adec955540/.github/workflows/.release.yml#L68-L89 as an example.

ooo, docker buildx imagetools create looks much more capable than docker manifest create, thanks for the update!

Yes, and with this workflow it also pushes by digest, which avoids noisy tags on your registry such as myimage:latest-amd64, myimage:latest-arm64 and so on.

@crazy-max Is it possible to use caching (cache-from) with this strategy?

@neilime Sure with something like:

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        platform:
          - linux/amd64
          - linux/arm/v6
          - linux/arm/v7
          - linux/arm64
    steps:
      -
        name: Prepare
        run: |
          platform=${{ matrix.platform }}
          echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV
      -
        name: Checkout
        uses: actions/checkout@v3
      -
        ...
      -
        name: Build and push by digest
        id: build
        uses: docker/build-push-action@v4
        with:
          context: .
          platforms: ${{ matrix.platform }}
          labels: ${{ steps.meta.outputs.labels }}
          outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
          cache-from: type=gha,scope=build-${{ env.PLATFORM_PAIR }}
          cache-to: type=gha,scope=build-${{ env.PLATFORM_PAIR }}
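
For completeness, the multi-platform docs linked above sketch the other half roughly like this (illustrative only: it assumes REGISTRY_IMAGE is set as a workflow-level env var, that the digest export/upload steps are appended to the build job above, and that the elided step is the registry login):

      -
        name: Export digest
        run: |
          mkdir -p /tmp/digests
          digest="${{ steps.build.outputs.digest }}"
          touch "/tmp/digests/${digest#sha256:}"
      -
        name: Upload digest
        uses: actions/upload-artifact@v3
        with:
          name: digests
          path: /tmp/digests/*
          if-no-files-found: error

  merge:
    runs-on: ubuntu-latest
    needs:
      - build
    steps:
      -
        name: Download digests
        uses: actions/download-artifact@v3
        with:
          name: digests
          path: /tmp/digests
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      -
        name: Docker meta
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ env.REGISTRY_IMAGE }}
      -
        ...
      -
        name: Create manifest list and push
        working-directory: /tmp/digests
        env:
          # pass the metadata JSON through the environment to avoid shell-quoting issues
          META_JSON: ${{ steps.meta.outputs.json }}
        run: |
          # turn the metadata tags into `-t` flags and append one image@digest reference per built platform
          docker buildx imagetools create \
            $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$META_JSON") \
            $(printf '${{ env.REGISTRY_IMAGE }}@sha256:%s ' *)
      -
        name: Inspect image
        run: |
          docker buildx imagetools inspect ${{ env.REGISTRY_IMAGE }}:${{ steps.meta.outputs.version }}

Pushing by digest like this is what avoids the per-arch tag clutter mentioned above.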

Looking for an easier, more encapsulated fix for this. ARM builds on the ubuntu-latest runner work but take ~7 times longer (6 min vs 43 min). If I could pass in an input parameter for the runner to use for each platform architecture, instead of having to set up this more complex digest merge and push, it would be super helpful.

@paulbourelly999 I've since moved all my builds to https://depot.dev/ and have been very satisfied with the results.

Good to hear about your experience @DavidS-ovm - there are a bunch of competing solutions out there now that look like clones of each other; I think we'll see even more of them to come. To update on my comment from June last year - we've run thousands of builds for the CNCF's Arm needs - feel free to check that out: The state of Arm CI for the CNCF.

@DavidS-ovm

Looking for an easier, more encapsulated fix for this. ARM builds on the ubuntu-latest runner work but take ~7 times longer (6 min vs 43 min). If I could pass in an input parameter for the runner to use for each platform architecture, instead of having to set up this more complex digest merge and push, it would be super helpful.

I solved this issue by using self-hosted GitHub runners.
I have an EKS cluster with two node groups, one for x86 and one for arm64.
I use the matrix strategy to build each component natively on each arch, then I combine their manifests to create the multi-arch manifest. I will share some samples of the workflow:

build-and-push:
    strategy:
      matrix:
        component: "web-backend", "web-front"]
        os: ["x64", "arm64"]

    runs-on: ${{ matrix.os }}
    defaults:
      run:
        shell: bash
    permissions: write-all
    steps:
      - name: Check out code
        uses: actions/checkout@v3

      - name: Build ${{ matrix.component }} on ${{ matrix.os }}
        run: |
            build ${{ matrix.component }} DOCKER_IMAGE_TAG="<YOUR-TAG>-${{ matrix.os }}"

create-multiarch-manifests:
    needs: build-and-push
    runs-on: ubuntu-22.04
    defaults:
      run:
        shell: bash
    steps:      
      - name: Create multiarch manifests
        run: |
          for component in web-backend web-front; do
              docker buildx imagetools create -t ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG> \
                ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG>-x64 \
                ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG>-arm64
          done

hope it helps. good luck

@dFurman

I solved this issue by using self-hosted GitHub runners. I have an EKS cluster with two node groups, one for x86 and one for arm64.

How do you actually build the arm64 image/layers on a self-hosted arm64 agent running on EKS? I assume that docker buildx doesn't work on container-based nodes - i.e. no Docker. But I guess with your approach, the arm64 runner could use something like Kaniko or some other image builder that doesn't need Docker.