docker/build-push-action

How to build a multi-platform image on different runners

miki725 opened this issue · 18 comments

Troubleshooting

Before submitting a bug report please read the Troubleshooting doc.

✅ went through the doc

Behaviour

I am building the same image on different runners:

    runs-on: ${{ matrix.builder }}

    strategy:
      matrix:
        include:
          - builder: ubuntu-latest
            platform: linux/amd64
          - builder: buildjet-8vcpu-ubuntu-2204-arm
            platform: linux/arm64

The reason for using multiple runners is that QEMU emulation can't compile clang-tidy in time on regular ubuntu-latest runners, and the job times out after 6 hours.

The Docker build step specifies which platform it is building:

      - name: Build and Push Docker Image
        uses: docker/build-push-action@v2
        with:
          ...
          platforms: ${{ matrix.platform }}
          ...

Each runner then builds its own platform and pushes to GitHub Container Registry. However, the final image you pull from GitHub Container Registry is always for a single architecture and does not have a manifest list:

➜ docker buildx imagetools inspect ghcr.io/crashappsec/clang-tidy:14.0.6
Name:      ghcr.io/crashappsec/clang-tidy:14.0.6
MediaType: application/vnd.docker.distribution.manifest.v2+json
Digest:    sha256:bcd9f4a8a798f758d9a908b2541f437473f842346293ec4c95ced40105265d2c

What is the correct way to build a multi-platform image on different runners? I don't see any appropriate flag for that in the README.

Steps to reproduce this issue

  1. Build and push the image from a different runner for each platform.
  2. Check whether the resulting package is multi-platform.

Expected behaviour

The final GitHub package should be multi-platform.
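
In other words, docker buildx imagetools inspect should report a manifest list with one entry per platform, roughly like this (illustrative output only; digests elided):

Name:      ghcr.io/crashappsec/clang-tidy:14.0.6
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest:    sha256:…

Manifests:
  Name:      ghcr.io/crashappsec/clang-tidy:14.0.6@sha256:…
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/amd64

  Name:      ghcr.io/crashappsec/clang-tidy:14.0.6@sha256:…
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm64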

Actual behaviour

In my case, since the ARM build takes longer, its push lands last and the final image is ARM-only.

Configuration

name: docker image

on:
  workflow_dispatch:
    inputs:
      CLANG_TIDY_VERSION:
        description: "clang-tidy version"
        required: true
        default: "14.0.6"
        type: string

jobs:
  release:
    runs-on: ${{ matrix.builder }}

    strategy:
      matrix:
        include:
          - builder: ubuntu-latest
            platform: linux/amd64
          - builder: buildjet-8vcpu-ubuntu-2204-arm
            platform: linux/arm64

    steps:
      - name: Checkout Code
        uses: actions/checkout@v1

      # - name: Set up QEMU
      #   uses: docker/setup-qemu-action@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ github.token }}

      - name: Build and Push Docker Image
        uses: docker/build-push-action@v2
        with:
          file: Dockerfile
          push: true
          platforms: ${{ matrix.platform }}
          build-args: |
            CLANG_TIDY_VERSION=${{ inputs.CLANG_TIDY_VERSION }}
          tags: |
            ghcr.io/${{ github.repository_owner }}/clang-tidy:latest
            ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}

Logs

logs_11.zip

I found an action that can manually push the manifest; however, it would be nice if there were an official way to do that via the build action directly:

https://github.com/Noelware/docker-manifest-action
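
For reference, the manual equivalent with plain docker manifest commands (a sketch only, assuming each runner instead pushed an arch-suffixed tag, e.g. ...-amd64 / ...-arm64) would be roughly:

      - name: Create and push manifest list
        run: |
          # assumes the two per-arch tags were already pushed by the platform-specific runners
          docker manifest create ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }} \
            --amend ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}-amd64 \
            --amend ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}-arm64
          docker manifest push ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}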

acuD1 commented

I have the same issue. Any update on the matter?

I just tried to use a matrix to build my ResourceSpace container in six different pipelines, as it saves a lot of time for each image to finish. But as reported, each finished pipeline overwrites the result of the previously finished one, and I end up with just the riscv64 image. It would be great if the tags section could support some kind of platform/architecture tagging.

Just to demonstrate the benefit of proper matrix support:

With matrix: [screenshot]

Without matrix: [screenshot]

@crazy-max - any thoughts on this?

As ARM builds become more popular, the QEMU-based builds are just way too painfully slow. Having proper matrix support here would be fantastic!

I was running into this problem as well, and after a bit of searching I landed on building and pushing each of my desired architectures and then creating the manifest in a subsequent job of the workflow. I have an ARM device and an amd64 device, so I build on the appropriate architecture, which seems to rule out using matrix builds.

[screenshot]

It's not nearly as clean as it looked when I was using the matrix approach... but it seems to work alright.

jobs:
  build-push-arm7:
    runs-on: [self-hosted, ARM64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/arm/v7
          push: true  # needed so the manifest job below can reference these tags in the registry
          tags: |
            my-registry/my-repo:latest-armv7
            my-registry/my-repo:${{ github.sha }}-armv7

  build-push-arm64:
    runs-on: [self-hosted, ARM64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/arm64
          push: true
          tags: |
            my-registry/my-repo:latest-arm64
            my-registry/my-repo:${{ github.sha }}-arm64

  build-push-x64:
    runs-on: [self-hosted, X64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/amd64
          push: true
          tags: |
            my-registry/my-repo:latest-amd64
            my-registry/my-repo:${{ github.sha }}-amd64

  create-manifests:
    runs-on: [self-hosted]
    needs: [build-push-x64, build-push-arm7, build-push-arm64]

    steps:
      - uses: actions/checkout@v3

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...

      - name: Create SHA manifest and push
        run: |
          docker manifest create \
            my-registry/my-repo:${{ github.sha }} \
            --amend my-registry/my-repo:${{ github.sha }}-amd64 \
            --amend my-registry/my-repo:${{ github.sha }}-armv7 \
            --amend my-registry/my-repo:${{ github.sha }}-arm64
          docker manifest push my-registry/my-repo:${{ github.sha }}

      - name: Create latest manifest and push
        run: |
          docker manifest create \
            my-registry/my-repo:latest \
            --amend my-registry/my-repo:latest-amd64 \
            --amend my-registry/my-repo:latest-armv7 \
            --amend my-registry/my-repo:latest-arm64
          docker manifest push my-registry/my-repo:latest

https://dev.to/aws-builders/using-docker-manifest-to-create-multi-arch-images-on-aws-graviton-processors-1320 has a good write-up of Chris' style of fixing this. The one thing that sucks for me is that the docker manifest create steps don't read tags from docker/metadata-action.
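
One way to bridge that gap - sketched here, assuming the per-arch images were pushed with -amd64/-armv7/-arm64 suffixes as in the workflow above - is to loop over the newline-separated tags output of docker/metadata-action:

      - name: Docker metadata
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: my-registry/my-repo

      - name: Create and push manifest lists for every metadata tag
        run: |
          # steps.meta.outputs.tags is newline-separated, so the shell loop splits on it
          for tag in $(echo "${{ steps.meta.outputs.tags }}"); do
            docker manifest create "$tag" \
              --amend "${tag}-amd64" \
              --amend "${tag}-armv7" \
              --amend "${tag}-arm64"
            docker manifest push "$tag"
          done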

PS: you can get commercially hosted ARM runners from https://gitrunners.com. I just started trialling them, and they seem to be pretty decent.

We have actuated customers running split builds on native Arm too. My example uses two separate steps followed by a final one to publish a manifest.

What exactly was the issue with the matrix build? The de-duplication is rather nice.

The two issues I encountered are:

  1. docker manifest create is not yet supported by this action, so it needs manual integration with e.g. docker/metadata-action outputs, which is not complicated, but certainly more so than wiring up existing actions.
  2. docker manifest create doesn't deal well (at all) with images that already have a manifest, like anything with attestations attached. Turning attestations off is not great, but does allow merging the images into a multi-arch one (see the sketch below).
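
A minimal sketch of that workaround, assuming a version of build-push-action recent enough to support the provenance input (tags and registry names are illustrative):

      - name: Build and push (attestations disabled)
        uses: docker/build-push-action@v4
        with:
          context: .
          platforms: linux/arm64
          push: true
          # emit a plain single-platform manifest so `docker manifest create --amend` accepts it
          provenance: false
          tags: my-registry/my-repo:${{ github.sha }}-arm64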

We have published new documentation on distributing builds across multiple runners: https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners

Of course, the best option is to use native nodes and avoid emulation entirely. This can be done when configuring your builder: https://docs.docker.com/build/ci/github-actions/configure-builder/. See https://github.com/docker/packaging/blob/2c95ad0ca93ea91a01755b01e9a979adec955540/.github/workflows/.release.yml#L68-L89 as an example.

ooo, docker buildx imagetools create looks much more capable than docker manifest create, thanks for the update!

Yes, and with this workflow it also pushes by digest, which avoids noisy tags on your registry such as myimage:latest-amd64, myimage:latest-arm64 and so on.

@crazy-max Is it possible to use caching (cache-from) with this strategy?

@neilime Sure with something like:

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        platform:
          - linux/amd64
          - linux/arm/v6
          - linux/arm/v7
          - linux/arm64
    steps:
      -
        name: Prepare
        run: |
          platform=${{ matrix.platform }}
          echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV
      -
        name: Checkout
        uses: actions/checkout@v3
      -
        ...
      -
        name: Build and push by digest
        id: build
        uses: docker/build-push-action@v4
        with:
          context: .
          platforms: ${{ matrix.platform }}
          labels: ${{ steps.meta.outputs.labels }}
          outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
          cache-from: type=gha,scope=build-${{ env.PLATFORM_PAIR }}
          cache-to: type=gha,scope=build-${{ env.PLATFORM_PAIR }}
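
For completeness, the multi-platform docs linked above sketch the other half roughly like this (illustrative only: it assumes REGISTRY_IMAGE is set as a workflow-level env var, that the digest export/upload steps are appended to the build job above, and that the elided step is the registry login):

      -
        name: Export digest
        run: |
          mkdir -p /tmp/digests
          digest="${{ steps.build.outputs.digest }}"
          touch "/tmp/digests/${digest#sha256:}"
      -
        name: Upload digest
        uses: actions/upload-artifact@v3
        with:
          name: digests
          path: /tmp/digests/*
          if-no-files-found: error

  merge:
    runs-on: ubuntu-latest
    needs:
      - build
    steps:
      -
        name: Download digests
        uses: actions/download-artifact@v3
        with:
          name: digests
          path: /tmp/digests
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      -
        name: Docker meta
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ env.REGISTRY_IMAGE }}
      -
        ...
      -
        name: Create manifest list and push
        working-directory: /tmp/digests
        env:
          # pass the metadata JSON through the environment to avoid shell-quoting issues
          META_JSON: ${{ steps.meta.outputs.json }}
        run: |
          # turn the metadata tags into `-t` flags and append one image@digest reference per built platform
          docker buildx imagetools create \
            $(jq -cr '.tags | map("-t " + .) | join(" ")' <<< "$META_JSON") \
            $(printf '${{ env.REGISTRY_IMAGE }}@sha256:%s ' *)
      -
        name: Inspect image
        run: |
          docker buildx imagetools inspect ${{ env.REGISTRY_IMAGE }}:${{ steps.meta.outputs.version }}

Pushing by digest like this is what avoids the per-arch tag clutter mentioned above.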

Looking for an easier, more encapsulated fix for this. ARM builds on the ubuntu-latest runner work but take ~7 times longer (6 min vs 43 min). If I could pass in an input parameter for the runner to use for each platform architecture, instead of having to set up this more complex digest merge and push, it would be super helpful.

@paulbourelly999 I've since moved all my builds to https://depot.dev/ and have been very satisfied with the results.

Good to hear about your experience @DavidS-ovm - there are a bunch of competing solutions out there now that look like clones of each other; I think we'll see even more of them to come. To update on my comment from June last year - we've run thousands of builds for the CNCF's Arm needs - feel free to check that out: The state of Arm CI for the CNCF.

@DavidS-ovm

Looking for an easier, more encapsulated fix for this. ARM builds on the ubuntu-latest runner work but take ~7 times longer (6 min vs 43 min). If I could pass in an input parameter for the runner to use for each platform architecture, instead of having to set up this more complex digest merge and push, it would be super helpful.

I solved this issue by using self-hosted GitHub runners.
I have an EKS cluster with two node groups, one for x86 and one for arm64.
I use the matrix strategy to build each component natively on each arch, then I combine their manifests to create the multi-arch manifest. I will share some samples of the workflow:

build-and-push:
    strategy:
      matrix:
        component: "web-backend", "web-front"]
        os: ["x64", "arm64"]

    runs-on: ${{ matrix.os }}
    defaults:
      run:
        shell: bash
    permissions: write-all
    steps:
      - name: Check out code
        uses: actions/checkout@v3

      - name: Build ${{ matrix.component }} on ${{ matrix.os }}
        run: |
            build ${{ matrix.component }} DOCKER_IMAGE_TAG="<YOUR-TAG>-${{ matrix.os }}"

create-multiarch-manifests:
    needs: build-and-push
    runs-on: ubuntu-22.04
    defaults:
      run:
        shell: bash
    steps:      
      - name: Create multiarch manifests
        run: |
          for component in web-backend web-front; do
              docker buildx imagetools create -t ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG> \
                ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG>-x64 \
                ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG>-arm64
          done

hope it helps. good luck

@dFurman

I solved this issue by using self-hosted GitHub runners. I have an EKS cluster with two node groups, one for x86 and one for arm64.

How do you actually build the arm64 image/layers on a self-hosted arm64 agent running on EKS? I assume that docker buildx doesn't work on container-based nodes - i.e. no Docker. But I guess with your approach, the arm64 runner could use something like Kaniko or some other image builder that doesn't need Docker.