How to build a multi-platform image on different runners
miki725 opened this issue · 18 comments
Troubleshooting
Before submitting a bug report please read the Troubleshooting doc.
✅ went through the doc
Behaviour
I am building the same image on different runners:
runs-on: ${{ matrix.builder }}
strategy:
  matrix:
    include:
      - builder: ubuntu-latest
        platform: linux/amd64
      - builder: buildjet-8vcpu-ubuntu-2204-arm
        platform: linux/arm64
The reason for multiple runners is that under QEMU emulation, clang-tidy can't compile in time on regular ubuntu-latest runners and the job times out after 6 hours.
The Docker build step does specify which platform it is building:
- name: Build and Push Docker Image
  uses: docker/build-push-action@v2
  with:
    ...
    platforms: ${{ matrix.platform }}
    ...
Each runner then builds its own platform and pushes to GitHub Container Registry. However, the final image, when pulled from GitHub Container Registry, is always for a single architecture and has no manifest list:
➜ docker buildx imagetools inspect ghcr.io/crashappsec/clang-tidy:14.0.6
Name: ghcr.io/crashappsec/clang-tidy:14.0.6
MediaType: application/vnd.docker.distribution.manifest.v2+json
Digest: sha256:bcd9f4a8a798f758d9a908b2541f437473f842346293ec4c95ced40105265d2c
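For comparison, a proper multi-platform image would report a manifest list with one entry per platform, along these lines (illustrative output, digests elided):

MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Manifests:
  Name:      ghcr.io/crashappsec/clang-tidy:14.0.6@sha256:...
  Platform:  linux/amd64

  Name:      ghcr.io/crashappsec/clang-tidy:14.0.6@sha256:...
  Platform:  linux/arm64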
What is the correct way to build a multi-platform image on different runners? I don't see any appropriate flag for that in the README.
Steps to reproduce this issue
- Build and push the image from each of the different runners
- Check whether the resulting package is multi-platform
Expected behaviour
The final GitHub package should be multi-platform.
Actual behaviour
In my case, since the ARM build takes longer, its push lands last, so the tag ends up pointing at the ARM-only image.
Configuration
- Repository URL (if public): https://github.com/crashappsec/docker-clang-tidy
- Build URL (if public): https://github.com/crashappsec/docker-clang-tidy/actions/runs/2847260456
- Workflow: https://github.com/crashappsec/docker-clang-tidy/blob/9ac0b75796fec6b0a4531002f6b91c9107542e75/.github/workflows/release.yml
name: docker image

on:
  workflow_dispatch:
    inputs:
      CLANG_TIDY_VERSION:
        description: "clang-tidy version"
        required: true
        default: "14.0.6"
        type: string

jobs:
  release:
    runs-on: ${{ matrix.builder }}
    strategy:
      matrix:
        include:
          - builder: ubuntu-latest
            platform: linux/amd64
          - builder: buildjet-8vcpu-ubuntu-2204-arm
            platform: linux/arm64
    steps:
      - name: Checkout Code
        uses: actions/checkout@v1
      # - name: Set up QEMU
      #   uses: docker/setup-qemu-action@v2
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ github.token }}
      - name: Build and Push Docker Image
        uses: docker/build-push-action@v2
        with:
          file: Dockerfile
          push: true
          platforms: ${{ matrix.platform }}
          build-args: |
            CLANG_TIDY_VERSION=${{ inputs.CLANG_TIDY_VERSION }}
          tags: |
            ghcr.io/${{ github.repository_owner }}/clang-tidy:latest
            ghcr.io/${{ github.repository_owner }}/clang-tidy:${{ inputs.CLANG_TIDY_VERSION }}
Logs
Found an action which can manually push the manifest; however, it would be nice if there were an official way of doing that via the build action directly.
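In the meantime, the stitching can also be done from a plain shell step with the stock Docker CLI (a minimal sketch, assuming each runner has already pushed its image under a hypothetical per-arch tag suffix; OWNER/IMAGE/TAG are placeholders):

# Combine the per-arch images into one manifest list and push it
docker manifest create ghcr.io/OWNER/IMAGE:TAG \
  --amend ghcr.io/OWNER/IMAGE:TAG-amd64 \
  --amend ghcr.io/OWNER/IMAGE:TAG-arm64
docker manifest push ghcr.io/OWNER/IMAGE:TAG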
I have the same issue. Any update on the matter?
Just tried to use a matrix to build my ResourceSpace container in six different pipelines, as it saves a lot of time for each image to finish. But as reported, each finished pipeline overwrites the result of the previously finished one, and I end up with just the riscv64 image. It would be great if the tags section could support some kind of platform/architecture tagging.
Just to demonstrate the benefit of proper matrix support:
@crazy-max - any thoughts on this?
As ARM builds become more popular, the QEMU based builds are just way too painfully slow. Having proper matrix support here would be fantastic!
I was running into this problem as well and, after a bit of searching, I landed on building/pushing each of my desired architectures and then creating the manifest in a subsequent job in the workflow. I have an ARM device and an amd64 device, so I build on the appropriate architecture, which seems to rule out using matrix builds.
It's not nearly as clean as it looked when I was using the matrix approach... but it seems to work alright.
jobs:
  build-push-arm7:
    runs-on: [self-hosted, ARM64]
    steps:
      - uses: actions/checkout@v3
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders
      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...
      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/arm/v7
          tags: |
            my-registry/my-repo:latest-armv7
            my-registry/my-repo:${{ github.sha }}-armv7

  build-push-arm64:
    runs-on: [self-hosted, ARM64]
    steps:
      - uses: actions/checkout@v3
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders
      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...
      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/arm64
          tags: |
            my-registry/my-repo:latest-arm64
            my-registry/my-repo:${{ github.sha }}-arm64

  build-push-x64:
    runs-on: [self-hosted, X64]
    steps:
      - uses: actions/checkout@v3
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Set up Docker Context for Buildx
        id: buildx-context
        run: |
          docker context create builders
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          version: latest
          endpoint: builders
      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...
      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/amd64
          tags: |
            my-registry/my-repo:latest-amd64
            my-registry/my-repo:${{ github.sha }}-amd64

  create-manifests:
    runs-on: [self-hosted]
    needs: [build-push-x64, build-push-arm7, build-push-arm64]
    steps:
      - uses: actions/checkout@v3
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          ...
      - name: Create SHA manifest and push
        run: |
          docker manifest create \
            my-registry/my-repo:${{ github.sha }} \
            --amend my-registry/my-repo:${{ github.sha }}-amd64 \
            --amend my-registry/my-repo:${{ github.sha }}-armv7 \
            --amend my-registry/my-repo:${{ github.sha }}-arm64
          docker manifest push my-registry/my-repo:${{ github.sha }}
      - name: Create latest manifest and push
        run: |
          docker manifest create \
            my-registry/my-repo:latest \
            --amend my-registry/my-repo:latest-amd64 \
            --amend my-registry/my-repo:latest-armv7 \
            --amend my-registry/my-repo:latest-arm64
          docker manifest push my-registry/my-repo:latest
https://dev.to/aws-builders/using-docker-manifest-to-create-multi-arch-images-on-aws-graviton-processors-1320 has a good writeup of Chris' style of fixing this. The one thing that sucks for me is that the docker manifest create steps don't read tags from docker/metadata-action.
PS: you can get commercially hosted ARM runners from https://gitrunners.com. I just started trialling them, and they seem to be pretty decent.
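A workaround for that (a sketch, assuming a docker/metadata-action step with id: meta, and per-arch images already pushed under hypothetical -amd64/-arm64 tag suffixes) is to loop over the action's newline-separated tags output in a shell step:

- name: Create and push manifests
  run: |
    # steps.meta.outputs.tags is newline-separated; tags contain no spaces,
    # so plain word splitting is safe here
    for tag in ${{ steps.meta.outputs.tags }}; do
      docker manifest create "$tag" \
        --amend "${tag}-amd64" \
        --amend "${tag}-arm64"
      docker manifest push "$tag"
    done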
We have actuated customers running split builds on native Arm too. My example uses two separate steps followed by a final one to publish a manifest.
What exactly was the issue with the matrix build? The de-duplication is rather nice.
The two issues I encountered are:

1. docker manifest create is not yet supported by this action, hence needs manual integration with e.g. docker/metadata-action outputs, which is not complicated, but certainly more so than wiring up existing actions.
2. docker manifest create doesn't deal well (at all) with images that already have a manifest, like anything with attestations attached. Turning attestations off is not great, but does allow merging the images into a multi-arch one (a sketch of that follows below).
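On that second point: newer versions of build-push-action attach a provenance attestation by default, which turns even a single-platform push into an image index. Disabling it keeps the per-arch images as plain manifests that docker manifest create can merge. A minimal sketch (the registry and tag names are placeholders):

- name: Build and push
  uses: docker/build-push-action@v4
  with:
    context: .
    platforms: linux/arm64
    push: true
    provenance: false  # keep the pushed image a plain manifest, mergeable by docker manifest create
    tags: my-registry/my-repo:latest-arm64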
We have published new documentation on distributing builds across runners: https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners
Of course, it's best to use native nodes to avoid emulation. This can be done when configuring your builder: https://docs.docker.com/build/ci/github-actions/configure-builder/. See https://github.com/docker/packaging/blob/2c95ad0ca93ea91a01755b01e9a979adec955540/.github/workflows/.release.yml#L68-L89 as an example.
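Condensed from that page: each matrix job pushes its platform image by digest and records the digest in a shared artifact, then a final job merges them with buildx imagetools. A trimmed sketch of the merge job (assuming REGISTRY_IMAGE is defined at the workflow level and the build jobs uploaded one empty file named after each image digest into a digests artifact, as the linked docs do):

  merge:
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Download digests
        uses: actions/download-artifact@v3
        with:
          name: digests
          path: /tmp/digests
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Login to registry
        uses: docker/login-action@v2
        with:
          ...
      - name: Create manifest list and push
        working-directory: /tmp/digests
        run: |
          # each file in this directory is named after a per-platform image digest
          docker buildx imagetools create -t ${{ env.REGISTRY_IMAGE }}:latest \
            $(printf '${{ env.REGISTRY_IMAGE }}@sha256:%s ' *)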
ooo, docker buildx imagetools create looks much more capable than docker manifest create, thanks for the update!
Yes, and also with this workflow it pushes by digest and therefore avoids noisy tags on your registry such as myimage:latest-amd64, myimage:latest-arm64 and so on.
@crazy-max Is it possible to use caching (cache-from) with this strategy?
@neilime Sure with something like:
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        platform:
          - linux/amd64
          - linux/arm/v6
          - linux/arm/v7
          - linux/arm64
    steps:
      - name: Prepare
        run: |
          platform=${{ matrix.platform }}
          echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV
      - name: Checkout
        uses: actions/checkout@v3
      - ...
      - name: Build and push by digest
        id: build
        uses: docker/build-push-action@v4
        with:
          context: .
          platforms: ${{ matrix.platform }}
          labels: ${{ steps.meta.outputs.labels }}
          outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
          cache-from: type=gha,scope=build-${{ env.PLATFORM_PAIR }}
          cache-to: type=gha,scope=build-${{ env.PLATFORM_PAIR }}
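Note that the per-platform scope is what makes this safe: the GitHub Actions cache backend keys entries by scope, so giving each platform its own scope stops the parallel matrix jobs from overwriting each other's cache.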
Looking for an easier, more encapsulated fix for this. ARM builds on the ubuntu-latest runner work but take ~7 times longer (6 mins vs 43 mins). If I could pass in an input parameter for the runner used for each platform architecture, instead of having to set up this more complex digest merge and push, it would be super helpful.
@paulbourelly999 I've since moved all my builds to https://depot.dev/ and have been very satisfied with the results.
Good to hear about your experience @DavidS-ovm - there are a bunch of competing solutions out there now that look like clones of each other, and I think we'll see even more of them to come. To update on my comment from June last year - we've run thousands of builds for the CNCF's Arm needs - feel free to check that out: The state of Arm CI for the CNCF.
I solved this issue by using self-hosted GitHub runners.
I have an EKS cluster with 2 node groups, one for x86 and the second for arm64.
I use the matrix strategy to build each component natively on each arch, then I combine their manifests together to create the multi-arch manifest. I will share some samples of the workflow:
build-and-push:
  strategy:
    matrix:
      component: ["web-backend", "web-front"]
      os: ["x64", "arm64"]
  runs-on: ${{ matrix.os }}
  defaults:
    run:
      shell: bash
  permissions: write-all
  steps:
    - name: Check out code
      uses: actions/checkout@v3
    - name: Build ${{ matrix.component }} on ${{ matrix.os }}
      run: |
        build ${{ matrix.component }} DOCKER_IMAGE_TAG="<YOUR-TAG>-${{ matrix.os }}"

create-multiarch-manifests:
  needs: build-and-push
  runs-on: ubuntu-22.04
  defaults:
    run:
      shell: bash
  steps:
    - name: Create multiarch manifests
      run: |
        for component in web-backend web-front; do
          docker buildx imagetools create -t ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG> \
            ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG>-x64 \
            ${{ env.TARGET_IMAGE_PREFIX }}/$component:<YOUR-TAG>-arm64
        done
hope it helps. good luck
How do you actually build the arm64 image/layers on a self-hosted arm64 agent running on EKS? I assume that docker buildx doesn't work on container-based nodes - i.e. no Docker. But I guess with your approach, the arm64 runner could use something like Kaniko or some other image builder that doesn't need Docker.