containers/buildah

build "--all-platforms" fails for some images

TheLonelyGhost opened this issue · 2 comments

Issue Description

Some manifests in a manifest list show up as platform unknown/unknown with the annotation vnd.docker.reference.type set to attestation-manifest. Under these circumstances, the asset in the OCI manifest is not a container image and attempts to use it are met with error messages like this:

Error: [unknown/unknown]: creating build container: copying system image from manifest list: writing blob: adding layer with blob "sha256:e0b6e8a181f4b1a7f7c87e329b3b86ef96344f55f36d12966a3c9041cb6d78e5": processing tar file(archive/tar: invalid tar header): exit status 1

One example image that is known to reproduce this is docker.io/library/bash:5.

Steps to reproduce the issue

FROM docker.io/library/bash:5

RUN echo hello world

With the above Dockerfile, run podman build --pull=newer --manifest=my-image:latest --all-platforms . (the trailing dot is the build context). The build takes a long time and ends with an error message like the one below. The failure on this one platform fails the entire podman build command, so no local image manifest is created.

Error: [unknown/unknown]: creating build container: copying system image from manifest list: writing blob: adding layer with blob "sha256:e0b6e8a181f4b1a7f7c87e329b3b86ef96344f55f36d12966a3c9041cb6d78e5": processing tar file(archive/tar: invalid tar header): exit status 1

Describe the results you received

In reality, podman build with --all-platforms builds every platform it can (everything but the unknown/unknown ones) and then exits with a status code of 1.

Describe the results you expected

I expected podman build with the --all-platforms flag to consider only the entries in the manifest list that are valid container images, silently skipping the ones that are not.

podman info output

host:
  arch: amd64
  buildahVersion: 1.33.3
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.8-2.fc39.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: '
  cpuUtilization:
    idlePercent: 99.55
    systemPercent: 0.23
    userPercent: 0.22
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: cloud
    version: "39"
  eventLogger: journald
  freeLocks: 2048
  hostname: lima-podman
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.5.6-300.fc39.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1715515392
  memTotal: 4084289536
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-1.fc39.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.1-5.fc39.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.1
  ociRuntime:
    name: crun
    package: crun-1.14-1.fc39.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.14
      commit: 667e6ebd4e2442d39512e63215e79d693d0780aa
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20231230.gf091893-1.fc39.x86_64
    version: |
      pasta 0^20231230.gf091893-1.fc39.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc39.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 4084199424
  swapTotal: 4084199424
  uptime: 7h 7m 9.00s (Approximately 0.29 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/lima.linux/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/lima.linux/.local/share/containers/storage
  graphRootAllocated: 106214436864
  graphRootUsed: 1672871936
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 140
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /home/lima.linux/.local/share/containers/storage/volumes
version:
  APIVersion: 4.9.0
  Built: 1706090847
  BuiltTime: Wed Jan 24 05:07:27 2024
  GitCommit: ""
  GoVersion: go1.21.6
  Os: linux
  OsArch: linux/amd64
  Version: 4.9.0

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

macOS host running podman through the default podman VM managed by lima.

podman-rootless.lima.yaml below:

# Review and modify the following configuration for Lima instance "podman".
# - To cancel starting Lima, just save this file as an empty file.

# A template to use Podman instead of containerd & nerdctl
# $ limactl start ./podman.yaml
# $ limactl shell podman podman run -it -v $HOME:$HOME --rm docker.io/library/alpine

# To run `podman` on the host (assumes podman-remote is installed):
# $ export CONTAINER_HOST=$(limactl list podman --format 'unix://{{.Dir}}/sock/podman.sock')
# $ podman --remote ...

# To run `docker` on the host (assumes docker-cli is installed):
# $ export DOCKER_HOST=$(limactl list podman --format 'unix://{{.Dir}}/sock/podman.sock')
# $ docker ...

# This template requires Lima v0.8.0 or later
images:
  - location: "https://download.fedoraproject.org/pub/fedora/linux/releases/39/Cloud/x86_64/images/Fedora-Cloud-Base-39-1.5.x86_64.qcow2"
    arch: "x86_64"
    digest: "sha256:ab5be5058c5c839528a7d6373934e0ce5ad6c8f80bd71ed3390032027da52f37"
  - location: "https://download.fedoraproject.org/pub/fedora/linux/releases/39/Cloud/aarch64/images/Fedora-Cloud-Base-39-1.5.aarch64.qcow2"
    arch: "aarch64"
    digest: "sha256:765996d5b77481ca02d0ac06405641bf134ac920cfc1e60d981c64d7971162dc"
mounts:
  - location: "~"
    9p:
      cache: fscache
  - location: "/tmp/lima"
    9p:
      cache: mmap
    writable: true
containerd:
  system: false
  user: false
provision:
  - mode: system
    script: |
      #!/bin/bash
      set -eux -o pipefail
      command -v podman >/dev/null 2>&1 && exit 0
      dnf -y install podman
  - mode: system
    script: |
      #!/bin/bash
      set -eux -o pipefail
      command -v sshfs >/dev/null 2>&1 && exit 0
      dnf install -y sshfs
  - mode: user
    script: |
      #!/bin/bash
      set -eux -o pipefail
      systemctl --user enable --now podman.socket
probes:
  - script: |
      #!/bin/bash
      set -eux -o pipefail
      if ! timeout 30s bash -c "until command -v podman >/dev/null 2>&1; do sleep 3; done"; then
        echo >&2 "podman is not installed yet"
        exit 1
      fi
    hint: See "/var/log/cloud-init-output.log" in the guest
portForwards:
  - guestSocket: "/run/user/{{.UID}}/podman/podman.sock"
    hostSocket: "{{.Dir}}/sock/podman.sock"
message: |
  To run `podman` on the host (assumes podman-remote is installed), run the following commands:
  ------
  podman system connection add lima-{{.Name}} "unix://{{.Dir}}/sock/podman.sock"
  podman system connection default lima-{{.Name}}
  podman{{if eq .HostOS "linux"}} --remote{{end}} run quay.io/podman/hello
  ------

Additional information

N/a

Since podman build --all-platforms/buildah build --all-platforms cannot really tell whether a platform is real or not, I think the current behaviour is correct, although I guess it could have created the manifest list for all of the images it successfully built. If the failures became warnings, and the build only failed if no images were created, would that satisfy you?

Oh absolutely. Warnings can be ignored, and if others would prefer a "strict mode" where it aborted if it wasn't able to build all platforms, I'm sure a flag could be applied for that behavior. Or a flag for my desired behavior. Either way.
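The behavior discussed so far (per-platform failures become warnings, the build fails only if nothing was built, plus an optional strict mode) can be sketched roughly as follows. This is only an illustration of the proposed semantics, not buildah's implementation: the function name summarize_builds, the strict parameter, and the shape of the results mapping are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def summarize_builds(
    results: Dict[str, Optional[Exception]], strict: bool = False
) -> Tuple[List[str], List[str]]:
    """Decide the overall outcome of a multi-platform build.

    `results` maps a platform string to None (built) or the error it hit.
    Hypothetical helper: it only models the policy discussed in this thread.
    """
    built = [platform for platform, err in results.items() if err is None]
    warnings = [
        f"warning: [{platform}]: {err}"
        for platform, err in results.items()
        if err is not None
    ]

    # Hypothetical "strict mode": any per-platform failure aborts the build.
    if strict and warnings:
        raise RuntimeError("; ".join(warnings))

    # Default policy: fail only when no image at all could be built.
    if not built:
        raise RuntimeError("no images were built")

    return built, warnings


results = {
    "linux/amd64": None,
    "linux/arm64": None,
    "unknown/unknown": ValueError("archive/tar: invalid tar header"),
}
built, warnings = summarize_builds(results)
print("manifest list would contain:", built)
for line in warnings:
    print(line)
```

With strict=True the same input would raise instead of returning, which matches the opt-in "abort if any platform fails" idea.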

The main problem is that it fails the entire build if there's one invalid platform in the list, which I can't really help with some upstream images. My workaround is to use a tool like regctl to talk to the OCI registry API, grab all valid platforms (i.e., filter out unknown/unknown), and shove them into the --platform= flag as a comma-separated string instead. Feels like this should be native behavior for podman/buildah with the --all-platforms flag.
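A minimal sketch of that filtering step, assuming the raw image index JSON has already been fetched (with regctl as above, or for example with skopeo inspect --raw). The helper name buildable_platforms and the sample index are made up for illustration; the sample only mimics the shape of a manifest list with one attestation entry.

```python
from typing import List


def buildable_platforms(index: dict) -> List[str]:
    """Return os/arch[/variant] strings for entries that look like real images."""
    platforms: List[str] = []
    for descriptor in index.get("manifests", []):
        annotations = descriptor.get("annotations") or {}
        # Attestation manifests are not container images; skip them.
        if annotations.get("vnd.docker.reference.type") == "attestation-manifest":
            continue
        platform = descriptor.get("platform") or {}
        os_name = platform.get("os", "unknown")
        arch = platform.get("architecture", "unknown")
        # Also skip anything that reports unknown/unknown without the annotation.
        if os_name == "unknown" or arch == "unknown":
            continue
        parts = [os_name, arch]
        if platform.get("variant"):
            parts.append(platform["variant"])
        entry = "/".join(parts)
        if entry not in platforms:
            platforms.append(entry)
    return platforms


# Made-up index for illustration; digests and sizes are omitted.
SAMPLE_INDEX = {
    "schemaVersion": 2,
    "manifests": [
        {"platform": {"os": "linux", "architecture": "amd64"}},
        {"platform": {"os": "linux", "architecture": "arm", "variant": "v7"}},
        {
            "platform": {"os": "unknown", "architecture": "unknown"},
            "annotations": {"vnd.docker.reference.type": "attestation-manifest"},
        },
    ],
}

platform_flag = ",".join(buildable_platforms(SAMPLE_INDEX))
print(f"--platform={platform_flag}")
```

The resulting string can then be passed to podman build via --platform= in place of --all-platforms.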