SUSE/BCI-tests

15-SP3 init container fails on RHEL 7.9 - container doesn't start

Closed this issue · 7 comments

The tests are failing only on RHEL 7.9:
https://openqa.suse.de/tests/9005682#step/all/1
with

E       AssertionError: Unexpected exit code 126 for CommandResult(command=b'uname -s', exit_status=126, stdout=None, stderr=b'Error: cannot exec into container that is not running: container state improper\n')
E       assert 126 in [0, 1]
E        +  where 126 = CommandResult(command=b'uname -s', exit_status=126, stdout=None, stderr=b'Error: cannot exec into container that is not running: container state improper\n').rc

Environment:

localhost:~/BCI-tests #cat /etc/os-release 
NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.9"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.9"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.9:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION=7.9
# This is a "SLES Expanded Support platform release 7.9"
# The above "Red Hat Enterprise Linux Server" string is only used to
# keep software compatibility.
localhost:~/BCI-tests #podman -v
podman version 1.6.4
localhost:~/BCI-tests #podman info
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.8-1.el7.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.8, commit: 52580ecb98528a431a28ed597cb453c716bcd70d'
  Distribution:
    distribution: '"rhel"'
    version: "7.9"
  MemFree: 519049216
  MemTotal: 4142096384
  OCIRuntime:
    name: runc
    package: runc-1.0.0-69.rc10.el7_9.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 2
  eventlogger: journald
  hostname: localhost.localdomain
  kernel: 3.10.0-1160.66.1.el7.x86_64
  os: linux
  rootless: false
  uptime: 19m 26.71s
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 2
  GraphDriverName: overlay
  GraphOptions: {}
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 8
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes

I have tried to run it manually, and the container doesn't keep running as it should be expected:

localhost:~/BCI-tests #podman --log-level=debug run -d registry.suse.de/suse/sle-15-sp3/update/cr/totest/images/bci/bci-init:15.3 
DEBU[0000] Reading configuration file "/usr/share/containers/libpod.conf" 
DEBU[0000] Merged system config "/usr/share/containers/libpod.conf": &{{false false false false false true} 0 {   [] [] []}  docker://  runc map[crun:[/usr/bin/crun /usr/local/bin/crun] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] [crun runc] [crun] [] [/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] systemd   /var/run/libpod -1 false /etc/cni/net.d/ [/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin] podman []   k8s.gcr.io/pause:3.1 /pause false false  2048 shm    false} 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/lib/containers/storage 
DEBU[0000] Using run root /var/run/containers/storage   
DEBU[0000] Using static dir /var/lib/containers/storage/libpod 
DEBU[0000] Using tmp dir /var/run/libpod                
DEBU[0000] Using volume path /var/lib/containers/storage/volumes 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] cached value indicated that overlay is supported 
DEBU[0000] cached value indicated that metacopy is not being used 
DEBU[0000] cached value indicated that native-diff is usable 
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] using runtime "/usr/bin/runc"                
WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument 
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist 
DEBU[0000] parsed reference into "[overlay@/var/lib/containers/storage+/var/run/containers/storage]registry.suse.de/suse/sle-15-sp3/update/cr/totest/images/bci/bci-init:15.3" 
DEBU[0000] parsed reference into "[overlay@/var/lib/containers/storage+/var/run/containers/storage]@80c2354d9be93b2f9d54a056b427c8227d786bfbb9cb651b67c1db902c72179d" 
DEBU[0000] exporting opaque data as blob "sha256:80c2354d9be93b2f9d54a056b427c8227d786bfbb9cb651b67c1db902c72179d" 
DEBU[0000] No hostname set; container's hostname will default to runtime default 
DEBU[0000] Using bridge netmode                         
DEBU[0000] created OCI spec and options for new container 
DEBU[0000] Allocated lock 2 for container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 
DEBU[0000] parsed reference into "[overlay@/var/lib/containers/storage+/var/run/containers/storage]@80c2354d9be93b2f9d54a056b427c8227d786bfbb9cb651b67c1db902c72179d" 
DEBU[0000] exporting opaque data as blob "sha256:80c2354d9be93b2f9d54a056b427c8227d786bfbb9cb651b67c1db902c72179d" 
DEBU[0000] created container "43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708" 
DEBU[0000] container "43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708" has work directory "/var/lib/containers/storage/overlay-containers/43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708/userdata" 
DEBU[0000] container "43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708" has run directory "/var/run/containers/storage/overlay-containers/43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708/userdata" 
DEBU[0000] New container created "43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708" 
DEBU[0000] container "43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708" has CgroupParent "machine.slice/libpod-43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708.scope" 
DEBU[0000] overlay: mount_data=lowerdir=/var/lib/containers/storage/overlay/l/HH5JEWDIZCPGL73P74ZETBUG4T:/var/lib/containers/storage/overlay/l/ZTIQHNZ57B4OQ25ISM2GCLWIBA,upperdir=/var/lib/containers/storage/overlay/914c899a7100f5e5081d440db4586d6e6e4da82d2beebb8555921fa1e5823b3b/diff,workdir=/var/lib/containers/storage/overlay/914c899a7100f5e5081d440db4586d6e6e4da82d2beebb8555921fa1e5823b3b/work,context="system_u:object_r:svirt_sandbox_file_t:s0:c267,c680" 
DEBU[0000] mounted container "43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708" at "/var/lib/containers/storage/overlay/914c899a7100f5e5081d440db4586d6e6e4da82d2beebb8555921fa1e5823b3b/merged" 
DEBU[0000] Created root filesystem for container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 at /var/lib/containers/storage/overlay/914c899a7100f5e5081d440db4586d6e6e4da82d2beebb8555921fa1e5823b3b/merged 
DEBU[0000] Made network namespace at /var/run/netns/cni-12365a23-e6a6-7215-dc00-a1c9de96976d for container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 
INFO[0000] Got pod network &{Name:goofy_easley Namespace:goofy_easley ID:43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 NetNS:/var/run/netns/cni-12365a23-e6a6-7215-dc00-a1c9de96976d Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[] Bandwidth:<nil> IpRanges:[]}]} 
INFO[0000] About to add CNI network cni-loopback (type=loopback) 
INFO[0000] Got pod network &{Name:goofy_easley Namespace:goofy_easley ID:43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 NetNS:/var/run/netns/cni-12365a23-e6a6-7215-dc00-a1c9de96976d Networks:[] RuntimeConfig:map[podman:{IP: PortMappings:[] Bandwidth:<nil> IpRanges:[]}]} 
INFO[0000] About to add CNI network podman (type=bridge) 
DEBU[0000] [0] CNI result: Interfaces:[{Name:cni-podman0 Mac:92:a1:85:99:60:cb Sandbox:} {Name:veth22f31dac Mac:9a:e6:64:56:a7:fe Sandbox:} {Name:eth0 Mac:1a:ff:bf:3d:97:0d Sandbox:/var/run/netns/cni-12365a23-e6a6-7215-dc00-a1c9de96976d}], IP:[{Version:4 Interface:0xc00014fac8 Address:{IP:10.88.0.9 Mask:ffff0000} Gateway:10.88.0.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:<nil>}], DNS:{Nameservers:[] Domain: Search:[] Options:[]} 
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode secret 
DEBU[0000] Setting CGroups for container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 to machine.slice:libpod:43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d 
DEBU[0000] reading hooks from /etc/containers/oci/hooks.d 
DEBU[0000] Created OCI spec for container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 at /var/lib/containers/storage/overlay-containers/43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708/userdata/config.json 
DEBU[0000] /usr/bin/conmon messages will be logged to syslog 
DEBU[0000] running conmon: /usr/bin/conmon               args="[--api-version 1 -s -c 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 -u 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 -r /usr/bin/runc -b /var/lib/containers/storage/overlay-containers/43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708/userdata -p /var/run/containers/storage/overlay-containers/43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708/userdata/pidfile -l k8s-file:/var/lib/containers/storage/overlay-containers/43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708/userdata/ctr.log --exit-dir /var/run/libpod/exits --socket-dir-path /var/run/libpod/socket --log-level debug --syslog --conmon-pidfile /var/run/containers/storage/overlay-containers/43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --runtime --exit-command-arg runc --exit-command-arg --storage-driver --exit-command-arg overlay --exit-command-arg --events-backend --exit-command-arg journald --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708]"
INFO[0000] Running conmon under slice machine.slice and unitName libpod-conmon-43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708.scope 
DEBU[0000] Received: 16797                              
INFO[0000] Got Conmon PID as 16785                      
DEBU[0000] Created container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 in OCI runtime 
DEBU[0000] Starting container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 with command [/usr/lib/systemd/systemd] 
DEBU[0000] Started container 43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708 
43b89d7f70a858c139b6d70912969fdacfe2034acd9d785786a5cc8f6343c708

Seems ok, but looking at the existing containers:

localhost:~/BCI-tests #podman ps --all
CONTAINER ID  IMAGE                                                                       COMMAND               CREATED         STATUS                       PORTS  NAMES
43b89d7f70a8  registry.suse.de/suse/sle-15-sp3/update/cr/totest/images/bci/bci-init:15.3  /usr/lib/systemd/...  5 seconds ago   Exited (255) 5 seconds ago          goofy_easley

podman logs showing:

localhost:~/BCI-tests #podman --log-level=debug logs 43b89d7f70a8
DEBU[0000] Reading configuration file "/usr/share/containers/libpod.conf" 
DEBU[0000] Merged system config "/usr/share/containers/libpod.conf": &{{false false false false false true} 0 {   [] [] []}  docker://  runc map[crun:[/usr/bin/crun /usr/local/bin/crun] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] [crun runc] [crun] [] [/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] systemd   /var/run/libpod -1 false /etc/cni/net.d/ [/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin] podman []   k8s.gcr.io/pause:3.1 /pause false false  2048 shm    false} 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/lib/containers/storage 
DEBU[0000] Using run root /var/run/containers/storage   
DEBU[0000] Using static dir /var/lib/containers/storage/libpod 
DEBU[0000] Using tmp dir /var/run/libpod                
DEBU[0000] Using volume path /var/lib/containers/storage/volumes 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] cached value indicated that overlay is supported 
DEBU[0000] cached value indicated that metacopy is not being used 
DEBU[0000] cached value indicated that native-diff is usable 
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] using runtime "/usr/bin/runc"                
WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument 
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist 

Could you try launching the container manually and see if there's anything printed to stdout?

If there's nothing, please try launching it with --systemd=always. It could be that this ancient podman version does not detect systemd correctly.

Same...

localhost:~ #podman run -d --systemd=always registry.suse.de/suse/sle-15-sp3/update/cr/totest/images/bci/bci-init:15.3 
ae1b318b478ddc7fa166962c6b0fa6f3e003722425fe673ebbf05106013f646d
localhost:~ #podman ps --all
CONTAINER ID  IMAGE                                                                       COMMAND               CREATED        STATUS                      PORTS  NAMES
ae1b318b478d  registry.suse.de/suse/sle-15-sp3/update/cr/totest/images/bci/bci-init:15.3  /usr/lib/systemd/...  3 seconds ago  Exited (255) 2 seconds ago         charming_rubin

anything in the logs at all?

anything in the logs at all?

Nothing that tells us the problem:

localhost:~ #podman --log-level=debug logs 6a598f08dcc3
DEBU[0000] Reading configuration file "/usr/share/containers/libpod.conf" 
DEBU[0000] Merged system config "/usr/share/containers/libpod.conf": &{{false false false false false true} 0 {   [] [] []}  docker://  runc map[crun:[/usr/bin/crun /usr/local/bin/crun] runc:[/usr/bin/runc /usr/sbin/runc /usr/local/bin/runc /usr/local/sbin/runc /sbin/runc /bin/runc /usr/lib/cri-o-runc/sbin/runc /run/current-system/sw/bin/runc]] [crun runc] [crun] [] [/usr/libexec/podman/conmon /usr/local/libexec/podman/conmon /usr/local/lib/podman/conmon /usr/bin/conmon /usr/sbin/conmon /usr/local/bin/conmon /usr/local/sbin/conmon /run/current-system/sw/bin/conmon] [PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin] systemd   /var/run/libpod -1 false /etc/cni/net.d/ [/usr/libexec/cni /usr/lib/cni /usr/local/lib/cni /opt/cni/bin] podman []   k8s.gcr.io/pause:3.1 /pause false false  2048 shm    false} 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db 
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /var/lib/containers/storage 
DEBU[0000] Using run root /var/run/containers/storage   
DEBU[0000] Using static dir /var/lib/containers/storage/libpod 
DEBU[0000] Using tmp dir /var/run/libpod                
DEBU[0000] Using volume path /var/lib/containers/storage/volumes 
DEBU[0000] Set libpod namespace to ""                   
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] cached value indicated that overlay is supported 
DEBU[0000] cached value indicated that metacopy is not being used 
DEBU[0000] cached value indicated that native-diff is usable 
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
WARN[0000] Error initializing configured OCI runtime crun: no valid executable found for OCI runtime crun: invalid argument 
DEBU[0000] using runtime "/usr/bin/runc"                
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist 

I am running

strace -v -s 1000 -f   podman run -d --systemd=always [registry.suse.de/suse/sle-15-sp3/update/cr/totest/images/bci/bci-init:15.3](http://registry.suse.de/suse/sle-15-sp3/update/cr/totest/images/bci/bci-init:15.3) | tee podman.log

and it gets stuck at some point

[pid 15339] epoll_pwait(10,  <unfinished ...>
[pid 15346] read(3,  <unfinished ...>
[pid 15339] <... epoll_pwait resumed>[], 128, 0, NULL, 1036) = 0
[pid 15338] execve("/usr/lib/systemd/systemd", ["/usr/lib/systemd/systemd"], ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", "TERM=xterm", "HOSTNAME=bb042869df5f", "container=podman", "HOME=/root"] <unfinished ...>
[pid 15346] <... read resumed>"0", 512) = 1

The issue was that SELinux was enabled and it was denying the creation of that container:

> type=AVC msg=audit(1656935351.909:206): avc:  denied  { write } for  pid=14986 comm="systemd" name="libpod-9cf633c5d4bce40be3df87bad63a75d6010bd63d844258267fb2f0eadf72b5b4.scope" dev="cgroup" ino=29213 scontext=system_u:system_r:container_t:s0:c351,c611 tcontext=system_u:object_r:cgroup_t:s0 tclass=dir permissive=0
> type=SYSCALL msg=audit(1656935351.909:206): arch=c000003e syscall=83 success=no exit=-13 a0=560223f6b600 a1=1ed a2=7fffa81c24b0 a3=560223f6b640 items=0 ppid=14974 pid=14986 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=3 comm="systemd" exe="/usr/lib/systemd/systemd" subj=system_u:system_r:container_t:s0:c351,c611 key=(null)
> type=PROCTITLE msg=audit(1656935351.909:206): proctitle="/usr/lib/systemd/systemd"

I am closing this as the we are now running the hosts with selinux disabled.