containers/container-selinux

`DOCKER_BUILDKIT=1 docker build` fails on Fedora CoreOS 37.20230122.1.1

eriksjolund opened this issue · 10 comments

Description

DOCKER_BUILDKIT=1 docker build fails on Fedora CoreOS 37.20230122.1.1

I'm pretty sure DOCKER_BUILDKIT=1 docker build worked a few months ago on Fedora CoreOS.
Since then, Fedora CoreOS has been updated, and possibly the docker version as well.

Steps to reproduce the issue

  1. sudo useradd test1
  2. sudo systemctl start docker
  3. sudo usermod -aG docker test1
  4. sudo machinectl shell test1@
  5. mkdir src
  6. cd src
  7. create the file Dockerfile with the following contents:
    FROM docker.io/library/alpine
    RUN touch /root/alpine
    
  8. DOCKER_BUILDKIT=1 docker build --no-cache -t test .
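
Steps 5 to 8 can also be scripted. The following is a minimal sketch and not part of the original report: the scratch directory from mktemp and the RUN_BUILD switch are assumptions introduced here, so that nothing is built unless explicitly requested.

```shell
# Recreate the build context from steps 5-7 in a scratch directory.
# workdir and RUN_BUILD are hypothetical names introduced for this sketch.
workdir=$(mktemp -d)
mkdir -p "$workdir/src"
cat > "$workdir/src/Dockerfile" <<'EOF'
FROM docker.io/library/alpine
RUN touch /root/alpine
EOF
echo "build context: $workdir/src"

# Step 8: only attempt the build when explicitly requested,
# since it needs a running docker daemon.
if [ "${RUN_BUILD:-0}" = "1" ]; then
    (cd "$workdir/src" && DOCKER_BUILDKIT=1 docker build --no-cache -t test .)
fi
```

Running it with RUN_BUILD=1 as a user in the docker group should trigger the same build as step 8.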

Describe the results you received:

$ DOCKER_BUILDKIT=1 docker build --no-cache -t test .
[+] Building 1.3s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                0.0s
 => => transferring dockerfile: 96B                                 0.0s
 => [internal] load .dockerignore                                   0.0s
 => => transferring context: 2B                                     0.0s
 => [internal] load metadata for docker.io/library/alpine:latest    0.0s
 => CACHED [1/2] FROM docker.io/library/alpine                      0.0s
 => ERROR [2/2] RUN touch /root/alpine                              0.9s
------
 > [2/2] RUN touch /root/alpine:
#5 0.756 touch: /root/alpine: Permission denied
------
executor failed running [/bin/sh -c touch /root/alpine]: exit code: 1

The same docker build command succeeded after I ran sudo setenforce 0.

AVC

# ausearch --start '17:17:08' --raw > /tmp/raw1

The file /tmp/raw1 contains

type=AVC msg=audit(1675012775.957:3397): avc:  denied  { mmap_zero } for  pid=64085 comm="check" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=memprotect permissive=1
type=ANOM_PROMISCUOUS msg=audit(1675012776.856:3398): dev=veth5514c69 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295AUID="unset" UID="root" GID="root"
type=SYSCALL msg=audit(1675012776.856:3398): arch=c00000b7 syscall=206 success=yes exit=40 a0=f a1=400051e3c0 a2=28 a3=0 items=0 ppid=1 pid=22517 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="dockerd" exe="/usr/bin/dockerd" subj=system_u:system_r:container_runtime_t:s0 key=(null)ARCH=aarch64 SYSCALL=sendto AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"
type=SOCKADDR msg=audit(1675012776.856:3398): saddr=100000000000000000000000SADDR={ saddr_fam=netlink nlnk-fam=16 nlnk-pid=0 }
type=PROCTITLE msg=audit(1675012776.856:3398): proctitle=2F7573722F62696E2F646F636B657264002D2D686F73743D66643A2F2F002D2D657865632D6F7074006E61746976652E6367726F75706472697665723D73797374656D64002D2D73656C696E75782D656E61626C6564002D2D6C6F672D6472697665723D6A6F75726E616C64002D2D6C6976652D726573746F7265002D2D6465
type=BPF msg=audit(1675012777.344:3399): prog-id=282 op=LOAD
type=BPF msg=audit(1675012777.344:3400): prog-id=283 op=LOAD
type=BPF msg=audit(1675012777.344:3401): prog-id=0 op=UNLOAD
type=BPF msg=audit(1675012777.344:3402): prog-id=0 op=UNLOAD
type=BPF msg=audit(1675012777.355:3403): prog-id=284 op=LOAD
type=AVC msg=audit(1675012777.886:3404): avc:  denied  { write } for  pid=64271 comm="touch" name="root" dev="overlay" ino=14702304 scontext=system_u:system_r:container_t:s0:c813,c996 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1675012777.886:3405): avc:  denied  { add_name } for  pid=64271 comm="touch" name="alpine" scontext=system_u:system_r:container_t:s0:c813,c996 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=dir permissive=1
type=AVC msg=audit(1675012777.886:3406): avc:  denied  { create } for  pid=64271 comm="touch" name="alpine" scontext=system_u:system_r:container_t:s0:c813,c996 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=1
type=AVC msg=audit(1675012777.886:3407): avc:  denied  { write } for  pid=64271 comm="touch" path="/root/alpine" dev="overlay" ino=4459122 scontext=system_u:system_r:container_t:s0:c813,c996 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=file permissive=1
type=ANOM_PROMISCUOUS msg=audit(1675012778.156:3408): dev=veth5514c69 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295AUID="unset" UID="root" GID="root"
type=SYSCALL msg=audit(1675012778.156:3408): arch=c00000b7 syscall=206 success=yes exit=32 a0=f a1=4000af20e0 a2=20 a3=0 items=0 ppid=1 pid=22517 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="dockerd" exe="/usr/bin/dockerd" subj=system_u:system_r:container_runtime_t:s0 key=(null)ARCH=aarch64 SYSCALL=sendto AUID="unset" UID="root" GID="root" EUID="root" SUID="root" FSUID="root" EGID="root" SGID="root" FSGID="root"
type=SOCKADDR msg=audit(1675012778.156:3408): saddr=100000000000000000000000SADDR={ saddr_fam=netlink nlnk-fam=16 nlnk-pid=0 }
type=PROCTITLE msg=audit(1675012778.156:3408): proctitle=2F7573722F62696E2F646F636B657264002D2D686F73743D66643A2F2F002D2D657865632D6F7074006E61746976652E6367726F75706472697665723D73797374656D64002D2D73656C696E75782D656E61626C6564002D2D6C6F672D6472697665723D6A6F75726E616C64002D2D6C6976652D726573746F7265002D2D6465
type=BPF msg=audit(1675012778.351:3409): prog-id=0 op=UNLOAD
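
The PROCTITLE records above hold the command line of the audited process as hexadecimal, with NUL bytes separating the arguments. A minimal sketch for decoding one with POSIX awk follows. decode_proctitle is a hypothetical helper name, and the hex string is copied from the record above; it appears to be cut off, so the last argument is incomplete.

```shell
# Decode a hex-encoded audit PROCTITLE value; NUL separators become spaces.
decode_proctitle() {
    printf '%s\n' "$1" | awk '{
        digits = "0123456789ABCDEF"
        s = toupper($0)
        out = ""
        for (i = 1; i + 1 <= length(s); i += 2) {
            hi = index(digits, substr(s, i, 1)) - 1
            lo = index(digits, substr(s, i + 1, 1)) - 1
            n = hi * 16 + lo
            ch = (n == 0) ? " " : sprintf("%c", n)
            out = out ch
        }
        print out
    }'
}

# Value copied from the PROCTITLE record in /tmp/raw1.
proctitle=2F7573722F62696E2F646F636B657264002D2D686F73743D66643A2F2F002D2D657865632D6F7074006E61746976652E6367726F75706472697665723D73797374656D64002D2D73656C696E75782D656E61626C6564002D2D6C6F672D6472697665723D6A6F75726E616C64002D2D6C6976652D726573746F7265002D2D6465
decode_proctitle "$proctitle"
```

The decoded line starts with /usr/bin/dockerd and includes --selinux-enabled, so SELinux support is switched on in the docker daemon.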

audit2allow output

cat /tmp/raw1 | audit2allow shows

#============= container_t ==============
allow container_t container_ro_file_t:dir { add_name write };
allow container_t container_ro_file_t:file { create write };

#============= spc_t ==============

#!!!! This avc can be allowed using the boolean 'mmap_low_allowed'
allow spc_t self:memprotect mmap_zero;
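
The denials behind these rules can also be read directly from the AVC records. The sketch below is not from the original report: summarize_avc is a hypothetical helper that prints the source type, target type, object class, and permissions of each AVC line, and the two sample records are copied from /tmp/raw1.

```shell
# Summarize AVC denials as "source_type -> target_type : class { perms }".
summarize_avc() {
    awk '/type=AVC/ {
        perm = ""; src = ""; tgt = ""; cls = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "{") {
                # Collect every permission up to the closing brace.
                for (j = i + 1; j <= NF && $j != "}"; j++)
                    perm = (perm == "") ? $j : perm " " $j
            } else if ($i ~ /^scontext=/) {
                split($i, a, ":"); src = a[3]
            } else if ($i ~ /^tcontext=/) {
                split($i, a, ":"); tgt = a[3]
            } else if ($i ~ /^tclass=/) {
                cls = substr($i, 8)
            }
        }
        print src " -> " tgt " : " cls " { " perm " }"
    }'
}

# Two records copied from /tmp/raw1.
sample='type=AVC msg=audit(1675012775.957:3397): avc:  denied  { mmap_zero } for  pid=64085 comm="check" scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:spc_t:s0 tclass=memprotect permissive=1
type=AVC msg=audit(1675012777.886:3404): avc:  denied  { write } for  pid=64271 comm="touch" name="root" dev="overlay" ino=14702304 scontext=system_u:system_r:container_t:s0:c813,c996 tcontext=system_u:object_r:container_ro_file_t:s0 tclass=dir permissive=1'
printf '%s\n' "$sample" | summarize_avc
```

On the affected host the whole file could be piped in instead, for example summarize_avc < /tmp/raw1.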

About the system

hardware: macOS Ventura 13.2 (MacBook Pro 13 inch)

software: qemu-system-aarch64 runs the Fedora CoreOS VM.

$ rpm -qf /usr/bin/docker
moby-engine-20.10.22-1.fc37.aarch64
$ rpm -q container-selinux
container-selinux-2.198.0-1.fc37.noarch
$ cat /etc/os-release
NAME="Fedora Linux"
VERSION="37.20230122.1.1 (CoreOS)"
ID=fedora
VERSION_ID=37
VERSION_CODENAME=""
PLATFORM_ID="platform:f37"
PRETTY_NAME="Fedora CoreOS 37.20230122.1.1"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:37"
HOME_URL="https://getfedora.org/coreos/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-coreos/"
SUPPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
BUG_REPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=37
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=37
SUPPORT_END=2023-11-14
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='37.20230122.1.1'

It looks like the rootfs of the container is container_ro_file_t rather than container_file_t.

Did you change the location of your /var/lib/docker directory?

With overlayfs the upper directory is supposed to be container_file_t, then files copied up should work fine.
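
One way to check which type a directory actually carries is to read its SELinux context and look at the third field. The following is a rough sketch under assumptions: selinux_type and path_type are hypothetical helper names, and path_type relies on the %C format of GNU stat, so it only works on an SELinux-enabled host with GNU coreutils.

```shell
# Print the type (third colon-separated field) of an SELinux context.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Print the SELinux type of a path (needs GNU stat and an SELinux host).
path_type() {
    ctx=$(stat -c '%C' "$1") || return 1
    selinux_type "$ctx"
}

# Contexts taken from the AVC records in this report.
selinux_type 'system_u:object_r:container_ro_file_t:s0'
selinux_type 'system_u:system_r:container_t:s0:c813,c996'
```

On the affected host, running path_type as root on the overlay upper directory used by the build would show whether it is labeled container_file_t as expected. The location of that directory under /var/lib/docker is not stated in this thread.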

Did you change the location of your /var/lib/docker directory?

No, I think the Fedora CoreOS VM I'm using is pretty standard.
I also reproduced the issue on another Fedora CoreOS VM so I suppose
it should be reproducible everywhere.

I've notified the Fedora CoreOS developers about the issue:
https://discussion.fedoraproject.org/t/selinux-blocks-docker-build-when-running-with-docker-buildkit-1/46293

I reproduced the error

touch: /root/alpine: Permission denied

on these versions too:

  • fedora-coreos-36.20220325.1.0-live.aarch64.iso
  • fedora-coreos-37.20221003.1.0-qemu.aarch64.qcow2
  • fedora-coreos-38.20220826.91.0-qemu.aarch64.qcow2

My initial belief that I had used DOCKER_BUILDKIT=1 docker build successfully on Fedora CoreOS before
was probably a false recollection.

Does this happen when you run a simple docker command?
docker run alpine touch /root/alpine

Also want to confirm this works fine with podman.

podman --remote run alpine touch /root/alpine

Does this happen when you run a simple docker command?
docker run alpine touch /root/alpine

It works fine

$ docker run alpine touch /root/alpine
$ echo $?
0
$

Also want to confirm this works fine with podman.
podman --remote run alpine touch /root/alpine

It works fine

$ systemctl --user start podman.socket
$ podman --remote run alpine touch /root/alpine
$ echo $?
0
$ 

Then I think the issue is something to do with docker/buildkit not setting up the overlay directory correctly, or it is using some different path that we do not know about.

Isn't buildkit using its own daemon now? Perhaps it is not setting the label correctly on the upper directory when it is created.

Two days ago I asked for advice in the BuildKit Slack channel and was told that this issue
is probably related to the issue:

It seems to be some sort of ordering problem.

Quote:
"This works correctly for labeling the process, and for labeling most mounts. However, the new generateSecurityOpts() function is called from oci.GenerateSpec, which only happens after mounting the rootfs."

from moby/buildkit#2320 (comment)


Isn't buildkit using its own daemon now?

Yes, there is a daemon called buildkitd (see for instance "Starting the buildkitd daemon").
I would need to investigate more. Currently I don't know if it's in use.

Should this issue maybe be closed?
There is not much that can be done in container-selinux if this is caused by BuildKit issue 2320, which has been open for a long time (since Aug 19, 2021).