Ownership not preserved in resulting tarball
ppmathis opened this issue · 3 comments
This is probably somewhat related to the fixed and closed #501, but in my own attempts to use melange + apko together it was impossible for me to build a package with melange where certain paths have custom ownership. Based on my current understanding, the issue should be within melange and apko is not to blame, but just to be sure, I included a full example as a reproducer.
My first attempt used the latest Podman Desktop release on macOS 14.5 (MacBook Pro M1), but there I was not even able to use chown within the melange pipeline. While the commands ran successfully, the changes were not persisted even within the same runs step, meaning that chown followed by ls (as done in [1] and [2] in the reproducer) never showed any changes. I suspect the file system used during builds does not work well with Podman, so I put the blame on the differences from Docker, especially since I do not currently have a deep understanding of the melange build process.
My second attempt, after removing Podman completely from the system, was to use Docker Desktop instead. Here the initial attempt looked far more promising, as running ls after chown or chmod now properly shows the changes, both for [1] and [2] in the melange reproducer config.
Unfortunately, the changed ownership is not reflected in the final melange tar archive. I read through the previously linked issue and saw the adjusted tarball emitter for the data archive, so I assumed that tar --list --numeric-owner -tf packages/aarch64/ownership-1.0.0-r0.apk would show the proper UIDs/GIDs, except for the build user with 1000:1000, which should be remapped to UID/GID 0:0, but I get this output instead:
$ tar --list --numeric-owner -tf packages/aarch64/ownership-1.0.0-r0.apk
-rwxrwxrwx 0 0 0 512 Jan 1 1970 .SIGN.RSA.melange.rsa.pub
-rw-r--r-- 0 0 0 211 Jan 1 1970 .PKGINFO
drwxr-xr-x 0 501 20 0 Jan 1 1970 ownership
drwx------ 0 501 20 0 Jan 1 1970 ownership/dir
-rwx------ 0 501 20 0 Jan 1 1970 ownership/file
drwxr-xr-x 0 501 20 0 Jan 1 1970 var
drwxr-xr-x 0 501 20 0 Jan 1 1970 var/lib
drwxr-xr-x 0 501 20 0 Jan 1 1970 var/lib/db
drwxr-xr-x 0 501 20 0 Jan 1 1970 var/lib/db/sbom
-rw-r--r-- 0 501 20 1135 Jan 1 1970 var/lib/db/sbom/ownership-1.0.0-r0.spdx.json
The UID/GID combination of 501:20 matches the UID/GID on my macOS host system, where this happens to be the primary and active user. If I install this package using apko to build a container image from it, all ownership information seems lost:
$ docker run -it --rm ownership:latest-arm64 ls -lan /ownership
total 12
drwxr-xr-x 3 0 0 4096 Jan 1 1970 .
drwxr-xr-x 1 0 0 4096 Jul 21 20:48 ..
drwx------ 2 0 0 4096 Jan 1 1970 dir
-rwx------ 1 0 0 0 Jan 1 1970 file
I also double-checked the tar archive from apko with dive, but had the same findings: there is simply no ownership information present on these files anymore, although the permissions were kept.
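The ownership check done by hand with tar --numeric-owner can also be scripted. The following is a minimal Python sketch, not part of melange or apko: the helper names non_root_entries and build_sample_archive are made up for illustration, and the sample archive is built in memory to mimic the listing in this issue. Whether Python's tarfile reads a multi-segment .apk file unchanged has not been verified here.

```python
import io
import tarfile


def non_root_entries(tar_bytes: bytes) -> list[tuple[str, int, int]]:
    """Return (name, uid, gid) for every tar member not owned by 0:0."""
    found = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive:
            if member.uid != 0 or member.gid != 0:
                found.append((member.name, member.uid, member.gid))
    return found


def build_sample_archive() -> bytes:
    """Build a tiny in-memory tar with one root-owned and one leaked entry."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, uid, gid in [(".PKGINFO", 0, 0), ("ownership/file", 501, 20)]:
            info = tarfile.TarInfo(name)
            info.uid, info.gid = uid, gid
            archive.addfile(info)  # size is 0, so no file object is needed
    return buffer.getvalue()


if __name__ == "__main__":
    for name, uid, gid in non_root_entries(build_sample_archive()):
        print(f"{name}: {uid}:{gid}")
```

A check like this makes a leaked host account such as 501:20 easy to spot in an automated test.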
Based on my understanding, I would consider this a bug, as it does not match the behaviour I expected. While I can use paths in apko, that seems like bad practice, as these paths belong to the package itself, and when e.g. combining multiple packages as a service bundle in apko, I do not want to repeat package-specific path configs across multiple files.
Version Info
melange
GitVersion: 0.11.2
GitCommit: brew
GitTreeState: clean
BuildDate: 2024-07-19T01:42:34Z
GoVersion: go1.22.5
Compiler: gc
Platform: darwin/arm64
apko
GitVersion: 0.14.7
GitCommit: brew
GitTreeState: clean
BuildDate: 2024-05-31T16:54:55Z
GoVersion: go1.22.3
Compiler: gc
Platform: darwin/arm64
Docker
Client:
Version: 27.0.3
API version: 1.46
Go version: go1.21.11
Git commit: 7d4bcd8
Built: Fri Jun 28 23:59:41 2024
OS/Arch: darwin/arm64
Context: desktop-linux
Server: Docker Desktop 4.32.0 (157355)
Engine:
Version: 27.0.3
API version: 1.46 (minimum version 1.24)
Go version: go1.21.11
Git commit: 662f78c
Built: Sat Jun 29 00:02:44 2024
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.7.18
GitCommit: ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
runc:
Version: 1.7.18
GitCommit: v1.1.13-0-g58aa920
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Reproducer
Steps
- melange keygen
- melange build --signing-key melange.rsa
- apko build -k melange.rsa.pub apko.yaml ownership ownership.tar
- docker load < ownership.tar
- docker run -it --rm ownership:latest-arm64 ls -lan /ownership
- Optionally inspect with further tools, e.g. dive --source docker-archive ownership.tar
While the permission changes with chmod are respected, the ownership changes are lost, and both /ownership/file and /ownership/dir show up as being owned by root.
melange.yaml
package:
  name: ownership
  version: 1.0.0
  epoch: 0
  target-architecture:
    - aarch64
environment:
  accounts:
    users:
      - username: nonroot
        uid: 65532
        gid: 65532
    groups:
      - groupname: nonroot
        gid: 65532
  contents:
    repositories:
      - https://packages.wolfi.dev/os
    keyring:
      - https://packages.wolfi.dev/os/wolfi-signing.rsa.pub
    packages:
      - wolfi-baselayout
      - busybox
pipeline:
  - runs: |
      readonly TARGET="${{targets.destdir}}"
      mkdir -p "${TARGET}/ownership/dir"
      touch "${TARGET}/ownership/file"
      chmod 700 "${TARGET}/ownership/dir" "${TARGET}/ownership/file"
      chown nonroot:nonroot "${TARGET}/ownership/file" "${TARGET}/ownership/dir"
      # [1] This ls command properly shows the changed ownership and permissions
      ls -lan "${TARGET}/ownership"
  - runs: |
      # [2] This ls command properly shows the changed ownership and permissions
      ls -lan "${{targets.destdir}}/ownership"
apko.yaml
archs:
  - aarch64
contents:
  repositories:
    - https://packages.wolfi.dev/os
    - '@local packages'
  keyring:
    - https://packages.wolfi.dev/os/wolfi-signing.rsa.pub
  packages:
    - wolfi-baselayout
    - busybox
    - ownership@local
accounts:
  run-as: nonroot
  users:
    - username: nonroot
      uid: 65532
      gid: 65532
  groups:
    - groupname: nonroot
      gid: 65532
So, I've been curious about the root cause and spent some time analysing this myself. My conclusion is that the root cause is more on the apko side, but there are multiple reasons why this fails today. I started by analysing what happens when melange builds a tarball:
- While running melange build, a temporary directory is mounted to /home/build, and this is where the first issue happens. To avoid permission issues on the host system, Docker Desktop does not preserve the actual UID/GID and instead lets it default to the user account on the host, but uses an xattr named com.docker.grpcfuse.ownership containing a JSON value like {"UID":123,"GID":123,"mode":770}, which stores the real ownership and permissions.
- This xattr is completely ignored once melange builds the tarball, as writeTar only checks the actual file system permissions, which leads to my host account 501:20 leaking into the tarball itself.
- The memFS implementation used by apko ignores the UID/GID values from tar headers completely; calling memFS.Chown() would be necessary to preserve ownership.
I was able to get everything working correctly on apko's side by:
- Implementing a new tarball.Option which enables the processing of xattr-based permissions for gRPC FUSE
- Extending Context.writeTar to check for this option and, if enabled, read the xattr for every entry being processed and set the UID/GID/mode solely based on this xattr. If the xattr is missing or parsing fails, the UID/GID are reset to 0, so the host user does not leak through.
- Extending memFS.WriteHeader to call memFS.Chown() for both directories and files if either UID or GID is != 0 in the header of the entry being processed
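To illustrate the fallback rule for the xattr described above, here is a minimal Python sketch. It is not the Go code from the linked branches: the function name resolve_owner is hypothetical, the payload shape is taken from the example JSON in this thread, and reading the xattr from disk is deliberately left out. The mode field is ignored because the thread does not say how its value (e.g. 770) should be interpreted.

```python
import json
from typing import Optional

# xattr name as reported in this issue; only used for documentation here.
XATTR_NAME = "com.docker.grpcfuse.ownership"


def _is_valid_id(value: object) -> bool:
    """Accept only non-negative integers (bool is rejected explicitly)."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def resolve_owner(xattr_value: Optional[bytes]) -> tuple[int, int]:
    """Return (uid, gid) from the xattr payload, or (0, 0) if it is unusable.

    Falling back to 0:0 keeps the host account from leaking into the archive.
    """
    if xattr_value is None:  # xattr missing on this entry
        return (0, 0)
    try:
        payload = json.loads(xattr_value)
        uid, gid = payload["UID"], payload["GID"]
    except (ValueError, KeyError, TypeError):  # bad JSON or unexpected shape
        return (0, 0)
    if not (_is_valid_id(uid) and _is_valid_id(gid)):
        return (0, 0)
    return (uid, gid)


if __name__ == "__main__":
    print(resolve_owner(b'{"UID":65532,"GID":65532,"mode":770}'))
    print(resolve_owner(None))
```

The important design choice is that the file system's own stat result is never consulted in this mode, so a missing or broken xattr yields root ownership rather than the host user.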
You can find both of these changes in my forked apko branch fs-ownership with these two commits:
Once apko has been extended with the above two commits, melange itself can be adjusted to use the gRPC FUSE ownership option based on its environment, as seen on the fs-ownership branch in my melange fork:
I did not implement any heuristics / detection for this yet, as it's simply meant as a PoC, but I suppose it could be based on the runner environment - e.g. if Docker is used on macOS, then use gRPC FUSE based permissions. With these changes to both apko and melange I'm finally able to build both a clean APK tarball (preserving custom permissions, while not leaking host account UID/GID through) and a clean OCI image with proper permissions as well.
Now my question: Would changes like these be acceptable to the melange/apko team? Or has this not been implemented on purpose so far for some reason unknown to me?
I've also tested this out on a Linux system by now, to have the full picture in terms of ownership support. Here is a summarised breakdown of everything:
- melange @ macOS: ❌ The resulting TAR archive contains invalid UID/GID and instead leaks the host account through, as the actual UID/GID is stored in an xattr named com.docker.grpcfuse.ownership. Would require heuristics to determine an environment like this, and then forcibly use only the xattr for UID/GID.
- melange @ Linux: ✅ The resulting TAR archive contains valid ownership out of the box, thanks to #501 / #781 from @amouat / @epsilon-phase
- apko: ❌ Never respects the UID/GID present in the TAR archive, and only copies all the files as 0:0 into the resulting container image, essentially making any efforts on melange's side for proper ownership useless.
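The apko behaviour in the last bullet, and the change proposed for it, can be modelled with a toy in-memory file system. This is only a sketch in Python: ToyMemFS and Header are hypothetical stand-ins and do not mirror apko's actual memFS types or method signatures.

```python
from dataclasses import dataclass, field


@dataclass
class Header:
    """Minimal stand-in for a tar header; only the fields used here."""
    name: str
    uid: int = 0
    gid: int = 0


@dataclass
class ToyMemFS:
    """Hypothetical in-memory FS that tracks ownership per path."""
    preserve_ownership: bool = True
    owners: dict = field(default_factory=dict)

    def chown(self, path: str, uid: int, gid: int) -> None:
        self.owners[path] = (uid, gid)

    def write_header(self, header: Header) -> None:
        # Every entry starts out as root-owned, as reported for apko today.
        self.owners[header.name] = (0, 0)
        # Proposed step: apply the header's owner if it is not 0:0.
        if self.preserve_ownership and (header.uid != 0 or header.gid != 0):
            self.chown(header.name, header.uid, header.gid)


if __name__ == "__main__":
    entry = Header("ownership/file", uid=65532, gid=65532)
    for preserve in (False, True):
        fs = ToyMemFS(preserve_ownership=preserve)
        fs.write_header(entry)
        print(preserve, fs.owners[entry.name])
```

With preserve_ownership disabled the entry stays at 0:0, which matches the ls output from the container image earlier in this issue.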
The question that remains on my side is why apko does not preserve ownership (implementation is trivial, as shown in my previous comment with ppmathis/apko@dd8c452) and instead forces 0:0. The melange bug on macOS makes sense, it's a special case, but why would apko ignore ownership?
The question that remains on my side is why apko does not preserve ownership
If I had to guess, it's not deliberate, just an oversight. Since apko has the ability to set ownership explicitly, I suspect we've been doing this at the image level and not at the apk level.
Feel free to PR that change to apko. It would be good to have a test that checks this and ensures it's consistent with /lib/apk/db/installed as well.