hauler-dev/hauler

[BUG] Images imported with digests fail to re-appear on the other archival process

Closed this issue · 4 comments

Environmental Info:

  • Darwin kamins-mbp.lan 23.1.0 Darwin Kernel Version 23.1.0: Mon Oct 9 21:32:11 PDT 2023; root:xnu-10002.41.9~7/RELEASE_ARM64_T6030 arm64
  • Also presented on ubuntu and rhel hosts.

Hauler Version:

 __    __       ___       __    __   __       _______ .______
|  |  |  |     /   \     |  |  |  | |  |     |   ____||   _  \
|  |__|  |    /  ^  \    |  |  |  | |  |     |  |__   |  |_)  |
|   __   |   /  /_\  \   |  |  |  | |  |     |   __|  |      /
|  |  |  |  /  _____  \  |  `--'  | |  `----.|  |____ |  |\  \----.
|__|  |__| /__/     \__\  \______/  |_______||_______|| _| `._____|
hauler: Airgap Swiss Army Knife

GitVersion:    devel
GitCommit:     26da333c2af378d9fdc7052c2af8066d36f8f903
GitTreeState:  dirty
BuildDate:     2024-05-28T18:20:32
GoVersion:     go1.22.1
Compiler:      gc
Platform:      darwin/arm64

Describe the Bug:

  • If a haul contains images that are referenced by digest (sha256) rather than by tag, those images are lost when the haul is saved and then loaded again.
  • The data does remain, but hauler is unable to see the images because their entries have been removed from the index.json file in the resulting haul.

Steps to Reproduce:

  • Example manifest:
apiVersion: content.hauler.cattle.io/v1alpha1
kind: Images
metadata:
  name: hauler-content-images-example
  annotations:
    # global flags for all images in the manifest
    # image flags override global flags
spec:
  images:
    # fetch image
    - name: docker.io/alpine:latest
    - name: docker.io/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064
  1. Sync it to a store:
➜ hauler store -s testing sync -f manifest.yaml
2024-06-15 00:03:11 INF syncing [content.hauler.cattle.io/v1alpha1, Kind=Images] to store
2024-06-15 00:03:11 INF adding 'image' [docker.io/alpine:latest] to the store
2024-06-15 00:03:17 INF successfully added 'image' [index.docker.io/library/alpine:latest]
2024-06-15 00:03:17 INF adding 'image' [docker.io/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064] to the store
2024-06-15 00:03:41 INF successfully added 'image' [index.docker.io/library/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064]
➜  hauler store -s testing info
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+
| REFERENCE                                                                             | TYPE  | PLATFORM        | # LAYERS | SIZE     |
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+
| library/alpine:latest                                                                 | image | linux/386       |        1 | 3.5 MB   |
|                                                                                       | image | linux/amd64     |        1 | 3.6 MB   |
...
| library/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064 | image | linux/386       |        8 | 80.7 MB  |
...
|                                                                                       | image | linux/amd64     |        8 | 82.6 MB  |
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+
|                                                                                                                    TOTAL   | 689.5 MB |
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+
  1. Save the store:
➜  hauler store -s testing save -f testing.tar.zst
2024-06-15 00:05:46 INF saved store [testing] -> [/Users/kamin/dev/hauler-test/testing.tar.zst]
  1. Move the store:
➜   mv testing.tar.zst testing-move.tar.zst
  1. Load the haul:
➜  hauler store -s testing-move load testing-move.tar.zst
2024-06-15 00:07:56 INF loading content from [testing-move.tar.zst] to [testing-move]
  1. Notice that the haul is missing the nginx entry:
➜  hauler store -s testing-move info
+-----------------------+-------+---------------+----------+---------+
| REFERENCE             | TYPE  | PLATFORM      | # LAYERS | SIZE    |
+-----------------------+-------+---------------+----------+---------+
| library/alpine:latest | image | linux/386     |        1 | 3.5 MB  |
|                       | image | linux/amd64   |        1 | 3.6 MB  |
|                       | image | linux/arm     |        1 | 3.4 MB  |
|                       | image | linux/arm     |        1 | 3.1 MB  |
|                       | image | linux/arm64   |        1 | 4.1 MB  |
|                       | image | linux/ppc64le |        1 | 3.6 MB  |
|                       | image | linux/riscv64 |        1 | 3.4 MB  |
|                       | image | linux/s390x   |        1 | 3.5 MB  |
+-----------------------+-------+---------------+----------+---------+
|                                                  TOTAL   | 28.0 MB |
+-----------------------+-------+---------------+----------+---------+
➜  du -hs testing-move
658M    testing-move
➜  du -hs testing
668M    testing

  • Taking a look at the index for the first haul:

➜  cat testing/index.json | jq .
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
      "digest": "sha256:77726ef6b57ddf65bb551896826ec38bc3e53f75cdde31354fbffb4f25238ebd",
      "size": 1853,
      "annotations": {
        "kind": "dev.cosignproject.cosign/imageIndex",
        "org.opencontainers.image.ref.name": "library/alpine:latest"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064",
      "size": 10272,
      "annotations": {
        "kind": "dev.cosignproject.cosign/imageIndex",
        "org.opencontainers.image.ref.name": "library/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064"
      }
    }
  ]
}
  • And the unpacked haul:
➜  cat testing-move/index.json|jq .
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
      "digest": "sha256:77726ef6b57ddf65bb551896826ec38bc3e53f75cdde31354fbffb4f25238ebd",
      "size": 1853,
      "annotations": {
        "kind": "dev.cosignproject.cosign/imageIndex",
        "org.opencontainers.image.ref.name": "library/alpine:latest"
      }
    }
  ]
}
  • You'll notice that it's simply missing the entire metadata entry for the nginx image that was in the haul.
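The comparison of the two index.json files above can also be scripted. The following is a minimal sketch, not part of hauler: the `missingRefs` helper and the program itself are hypothetical, and it assumes only the index.json layout shown in this issue (a `manifests` array whose entries carry an `org.opencontainers.image.ref.name` annotation).

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// refNameKey is the annotation that holds the image reference in the
// index.json files shown in this issue.
const refNameKey = "org.opencontainers.image.ref.name"

// ociIndex models only the fields of index.json that this sketch needs.
type ociIndex struct {
	Manifests []struct {
		Annotations map[string]string `json:"annotations"`
	} `json:"manifests"`
}

// refNames returns the set of ref names found in one index.json document.
func refNames(data []byte) (map[string]bool, error) {
	var idx ociIndex
	if err := json.Unmarshal(data, &idx); err != nil {
		return nil, err
	}
	names := make(map[string]bool)
	for _, m := range idx.Manifests {
		if name, ok := m.Annotations[refNameKey]; ok {
			names[name] = true
		}
	}
	return names, nil
}

// missingRefs (hypothetical helper) lists ref names that are present in
// the "before" index but absent from the "after" index, sorted.
func missingRefs(before, after []byte) ([]string, error) {
	b, err := refNames(before)
	if err != nil {
		return nil, err
	}
	a, err := refNames(after)
	if err != nil {
		return nil, err
	}
	var missing []string
	for name := range b {
		if !a[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: indexdiff <before/index.json> <after/index.json>")
		return
	}
	before, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	after, err := os.ReadFile(os.Args[2])
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	missing, err := missingRefs(before, after)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, name := range missing {
		fmt.Println("missing after load:", name)
	}
}
```

Run against `testing/index.json` and `testing-move/index.json` from the steps above, this should report the digest-pinned nginx reference as the only missing entry.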

Expected Behavior:

  • All images are present in hauler store info and are served via hauler store serve registry after the haul is loaded from a tar.zst.

Actual Behavior:

  • If you place an image into a haul that uses a sha256 digest as an identifier it will go missing when the haul is unpacked on the other side of the airgap. The contents are still present but the images are not present in the index.json file and are not served via the registry.

Additional Context:

Temporary Fix:

  • If you transport the index.json from the original store along with the archive, you can swap it in for the one in the loaded store and everything works correctly:
➜  hauler-test ls -alh testing/index.json testing-move/index.json
-rw-r--r--  1 kamin  staff   323B Jun 15 00:07 testing-move/index.json
-rwxr-xr-x  1 kamin  staff   714B Jun 15 00:03 testing/index.json
➜  hauler-test mv testing/index.json testing-move/index.json
➜  hauler-test ls -alh testing/index.json testing-move/index.json
ls: testing/index.json: No such file or directory
-rwxr-xr-x  1 kamin  staff   714B Jun 15 00:03 testing-move/index.json
➜  hauler-test hauler store -s testing-move info
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+
| REFERENCE                                                                             | TYPE  | PLATFORM        | # LAYERS | SIZE     |
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+
| library/alpine:latest                                                                 | image | linux/386       |        1 | 3.5 MB   |
...
| library/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064 | image | linux/386       |        8 | 80.7 MB  |
...
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+
|                                                                                                                    TOTAL   | 689.5 MB |
+---------------------------------------------------------------------------------------+-------+-----------------+----------+----------+

I'm going to take a look at the code base to see if I can figure out what's going on with this and whether I can submit a simple fix. I just wanted to get it on the public radar.

@KaminFay thank you for submitting this issue. Unfortunately and fortunately, I was able to reproduce this issue and we will start to look at releasing a fix for it as well.

Just wanted to follow up on this. I figured I'd spend some time to see if I could figure out what's going on here. I may be off base, but I think my fix is working and I should have a PR for you guys tomorrow. Here's the gist of it:

  • The issue stems from the split happening in this pusher function here:
var baseRef, hash string
parts := strings.SplitN(ref, "@", 2)
baseRef = parts[0]
if len(parts) > 1 {
    hash = parts[1]
}
  • When regular (tagged) items are pushed, their ref string usually looks like this (2 parts: ref and hash):
sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0-library/alpine:latest-dev.cosignproject.cosign/imageIndex@sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0

And when split:

baseRef: "sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0-library/alpine:latest-dev.cosignproject.cosign/imageIndex",
hash: "sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0",
  • When items referenced by digest are pushed, their ref string contains a second @ and so has 3 parts (ref, the digest plus kind suffix, hash):
sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064-library/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064-dev.cosignproject.cosign/imageIndex@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064
  • So when you split that it creates a really long "hash":
baseRef: "sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064-library/nginx"
hash: "sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064-dev.cosignproject.cosign/imageIndex@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064"
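The behavior described above can be reproduced in isolation. This is a minimal sketch: `splitFirst` is a hypothetical wrapper around the snippet quoted earlier, not a function from the hauler code base, and the two ref strings are the ones shown in this comment.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFirst wraps the quoted snippet: it cuts ref at the FIRST "@".
func splitFirst(ref string) (baseRef, hash string) {
	parts := strings.SplitN(ref, "@", 2)
	baseRef = parts[0]
	if len(parts) > 1 {
		hash = parts[1]
	}
	return baseRef, hash
}

func main() {
	// Ref for an image that was added by tag (one "@").
	tagged := "sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0" +
		"-library/alpine:latest-dev.cosignproject.cosign/imageIndex" +
		"@sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0"

	// Ref for an image that was added by digest (two "@").
	pinned := "sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064" +
		"-library/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064" +
		"-dev.cosignproject.cosign/imageIndex" +
		"@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064"

	for _, ref := range []string{tagged, pinned} {
		baseRef, hash := splitFirst(ref)
		// A bare digest never contains "@", so a leftover "@" in hash
		// means the split happened in the wrong place.
		fmt.Printf("baseRef=%q\nhash=%q\nhash is a bare digest: %t\n\n",
			baseRef, hash, !strings.Contains(hash, "@"))
	}
}
```

For the tagged ref the hash comes out as a bare digest; for the digest-pinned ref it still contains an "@", which matches the overly long "hash" shown above.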

Proposed Solution:

  • I'll have the PR out to you guys tomorrow, but if my assumptions are right, we simply need to split the string on the last instance of @ rather than the first:
	parts := strings.Split(ref, "@")
	baseRef = strings.Join(parts[:len(parts)-1], "@")
	if len(parts) > 1 {
		hash = parts[len(parts)-1]
	}
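One detail worth checking in the snippet above: for a ref with no "@" at all, `strings.Split` returns a single element, so the `Join` over `parts[:len(parts)-1]` yields an empty baseRef, whereas the original `SplitN` version kept the whole string. The sketch below is one possible way to split on the last "@" while preserving that case. It is not necessarily what the eventual PR does, and `splitLast` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLast cuts ref at the LAST "@". If ref contains no "@", the whole
// string is returned as baseRef and hash is empty.
func splitLast(ref string) (baseRef, hash string) {
	i := strings.LastIndex(ref, "@")
	if i < 0 {
		return ref, ""
	}
	return ref[:i], ref[i+1:]
}

func main() {
	refs := []string{
		// Digest-pinned ref from the issue (two "@").
		"sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064" +
			"-library/nginx@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064" +
			"-dev.cosignproject.cosign/imageIndex" +
			"@sha256:95b01e2e9ab0702ce2f1a8f05f90e6408fd1f4b5e5006c6088ba5a864ed42064",
		// Illustrative ref with no "@" at all.
		"library/alpine:latest",
	}
	for _, ref := range refs {
		baseRef, hash := splitLast(ref)
		fmt.Printf("baseRef=%q\nhash=%q\n\n", baseRef, hash)
	}
}
```

Using `strings.LastIndex` also avoids allocating the intermediate slice and re-joining it, although that is a minor point next to getting the split position right.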

Clearly it took me a minute to get the signing / commit log fixed on that PR. Sorry about the spam. Let me know if this is the solution you'd be looking for or if I'm looking at the problem wrong.

Closing as resolved per PR #259!