coreos/coreos-assembler

Make `push-container-manifest` idempotent

Closed this issue · 8 comments

We want to be able to run the release job multiple times (in case one run fails in an intermediate stage) and have each stage no-op if there is nothing to do (i.e. don't recreate things that already exist).

We need to update `push-container-manifest` to check the remote first and skip pushing a new manifest list when the one already there is correct. We'd probably make an update somewhere in this area of the code.

I'm thinking we can do something like this:

  • grab the digest from the meta.json for each arch of the toplevel entry for this container (if it exists)
  • if all arches have an entry with a digest in meta.json (implies upload already happened)
    • check the remote container manifest list and see if it already matches each arch digest
      • if it does, do nothing, else continue push operation
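A rough sketch of that check, to make the flow concrete. The names `arch_digest` and `needs_push` are hypothetical, as is the assumed layout (a per-arch meta.json dict whose top-level entry carries a `digest` field, and a remote map already keyed by the same arch names). This is not the existing cosa code.

```python
from typing import Dict, Optional


def arch_digest(meta: dict, key: str) -> Optional[str]:
    """Return the digest recorded under the top-level `key`, if any.

    Assumes (hypothetically) an entry shaped like {"digest": "sha256:..."}.
    """
    entry = meta.get(key)
    if isinstance(entry, dict):
        return entry.get("digest")
    return None


def needs_push(metas: Dict[str, dict], key: str, remote: Dict[str, str]) -> bool:
    """Decide whether the manifest list still has to be pushed.

    metas:  arch -> parsed meta.json for that arch
    remote: arch -> digest listed in the remote manifest list
    """
    local = {arch: arch_digest(meta, key) for arch, meta in metas.items()}
    # Any arch without a recorded digest means the upload never completed.
    if not local or any(digest is None for digest in local.values()):
        return True
    # Push only if the remote list differs from what meta.json records.
    return local != remote
```

Comparing the two maps as a whole also catches a remote list that carries an extra or missing arch, not just a changed digest.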

When inspecting the remote container we can get the digests by doing something like:

skopeo inspect --raw docker://quay.io/fedora/fedora-coreos:stable | jq .
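For reference, a minimal sketch of turning that raw output into an arch-to-digest map. It assumes the remote is a manifest list / OCI index with a `manifests` array whose entries have `digest` and `platform.architecture`. The function names and the Go-arch to RPM-arch translation table are my own additions, on the assumption that meta.json is keyed by names like `x86_64` and `aarch64`.

```python
import json
import subprocess
from typing import Dict

# Manifest lists use Go-style arch names; translate the ones that differ.
GOARCH_TO_RPM = {"amd64": "x86_64", "arm64": "aarch64"}


def fetch_raw_manifest(image: str) -> str:
    """Run `skopeo inspect --raw` and return its stdout (needs skopeo installed)."""
    result = subprocess.run(
        ["skopeo", "inspect", "--raw", f"docker://{image}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def remote_arch_digests(raw: str) -> Dict[str, str]:
    """Map arch -> digest from a raw manifest list document."""
    doc = json.loads(raw)
    digests = {}
    # A plain single-arch manifest has no "manifests" key, so this yields {}.
    for entry in doc.get("manifests", []):
        goarch = entry.get("platform", {}).get("architecture")
        if goarch is None or "digest" not in entry:
            continue
        digests[GOARCH_TO_RPM.get(goarch, goarch)] = entry["digest"]
    return digests
```

Usage would be something like `remote_arch_digests(fetch_raw_manifest("quay.io/fedora/fedora-coreos:stable"))`.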

Some things I'd add to the above:

  1. Very early on, check if meta.json already has the top-level key we're going to insert. If so, no-op.
  2. Add a --force switch to override that check and also the digest check described above.

This makes it more similar to the semantics of other cosa commands.
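Roughly what I have in mind for the early check and the switch, as a sketch only. `should_skip` is a hypothetical helper, and the parser here is a stand-in, not the command's actual argument handling.

```python
import argparse
from typing import Dict, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="push-container-manifest")
    parser.add_argument(
        "--force", action="store_true",
        help="push even if meta.json or the remote says there is nothing to do",
    )
    return parser.parse_args(argv)


def should_skip(metas: Dict[str, dict], key: str, force: bool) -> bool:
    """No-op early when every arch meta.json already has the top-level key."""
    if force:
        return False
    return bool(metas) and all(key in meta for meta in metas.values())
```

With this, a rerun of the release job exits right away unless `--force` is passed.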

> 1. Very early on, check if meta.json already has the top-level key we're going to insert. If so, no-op.

Oh right. None of them will have the top-level key if the operation failed at all. So yes, we can just check that the top-level keys exist (in the arch meta.json files) and then no-op out of it. That makes the check much simpler.

I think we still also want what you originally described, though, in case the release job failed before we were actually able to re-upload the updated meta.json. In that case we would no-op the upload, but not the meta.json update.
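Putting the two checks together, the decision could look something like the sketch below. `Plan` and `plan` are hypothetical names, and the inputs are assumed to be computed elsewhere (whether every arch meta.json has the key, and whether the remote manifest list already matches).

```python
from typing import NamedTuple


class Plan(NamedTuple):
    push_manifest: bool
    update_meta: bool


def plan(has_key: bool, remote_matches: bool, force: bool = False) -> Plan:
    """Pick which steps to run on a (re)run of the release job."""
    if force:
        # --force overrides both checks.
        return Plan(push_manifest=True, update_meta=True)
    if has_key:
        # A previous run finished completely: nothing to do.
        return Plan(push_manifest=False, update_meta=False)
    # meta.json was never updated. Skip the push if the remote is already
    # correct, but still record the result in meta.json.
    return Plan(push_manifest=not remote_matches, update_meta=True)
```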

(Makes sense, but note that with #2685 this issue goes away, because the container is the root, the axis around which disk images revolve, the source of truth, and not just another thing pushed out.)

We're talking about more than one container here.

The extensions container? Yeah, true. Though not for FCOS. And hopefully at some point we can kill that for RHCOS too.

We'll have a kubevirt container pretty soon too.

Yeah, makes sense; we do want idempotence for that indeed.