opencontainers/distribution-spec

Proposal: Cross repo referrers

asafalgawi opened this issue · 8 comments

The referrers API does not currently specify whether referrers must live in the same repo as the subject or can span repositories.
Empirical testing suggests that API implementations return results only from the same repo.

Having the spec require some means of allowing cross-repo references (same repo, same namespace, etc.) would allow better separation between images and related artifacts such as SBOMs or signatures.

Could you elaborate on this? There is an interesting scenario where a client might want to request, say, SBOMs or signatures from a different repository. The thing to discuss here may be discoverability, i.e. where to query the referrers for a given subject. That could lead to an API or a pattern, with authentication considerations as well.
A couple of options:

  1. Client-driven mapping: the client requests referrers for imageA from repoB
  2. The referrers response for imageA provides a Location header pointing to repoB (which might even be in an entirely different registry)

One of the scenarios I was thinking about is having permission separation.

Most or all registries do not permit pushing an SBOM or similar artifact to a repo without full permissions on that repo.

Being able to reference resources outside the repo would make it possible to let third-party vendors push SBOMs for images in a repo without granting them permission to modify existing images.

It seems to me the main use case for referrers is to identify artifacts linked to some parent image. But the parent can't be aware of references in some other repo or registry. Maybe an option to query for referrers in the third-party registry while specifying a full reference to the parent could work. That way it is up to the caller to know where references might be, and also how to authenticate against that registry.

The current spec handles the scenario where the client knows the repository that contains additional metadata. Clients would query the referrers API in that alternate repository and receive a response even if the subject manifest does not exist in the repository. Working through that scenario was the main reason for the last few months of delay in the 1.1.0 release cycle.
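As a rough sketch of that client-driven scenario, the snippet below builds the referrers request URL for an alternate repository. The endpoint shape `/v2/<name>/referrers/<digest>` and the optional `artifactType` query parameter follow the 1.1 spec; the registry host, repository names, artifact type, and digest are made-up placeholders, and authentication is left out entirely.

```python
from urllib.parse import quote, urlencode


def referrers_url(registry, repo, subject_digest, artifact_type=None):
    """Build the referrers API URL for `subject_digest` as seen from `repo`."""
    url = f"https://{registry}/v2/{repo}/referrers/{quote(subject_digest, safe=':')}"
    if artifact_type:
        # Optional filter on the artifact type of the returned descriptors.
        url += "?" + urlencode({"artifactType": artifact_type})
    return url


# Hypothetical layout: the image lives in "team/app", but the client knows
# that SBOMs for it are pushed to "vendor/app-sboms", so it queries that repo.
digest = "sha256:" + "a" * 64  # placeholder subject digest
url = referrers_url(
    "registry.example.com",
    "vendor/app-sboms",
    digest,
    artifact_type="application/spdx+json",
)
```

The mapping from image repo to metadata repo lives entirely in the client here; nothing in the response tells the client where else to look.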

What isn't supported is querying one repository and getting a list of other repositories that contain referrers. As Sajay mentions, there are access considerations for that. The current auth mechanisms don't really handle knowing all the repositories a user has access to if they have not explicitly requested access.

At a higher level, I'd be concerned if any user on a major cloud registry could associate metadata with trusted images on that same registry. That could allow the image to bypass security checks, e.g. providing an SBOM to old images to falsely indicate they do not have vulnerable libraries, or a VEX report that falsely indicates all of the vulnerabilities are invalid and should be ignored. Rather than increasing security by allowing repositories to be locked down, I worry it would decrease security by allowing other users to associate metadata with content in a locked down repository.

TBH my use case was mainly about the approach where you actively query the alternate repository.
But I must say the docs do not make it clear that this use case is supported.

The text:

A registry MUST initially accept an otherwise valid manifest with a subject field that references a manifest that does not exist in the repository, allowing clients to push a manifest and referrers to that manifest in either order.

caused a lot of debate in the spec since many registries want to be allowed to reject a manifest if the subject digest does not exist in the repo. Note that the "initially" term gives registries the option to perform a garbage collection on untagged content.

Other text that allowed registries to return an empty response until the subject digest was pushed was also removed, so registries must include a referrers response even if the subject digest does not exist in the repo.
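To make the subject field concrete, here is a minimal sketch of an artifact manifest whose subject points at a digest that need not exist in the repo it is pushed to. Field names follow the OCI image manifest format; the artifact type, the subject digest and size, and the helper names are hypothetical, and the empty layer stands in for a real SBOM or signature blob.

```python
import hashlib
import json

EMPTY_BLOB = b"{}"  # empty JSON payload, used here as placeholder config and layer


def descriptor(media_type, blob):
    """Describe a blob by media type, sha256 digest, and size."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def referrer_manifest(subject, artifact_type):
    """Build an image manifest that refers to `subject` via the subject field."""
    empty = descriptor("application/vnd.oci.empty.v1+json", EMPTY_BLOB)
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "artifactType": artifact_type,
        "config": empty,
        "layers": [empty],  # a real artifact would carry its payload here
        "subject": subject,
    }


# Placeholder subject: the image manifest may live in a different repo,
# or may not have been pushed anywhere yet.
subject = {
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "digest": "sha256:" + "b" * 64,
    "size": 1234,
}
manifest = referrer_manifest(subject, "application/vnd.example.sbom.v1")
body = json.dumps(manifest, indent=2)  # what a client would PUT to the manifests endpoint
```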

If this is the case, does it mean I can't rely on the referrer remaining there as time passes?

OCI doesn't define GC policies; registries are free to implement their own. Some delete anything over a few hours old, others require tags, others have negotiated lifecycle policies for legal reasons. That's beyond the scope of pushing an artifact to a registry, and you'd need to speak with the individual registry provider to understand their GC policy.

I think the following spec sentence is a direct answer to the original question:

Each descriptor is of an image manifest or index in the same <name> namespace with a subject field that specifies the value of <digest>.

Which means "no, you can't query one repository for referrers in a different repository". For the other scenario, when and how a registry decides to GC content is something you'd need to work out on a registry-by-registry basis. But the spec does say that the API will work if the artifact with the subject field exists in the repository being queried, and that the artifact can be pushed before the subject manifest exists (potentially without ever pushing the subject manifest).
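The scoping described above can be illustrated with a toy in-memory model. This is only a sketch of the behavior as discussed in this thread, not how any real registry is implemented; the class, repo names, and digests are all hypothetical.

```python
from collections import defaultdict


class ToyRegistry:
    """In-memory stand-in for a registry, for illustration only."""

    def __init__(self):
        # repo name -> list of (manifest digest, manifest dict)
        self._manifests = defaultdict(list)

    def push(self, repo, digest, manifest):
        # Accepted even when the manifest's subject is absent from `repo`.
        self._manifests[repo].append((digest, manifest))

    def referrers(self, repo, subject_digest):
        # Only manifests stored in `repo` itself are considered.
        return [
            digest
            for digest, manifest in self._manifests[repo]
            if manifest.get("subject", {}).get("digest") == subject_digest
        ]


reg = ToyRegistry()
image = "sha256:" + "1" * 64
sbom = "sha256:" + "2" * 64
reg.push("team/app", image, {"schemaVersion": 2})
reg.push("vendor/sboms", sbom, {"schemaVersion": 2, "subject": {"digest": image}})

in_image_repo = reg.referrers("team/app", image)     # nothing in this repo refers to the image
in_sbom_repo = reg.referrers("vendor/sboms", image)  # found, although the image is not in this repo
```

Querying `team/app` never surfaces the SBOM pushed to `vendor/sboms`; the client has to know to ask the second repo.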

I think that's as much as we can do from the distribution-spec, but feel free to keep the discussion going if there is language in the spec that we need to clarify.