sigstore/cosign

Sigstore Bundle as OCI Artifact

bdehamer opened this issue · 15 comments

Many of the Sigstore clients already have support for generating/verifying the protobuf bundle, but adding this support to tools like cosign and the policy-controller requires that we standardize on an approach for storing bundles in an OCI registry.

Proposal

The proposed approach for storing Sigstore bundles in an OCI registry is to follow the guidelines for artifact usage in the OCI image spec.

Publishing

First, the bundle itself is stored in its JSON-serialized form as a blob in the registry:

POST /v2/foo/blobs/uploads/?digest=cafed00d...
Content-Type: application/octet-stream

{"mediaType":"application/vnd.dev.sigstore.bundle+json;version=0.2", ...}

In this example "foo" is the name of the repository within the registry to which the artifact is being uploaded. The digest included as part of the POST is the hex-encoded SHA-256 digest of the raw bytes of the bundle itself.
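A minimal sketch of how a client might compute that digest before uploading, written in Python. The helper name `sha256_digest` and the abbreviated sample bundle are illustrative assumptions; the result is rendered in the `sha256:<hex>` form used by the descriptors later in this proposal.

```python
import hashlib
import json


def sha256_digest(data: bytes) -> str:
    """Return the digest string for the raw bytes of a blob."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Hypothetical, heavily abbreviated bundle used only for illustration.
bundle = {"mediaType": "application/vnd.dev.sigstore.bundle+json;version=0.2"}
bundle_bytes = json.dumps(bundle).encode("utf-8")

# The digest is computed over the exact bytes that get uploaded.
# Re-serializing the JSON later could change the bytes, and so the digest.
bundle_digest = sha256_digest(bundle_bytes)
bundle_size = len(bundle_bytes)
```

Both `bundle_digest` and `bundle_size` are needed again when the manifest's layer descriptor is built.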

Once the blob has been created, the next step is to create a manifest that associates the bundle blob with the image it describes:

PUT /v2/foo/manifests/sha256:badf00d...
Content-Type: application/vnd.oci.image.manifest.v1+json

{
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "schemaVersion": 2,
  "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "sha256:44136fa3...",
    "size": 2
  },
  "layers": [
    {
      "digest": "sha256:cafed00d...",
      "mediaType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "size": 4971
    }
  ],
  "subject": {
    "digest": "sha256:c00010ff...",
    "mediaType": "application/vnd.oci.image.index.v1+json"
   }
}

The manifest has an artifactType field which identifies the type of the artifact being referenced -- in this case, it's the Sigstore bundle media type.

The layers point to one or more blobs that comprise the artifact itself. In this example there is a single layer which points to the blob containing the bundle itself (note that the referenced digest is the same used during the blob upload).

The subject field associates this artifact with some other artifact which already exists in this repository (in this case, an image with the digest c00010ff).

Since Sigstore bundles don't require any additional configuration data, the config field references the empty descriptor.
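As a rough sketch, the manifest described above could be assembled like this in Python. The function names `descriptor` and `bundle_manifest` are hypothetical, and the subject descriptor is passed in as-is because the client already knows it from the image being signed. The empty descriptor refers to the two-byte blob `{}`, which is why its size is 2.

```python
import hashlib

BUNDLE_MEDIA_TYPE = "application/vnd.dev.sigstore.bundle+json;version=0.2"
EMPTY_CONFIG = b"{}"  # content referenced by the empty descriptor


def descriptor(media_type: str, data: bytes) -> dict:
    """Build a descriptor (mediaType, digest, size) for a blob."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }


def bundle_manifest(bundle_bytes: bytes, subject: dict) -> dict:
    """Build an image manifest that attaches one bundle to a subject."""
    return {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "artifactType": BUNDLE_MEDIA_TYPE,
        "config": descriptor("application/vnd.oci.empty.v1+json", EMPTY_CONFIG),
        "layers": [descriptor(BUNDLE_MEDIA_TYPE, bundle_bytes)],
        "subject": subject,
    }
```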

At this point, any registry which supports the referrers API will automatically associate this manifest with the listed subject and make it available in the referrers index for that subject.

If the registry DOES NOT support the referrers API, a referrers list will need to be manually created/updated using the referrers tag scheme.

PUT /v2/foo/manifests/sha256-c00010ff...
Content-Type: application/vnd.oci.image.index.v1+json

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:badf00d..",
      "size": 779,
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2"
    }
  ]
}

This index is uploaded with a tag that references the digest of the image to which all of the listed artifacts are associated. Each of the items in the manifests collection points to some other related artifact.
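A small Python sketch of the fallback bookkeeping, under a few assumptions: the helper names are made up, the subject digest is a plain sha256 digest (so the length limits the tag schema places on each part are not a concern), and concurrent updates to the index by other clients are ignored.

```python
from typing import Optional

INDEX_MEDIA_TYPE = "application/vnd.oci.image.index.v1+json"


def referrers_tag(subject_digest: str) -> str:
    """Turn 'sha256:abc...' into the fallback tag 'sha256-abc...'."""
    algorithm, _, encoded = subject_digest.partition(":")
    return f"{algorithm}-{encoded}"


def add_referrer(index: Optional[dict], new_descriptor: dict) -> dict:
    """Return an index that lists new_descriptor, creating one if needed."""
    if index is None:
        index = {"schemaVersion": 2, "mediaType": INDEX_MEDIA_TYPE, "manifests": []}
    known = {m["digest"] for m in index["manifests"]}
    if new_descriptor["digest"] in known:
        return index  # already listed, nothing to update
    return {**index, "manifests": index["manifests"] + [new_descriptor]}
```

The client would fetch any existing index at `referrers_tag(...)`, pass it through `add_referrer`, and PUT the result back under the same tag.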

Retrieval

When a client wants to locate Sigstore bundles which may be associated with a given image, it would first make a request to the referrers API with the image's digest:

GET /v2/foo/referrers/sha256:c00010ff...

A 404 Not Found response indicates that the registry does not support the referrers API and the referrers tag scheme should be used as a fallback:

GET /v2/foo/manifests/sha256-c00010ff...

A 404 here would indicate that there are no artifacts associated with the image.
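The two-step lookup can be sketched as follows. To keep the example self-contained, the HTTP layer is passed in as a callable `http_get(path)` returning a `(status, parsed_body)` pair; that callable and the function name `find_referrers` are assumptions for illustration, not part of any existing client.

```python
from typing import Callable, Optional, Tuple

HttpGet = Callable[[str], Tuple[int, Optional[dict]]]


def find_referrers(http_get: HttpGet, repo: str, subject_digest: str) -> Optional[dict]:
    """Return the referrers index for a subject, or None if there is none."""
    # First choice: the referrers API.
    status, body = http_get(f"/v2/{repo}/referrers/{subject_digest}")
    if status == 200:
        return body
    if status == 404:
        # The API is not supported; fall back to the referrers tag scheme.
        tag = subject_digest.replace(":", "-", 1)
        status, body = http_get(f"/v2/{repo}/manifests/{tag}")
        if status == 200:
            return body
        if status == 404:
            return None  # no artifacts are associated with the image
    raise RuntimeError(f"unexpected registry response: {status}")
```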

Assuming there are artifacts present, one of the two above calls will return an image index listing the artifacts which have been associated with the specified image:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:badf00d..",
      "size": 779,
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2"
    }
  ]
}

From this the client can identify any Sigstore bundles by looking at the artifactType field.
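For instance, a client might filter the index along these lines (a sketch; the function name is hypothetical and the match is an exact string comparison on the artifact type used in this proposal):

```python
BUNDLE_ARTIFACT_TYPE = "application/vnd.dev.sigstore.bundle+json;version=0.2"


def bundle_descriptors(index: dict) -> list:
    """Return the descriptors in a referrers index that are Sigstore bundles."""
    return [
        entry
        for entry in index.get("manifests", [])
        if entry.get("artifactType") == BUNDLE_ARTIFACT_TYPE
    ]
```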

Using the digest listed in the image index, the next step is to retrieve the manifest for the bundle:

GET /v2/foo/manifests/sha256:badf00d..
{
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "schemaVersion": 2,
  "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "sha256:44136fa3...",
    "size": 2
  },
  "layers": [
    {
      "digest": "sha256:cafed00d...",
      "mediaType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "size": 4971
    }
  ],
  "subject": {
    "digest": "sha256:c00010ff...",
    "mediaType": "application/vnd.oci.image.index.v1+json"
   }
}

The final step is to use the digest from the first of the layers to retrieve the bundle blob:

GET /v2/foo/blobs/sha256:cafed00d...
{
  "mediaType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
  "verificationMaterial": {...},
  "messageSignature": {...}
}
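Because blobs are content-addressed, a client would presumably also want to confirm that the bytes it downloaded match the layer descriptor from the manifest. A minimal sketch of that check, with hypothetical function names and assuming sha256 digests:

```python
import hashlib


def first_layer(manifest: dict) -> dict:
    """Return the descriptor of the first layer, i.e. the bundle blob."""
    return manifest["layers"][0]


def verify_blob(layer: dict, blob: bytes) -> bytes:
    """Raise ValueError unless blob matches the size and digest in layer."""
    if len(blob) != layer["size"]:
        raise ValueError("blob size does not match descriptor")
    algorithm, _, expected = layer["digest"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    if hashlib.sha256(blob).hexdigest() != expected:
        raise ValueError("blob digest does not match descriptor")
    return blob
```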

Bundle Identification

For any given image, there may be any number of referring artifacts. If there are multiple Sigstore bundles associated with an image (say a build provenance bundle and a signed SBOM bundle) it may be difficult to identify which artifact is which in the image index:

{
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "schemaVersion": 2,
  "manifests": [
    {
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "digest": "sha256:facefeed",
      "mediaType": "application/vnd.oci.image.manifest.v1+json"
    },
    {
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "digest": "sha256:d0d0caca",
      "mediaType": "application/vnd.oci.image.manifest.v1+json"
    }
  ]
}

In the example above there are two Sigstore bundles associated with some image but there is no way to distinguish between them without going through the process of downloading each one and inspecting the contents.

One approach we might employ to help disambiguate bundles is the use of annotations to surface additional information about the contents of the bundle. In situations where Sigstore is being used to sign an in-toto statement we could surface the statement predicate type as an annotation:

{
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "schemaVersion": 2,
  "manifests": [
    {
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "digest": "sha256:facefeed",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "annotations": {
        "dev.sigstore.bundle/predicateType": "https://slsa.dev/provenance/v1"
      }
    },
    {
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "digest": "sha256:d0d0caca",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "annotations": {
        "dev.sigstore.bundle/predicateType": "https://spdx.dev/Document/v2.3"
      }
    }
  ]
}

Now the purpose of each of the bundles listed in the image index is clear.
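With the annotation in place, a verifier could pick out the bundles it cares about without downloading any of them. A sketch, using the annotation key from the example above and a hypothetical function name:

```python
PREDICATE_ANNOTATION = "dev.sigstore.bundle/predicateType"


def bundles_with_predicate(index: dict, predicate_type: str) -> list:
    """Return digests of bundle manifests annotated with predicate_type."""
    return [
        entry["digest"]
        for entry in index.get("manifests", [])
        if entry.get("annotations", {}).get(PREDICATE_ANNOTATION) == predicate_type
    ]
```

Called with `"https://slsa.dev/provenance/v1"` against the index above, this would return only `sha256:facefeed`.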

Related Issues

Hi folks, the proposal looks great so far!
Is the ability to sign OCI artifacts, not just images (without the experimental flag), also considered here?

For example, if I create a Trivy scan report and push it to the registry without using in-toto (the Trivy docs use in-toto), will I be able to sign it with Cosign?

@vishal-chdhry This proposal is really only focused on how we would store a Sigstore bundle in an OCI registry. It will apply regardless of what is contained in the bundle itself (image signature, SBOM, scan report, etc.).

I can't speak to what support may be added to cosign for specific artifact types, but the bundle format itself is flexible enough to encapsulate pretty much any artifact you'd like to attest.

Thanks so much for exploring this and starting the conversation, I'm very excited to see this kick off.

I'm not very familiar with OCI, so some of these questions will be a bit high level and won't get into the nitty gritty details of implementation. I'd love for @hectorj2f @vaikas @jonjohnsonjr @dlorenc @priyawadhwa to take a look at this too.

Have we confirmed that the major registries (namely the ones listed in https://github.com/sigstore/cosign?tab=readme-ov-file#registry-support) support OCI v1.1, particularly the referrers API? Or do we expect that most will rely on the fallback Referrers Tag Schema?

Have we also explored the use of annotations rather than the reference types, attached to the .sig or .att tags? The motivation for this would be a) it'll be far simpler to add to the existing codebase, and b) it covers registries that lack support for OCI v1.1. Of course this is continuing to build on a non-standard solution. Another (non-standard) alternative is to add another tag type like .bundle, but given the existing tags, if we were to go with annotations, I'd lean towards leveraging the existing tag types rather than adding new ones.
Edit: You mentioned size as a concern on Slack, have we seen registry or spec limitations for the size of an annotation that would prevent this?

For bundle identification, could we also have different artifact types for each use case, something like application/vnd.dev.sigstore.bundle+json;version=0.2;predicate=slsa/v1?

How to implement this is the next question. It seems like there is experimental support for OCI v1.1 added in #2684. Have you looked into this? I'd like to gauge how much work and changes to Cosign this proposal will involve.

Considering breaking changes: Bundle usage is the direction we want to go, but even with a major revision of Cosign, I don't want to drop support for verifying signed containers that leverage tags/annotations. Maybe in another major revision, but even then, I'm apprehensive to break verification for everything that's been signed before this release. If there's already experimental support for OCI v1.1 in the codebase, then maybe this isn't an issue and it'll be straightforward to continue to support both.

Have we confirmed that the major registries (namely the ones listed in https://github.com/sigstore/cosign?tab=readme-ov-file#registry-support) support OCI v1.1, particularly the referrers API? Or do we expect that most will rely on the fallback Referrers Tag Schema?

I've done testing with a bunch of these registries to check compatibility with the proposed scheme. Here are all of the registries I've found to have full OCI Image Manifest v1.1 support (including the artifactType and subject fields) and that work with either the Referrers API or the Referrers Tag Schema:

  • GitHub Container Registry - Referrers Tag Schema
  • Azure Container Registry - Referrers API
  • Docker Hub - Referrers Tag Schema
  • GCP Artifact Registry - Referrers Tag Schema
  • Zot - Referrers API
  • CNCF distribution/distribution Registry - Referrers Tag Schema
  • CNCF Harbor - Referrers API
  • Digital Ocean Container Registry - Referrers Tag Schema
  • JFrog Artifactory Container Registry - Referrers Tag Schema
  • RedHat quay.io - Referrers Tag Schema

The following registries have some compatibility issues:

  • AWS Elastic Container Registry (ECR) - Supports OCI Image Manifest v1.1 and Referrers Tag Schema. Rejects the Sigstore Bundle media type when used in the image manifest (see aws/containers-roadmap#2306).
  • Alibaba Cloud Container Registry - Partial OCI Image Manifest v1.1 support -- rejects the application/vnd.oci.empty.v1+json media type, but works otherwise.
  • CloudSmith Container Registry - Supports OCI Image Manifest v1.1 and Referrers Tag Schema. Only issue I encountered was when uploading the vnd.oci.image.index.v1+json for the referrers tag -- the registry insisted that the manifest descriptor contain a platform field (which in turn was required to specify values for both os and architecture) which doesn't really make any sense in this context.

Have we also explored the use of annotations rather than the reference types, attached to the .sig or .att tags?

Beyond propagating this non-standard solution, the other thing which concerns me here is the possibility of bumping into size constraints imposed on the image manifest. SBOM attestations can easily result in multi-megabyte bundles. The OCI spec has the following to say:

A registry SHOULD enforce some limit on the maximum manifest size that it can accept. A registry that enforces this limit SHOULD respond to a request to push a manifest over this limit with a response code 413 Payload Too Large. Client and registry implementations SHOULD expect to be able to support manifest pushes of at least 4 megabytes.

I haven't tested any of the registries to see what the various limits may be, but given that the blob-upload endpoints are already designed to accept arbitrarily large payloads (with support for things like chunked upload), this seems like the safer approach.

Another thing to note is that going with a standards-based approach like this (as opposed to the annotations or custom tags) also gives us the ability to use 3rd-party tooling like the oras CLI to push/pull Sigstore bundles:

If I have already assembled a Sigstore Bundle ("bundle.json") I can associate it with the image "bdehamer/foo" with the following:

oras attach \
  --artifact-type "application/vnd.dev.sigstore.bundle+json;version=0.2" \
  index.docker.io/bdehamer/foo:latest \
  bundle.json

Similarly, I can see what artifacts are attached to my image with the following:

oras discover index.docker.io/bdehamer/foo:latest
Discovered 2 artifacts referencing latest
Digest: sha256:87c9e1d785e7da6349c71f2726e5f7eb2da8800c800181139734816594c97331

Artifact Type                                          Digest
application/vnd.dev.sigstore.bundle+json;version=0.2   sha256:77b80fd375e6d211115ad30107bbc85746f85b083989c724f81855ace728ecfe
application/vnd.dev.sigstore.bundle+json;version=0.2   sha256:7a5cd27a2cc2d0b5f7db97646a004cef6d696379486153d1965dcd82076230bc

The oras CLI implements exactly the same upload/download schemes I described above.

I've done testing with some of those registries:

That's pretty decent support. I do wonder if we should first focus on using the referrers tag and skip the referrers API work since there is little support for the latter.

Beyond propagating this non-standard solution, the other thing which concerns me here is the possibility of bumping into size constraints imposed on the image manifest. SBOM attestations can easily result in multi-megabyte bundles

We'll also run into the size issue if we attach multiple SBOM attestations to a single tag, which I believe we do now when re-signing the same container.

Also cc @imjasonh too to take a look

I do wonder if we should first focus on using the referrers tag and skip the referrers API work since there is little support for the latter.

The incremental work to support the referrers API on top of the referrers tag scheme is minimal so there's not much to be saved by supporting one but not the other.

On the publishing side, there's really no extra work at all -- if you see the right HTTP header returned when uploading the artifact manifest, you know you're done (no need to create/update the index).
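The comment above does not name the header. The OCI distribution spec (v1.1) defines `OCI-Subject` for this purpose, so the sketch below assumes that header; the function name is hypothetical.

```python
def needs_fallback_tag(response_headers: dict, subject_digest: str) -> bool:
    """Decide whether the client must update the referrers tag itself.

    response_headers are the headers returned by the manifest PUT.
    A registry that handled the subject echoes its digest back in
    OCI-Subject; if it does not, the client maintains the index.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in response_headers.items()}
    return normalized.get("oci-subject") != subject_digest
```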

This is great work @bdehamer ✨, based on my knowledge of OCI this is a very solid approach.

For bundle identification, could we also have different artifact types for each use case, something like application/vnd.dev.sigstore.bundle+json;version=0.2;predicate=slsa/v1?

I like what @haydentherapper mentioned here. I definitely wouldn't want to rely on downloading all the bundles to find out what each one contains :/.

The incremental work to support the referrers API on top of the referrers tag scheme is minimal so there's not much to be saved by supporting one but not the other.

It sounds good to me as long as we are not locked into a single approach.

Strong work in getting all this written down. Everything else is solid, and I'd like to sort out how to easily figure out which bundle to pull during verification, which is certainly the most common use case. You attest once and verify multiple times, so just to overstate the obvious, we don't want to have to pull all the bundles before verification. Baking it into the mediaType seems a little wrong to me because it's starting to leak a bit of what's in the attestation. I don't really feel strongly against it, my gut just says it's a slippery slope. Also, using the annotations would be more flexible in the sense that I'd reckon there's going to be more stuff there later on.
And yeah, stuffing the whole attestation into the annotation seems wrong due to size. But who's ever going to need more than 64k, amiright??

For any given image, there may be any number of referring artifacts. If there are multiple Sigstore bundles associated with an image (say a build provenance bundle and a signed SBOM bundle) it may be difficult to identify which artifact is which in the image index

So, we have two proposals for differentiating between bundles in the referrers index:

  1. Use an annotation to surface the predicate of the in-toto statement contained within the bundle:

    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "schemaVersion": 2,
      "manifests": [
        {
          "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
          "digest": "sha256:facefeed",
          "mediaType": "application/vnd.oci.image.manifest.v1+json",
          "annotations": {
            "dev.sigstore.bundle/predicateType": "https://slsa.dev/provenance/v1"
          }
        },
        {
          "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
          "digest": "sha256:d0d0caca",
          "mediaType": "application/vnd.oci.image.manifest.v1+json",
          "annotations": {
            "dev.sigstore.bundle/predicateType": "https://spdx.dev/Document/v2.3"
          }
        }
      ]
    }
  2. Tack the predicate on as a parameter on the artifact type:

    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "schemaVersion": 2,
      "manifests": [
        {
          "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2;predicate=https://slsa.dev/provenance/v1",
          "digest": "sha256:facefeed",
          "mediaType": "application/vnd.oci.image.manifest.v1+json"
        },
        {
          "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2;predicate=https://spdx.dev/Document/v2.3",
          "digest": "sha256:d0d0caca",
          "mediaType": "application/vnd.oci.image.manifest.v1+json"
        }
      ]
    }

I'd be interested in any other proposals people may have or endorsements for one of the two options above.

I have a slight preference in favor of using an annotation to specify the predicate type. It seems cleaner not to mash more metadata into the artifactType (there's a slippery slope there). Doing so complicates parsing slightly and seems to set a precedent for adding the predicate type to the bundle mediaType, which is redundant and opens questions about requiring clients to verify that the predicateType in the bundle matches the mediaType. Maybe that's worth it, but I would open that question in protobuf-specs if there is more demand for it outside of this use case. In the context of cosign/policy-controller, the simplest thing to do IMO is just add the annotation like in your proposal. Open to others' perspectives on this though!
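To illustrate the parsing cost of the second option, here is a naive Python sketch of splitting such an artifactType back into its base type and parameters. The function name is hypothetical, and the sketch assumes parameter values never contain a semicolon, which a predicate URL is not guaranteed to satisfy.

```python
def split_artifact_type(value: str) -> tuple:
    """Split 'type/subtype;k=v;...' into ('type/subtype', {'k': 'v', ...})."""
    base, *params = value.split(";")
    parsed = {}
    for param in params:
        # Split on the first '=' only, so values may themselves contain '='.
        name, _, param_value = param.strip().partition("=")
        parsed[name] = param_value
    return base.strip(), parsed
```

With the annotation approach none of this is needed: the client compares artifactType as an opaque string and reads the predicate type from a separate key.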

I'm supportive of using the annotation. Anyone else have any suggestions?

Overall, LGTM on this proposal. Thanks so much for getting this started! Do you want to submit this as a spec in the cosign/specs folder?

One approach we might employ to help disambiguate bundles is the use of annotations to surface additional information about the contents of the bundle. In situations where Sigstore is being used to sign an in-toto statement we could surface the statement predicate type as an annotation:

{
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "schemaVersion": 2,
  "manifests": [
    {
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "digest": "sha256:facefeed",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "annotations": {
        "dev.sigstore.bundle/predicateType": "https://slsa.dev/provenance/v1"
      }
    },
    {
      "artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2",
      "digest": "sha256:d0d0caca",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "annotations": {
        "dev.sigstore.bundle/predicateType": "https://spdx.dev/Document/v2.3"
      }
    }
  ]
}

This looks like a good high-level approach from the view of what we were intending from OCI. We were considering what annotations consumers might want to include for filtering a list of results, and a common idea was the identity of the signer, along with the creation date, so that tooling could quickly locate the most recent signature from a security team. I'd consider it from the perspective of the consumer tooling: what does it need in order to quickly find the right artifact?

"artifactType": "application/vnd.dev.sigstore.bundle+json;version=0.2"

This may not technically be correct in the spec. The field needs to follow the naming requirements in RFC 6838 Section 4.2, and parameters are an addition that isn't part of the media type itself. Registries that validate the field may reject the manifest. The way OCI has done this in other places is to include the version in the media type, e.g. something like: application/vnd.dev.sigstore.bundle.v0.2+json.
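A rough way to see the problem is to check both candidate values against the restricted-name grammar from RFC 6838 Section 4.2 (a letter or digit followed by up to 126 characters drawn from letters, digits, and `!#$&-^_.+`). This is only a sketch of the kind of validation a registry might perform, not a description of what any particular registry does.

```python
import re

# restricted-name per RFC 6838 section 4.2, applied to type and subtype.
RESTRICTED_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}"
MEDIA_TYPE = re.compile(RESTRICTED_NAME + "/" + RESTRICTED_NAME)


def is_bare_media_type(value: str) -> bool:
    """True if value is 'type/subtype' with no parameters attached."""
    return MEDIA_TYPE.fullmatch(value) is not None
```

Under this check the parameterized form fails because `;` and `=` are not allowed in a name, while the suggested `application/vnd.dev.sigstore.bundle.v0.2+json` passes.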

You mentioned size as a concern on Slack, have we seen registry or spec limitations for the size of an annotation that would prevent this?

The suggestion is to keep manifests below 4MiB. Since the referrers response is a collection of descriptors with the annotations and artifactType values pulled up, and registries want to avoid too much pagination on the API (pagination that isn't supported with the fallback tag), I'm hearing a rough limit of 40kb of annotation content before some registries may start to reject the manifest push. That allows 100 descriptors to be included per page of the referrers response without exceeding the 4MiB limit.
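The arithmetic behind that rough limit, reading "40kb" as 40 KiB (an assumption about the intended unit):

```python
KIB = 1024
MIB = 1024 * KIB

descriptors_per_page = 100
annotation_budget = 40 * KIB  # rough per-manifest annotation limit
response_limit = 4 * MIB      # suggested ceiling for a manifest or response

# Worst case: every descriptor on the page carries the full annotation budget.
annotation_total = descriptors_per_page * annotation_budget
headroom = response_limit - annotation_total
```

That leaves `headroom` bytes (a little under 100 KiB) for the remaining descriptor fields such as digest, mediaType, size, and artifactType.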

That's pretty decent support. I do wonder if we should first focus on using the referrers tag and skip the referrers API work since there is little support for the latter.

This was an approach used by another project and they are now stuck creating the fallback tag on registries that support the referrers API to avoid breaking older clients. They'll see compatibility issues with other referrers aware tooling that may copy content across registries without the fallback tag. So I agree with the suggestion to use the API as the spec recommends. There's already going to be one transition for tooling from the existing sha256-xxxxx.att tags to the new spec, you don't want to make that two transitions. For the transition, you typically want to upgrade clients doing the consuming first. And once they are all updated, transition the clients producing the attestations, perhaps with a flag that changes the default value from "push with the sha256-xxx.att tag" to "push using the referrers API".