anchore/syft

Ability to add custom SBOM components as files within Docker image

Opened this issue · 6 comments

What would you like to be added:

In this blog:

https://www.docker.com/blog/generate-sboms-with-buildkit/

it is implied that Syft allows custom SBOM entries to be added for files that are manually added to an image and not through a package manager. The implication is that it should be possible to add details of manually added files and that Syft will pick these up during its scan of the image. I cannot get this to work as described in the blog (adding a heredoc to the image in the Dockerfile).

Why is this needed:

It would be very useful to have a defined way to add custom annotations so that the SBOM is complete.

Additional context:

👋 Thanks @nycnewman for the question!

In its current form the build-kit scanner plugin used in the blog post is not able to pick up on supplemental SBOMs.

It was previously able to do this when it used syft v0.98.0 because the SBOM cataloger was enabled by default.

When they upgraded to v0.105.0 this default support was dropped.

We think this is the line that should be updated:
https://github.com/docker/buildkit-syft-scanner/blob/f22f9866c24554ba087f5798d03a41c39f3e5f7b/internal/target.go#L49

syft.CreateSBOM(context.Background(), nil, syft.DefaultCreateSBOMConfig().
	WithCatalogerSelection(pkgcataloging.NewSelectionRequest().
		WithDefaults(pkgcataloging.ImageTag).
+ 			WithAdditions("sbom-cataloger"),
	),
)

Note the new call: WithAdditions("sbom-cataloger"). That should reestablish the behavior from when the blog post was written.

How to merge those SBOMs outside of the plugin

Current state: When you have an image as described in the blog post where you've supplemented an image with a build, and had the build generate the SBOM for you, then syft has a cataloger that can be enabled to detect and merge this document with the final results.

Enable via the CLI

syft <your-image> --select-catalogers "+sbom-cataloger"

When you have the final image, and both the supplemental and generated SBOM are in the image, syft can be used with the above options to achieve the previously seen behavior by scanning the image directly.
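For anyone scripting this check, here is a minimal Python sketch. It assumes `syft` is on the PATH and that the `syft-json` output lists packages under a top-level `artifacts` array with a `foundBy` field on each entry. The helper names `build_syft_command`, `packages_found_by`, and `scan` are hypothetical, not part of syft.

```python
import json
import subprocess


def build_syft_command(image: str) -> list[str]:
    """Build the syft invocation shown above, asking for JSON output."""
    return [
        "syft", image,
        "--select-catalogers", "+sbom-cataloger",
        "-o", "syft-json",
    ]


def packages_found_by(report: dict, cataloger: str) -> list[str]:
    """Return names of packages whose `foundBy` field equals `cataloger`.

    Assumes this layout: {"artifacts": [{"name": ..., "foundBy": ...}]}.
    """
    return [
        pkg.get("name", "")
        for pkg in report.get("artifacts", [])
        if pkg.get("foundBy") == cataloger
    ]


def scan(image: str) -> list[str]:
    """Run syft against `image` and list packages merged from embedded SBOMs."""
    result = subprocess.run(
        build_syft_command(image), capture_output=True, text=True, check=True
    )
    return packages_found_by(json.loads(result.stdout), "sbom-cataloger")
```

Calling `scan("test")` on an image that carries a supplemental SBOM should return the package names contributed by that SBOM; an empty list suggests the cataloger found nothing to merge.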

Thank you for the quick response. I had tried that without success, though with a minor change.

  • I used the following test Dockerfile
FROM nginx:latest

#RUN dpkg --purge libpython3.9-minimal
#RUN dpkg --purge python3.9-minimal

RUN /bin/apt install curl

RUN curl -L https://github.com/fullstorydev/grpcurl/releases/download/v1.9.1/grpcurl_1.9.1_linux_arm64.tar.gz | tar xvfz - && \
    mv grpcurl /usr/local/bin/

COPY <<EOF foo.sbom.json
{
    "spdxVersion": "SPDX-2.3",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "grpcurl-v1.2.3"
}
EOF
  • I built in Docker using "docker build -t test ."
  • Then I ran
    syft -vv test --select-catalogers "+sbom-cataloger"

It did not pick up the SBOM. I don't see anything in the verbose logging indicating that it sees the file.

Why is it necessary to add the SBOM to the image and then run a second scan to consolidate the results?

The debug messages:

[0002] DEBUG SBOM cataloger reader is not a ReadSeeker, reading entire SBOM into memory
[0002] DEBUG discovered 0 packages cataloger=sbom-cataloger
[0002] INFO task completed elapsed=2.9425ms task=sbom-cataloger

I also changed the SPDX document to a more complete example, and this made no difference.

(Used this as example: https://github.com/swinslow/spdx-examples/blob/master/example7/spdx/example7-bin.spdx.json)

Hi @nycnewman,

I think the snippet:

COPY <<EOF foo.sbom.json
{
    "spdxVersion": "SPDX-2.3",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "grpcurl-v1.2.3"
}
EOF

does not produce an SPDX SBOM with any packages in it.
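A quick way to confirm this before building the image is to count the entries in the document's top-level `packages` array. The sketch below is only an illustration: `count_spdx_packages` is a hypothetical helper, and it inspects the JSON directly rather than going through syft's own SPDX parser.

```python
import json


def count_spdx_packages(spdx_text: str) -> int:
    """Return the number of entries in the top-level `packages` array."""
    doc = json.loads(spdx_text)
    # A missing or null `packages` key is treated as zero packages.
    packages = doc.get("packages") or []
    return len(packages)


# The heredoc content from the Dockerfile in question.
minimal = """
{
    "spdxVersion": "SPDX-2.3",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "grpcurl-v1.2.3"
}
"""
print(count_spdx_packages(minimal))  # prints 0
```

A count of zero here lines up with the "discovered 0 packages" debug message: the cataloger read the file but had nothing to report.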

In order to get the SBOM cataloger to pick up the package, I did this:

At ./spdx.json:

{
 "spdxVersion": "SPDX-2.3",
 "dataLicense": "CC0-1.0",
 "SPDXID": "SPDXRef-DOCUMENT",
 "name": "localhost/syft3490",
 "documentNamespace": "https://anchore.com/syft/image/localhost/syft3490-ca4ea7de-a4eb-4c91-8227-1335ee64325f",
 "creationInfo": {
  "licenseListVersion": "3.25",
  "creators": [
   "Organization: Anchore, Inc",
   "Tool: syft-1.17.0"
  ],
  "created": "2024-12-04T15:45:17Z"
 },
 "packages": [
  {
   "name": "WILLL",
   "SPDXID": "SPDXRef-Package-deb-adduser-0d50d654eb648ebd",
   "versionInfo": "3.134",
   "supplier": "NOASSERTION",
   "downloadLocation": "NOASSERTION",
   "filesAnalyzed": true,
   "packageVerificationCode": {
    "packageVerificationCodeValue": "ee259e59ebc5bf49005492c1a393d32158491196"
   },
   "sourceInfo": "acquired package info from DPKG DB: /usr/share/doc/adduser/copyright, /var/lib/dpkg/info/adduser.conffiles, /var/lib/dpkg/info/adduser.list, /var/lib/dpkg/info/adduser.md5sums, /var/lib/dpkg/info/adduser.postrm, /var/lib/dpkg/info/adduser.preinst, /var/lib/dpkg/status",
   "licenseConcluded": "NOASSERTION",
   "licenseDeclared": "GPL-2.0-only AND GPL-2.0-or-later",
   "copyrightText": "NOASSERTION",
   "externalRefs": [
    {
     "referenceCategory": "SECURITY",
     "referenceType": "cpe23Type",
     "referenceLocator": "cpe:2.3:a:adduser:adduser:3.134:*:*:*:*:*:*:*"
    },
    {
     "referenceCategory": "PACKAGE-MANAGER",
     "referenceType": "purl",
     "referenceLocator": "pkg:deb/debian/adduser@3.134?arch=all&distro=debian-12"
    }
   ]
  }
 ],
 "files": [],
 "hasExtractedLicensingInfos": [],
 "relationships": []
}

At ./Dockerfile:

FROM nginx:latest

#RUN dpkg --purge libpython3.9-minimal
#RUN dpkg --purge python3.9-minimal

RUN /bin/apt install curl

RUN curl -L https://github.com/fullstorydev/grpcurl/releases/download/v1.9.1/grpcurl_1.9.1_linux_arm64.tar.gz | tar xvfz - && \
  mv grpcurl /usr/local/bin/

COPY ./spdx.json /stuff.spdx.json

This indeed makes a package called WILLL show up in Syft's output.
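If you would rather generate such a file than copy one from another scan, a small script can do it. This is a sketch under assumptions: the field set mirrors the example above, `minimal_spdx` is a hypothetical helper, the `example.com` namespace and the `hand-written-sbom` tool name are placeholders, and the output has not been verified against syft.

```python
import json
import uuid
from datetime import datetime, timezone


def minimal_spdx(name: str, version: str, download_url: str) -> dict:
    """Build a small SPDX 2.3 JSON document describing a single package."""
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": f"{name}-{version}",
        # Placeholder namespace (assumption): any unique URI should do.
        "documentNamespace": f"https://example.com/spdx/{name}-{uuid.uuid4()}",
        "creationInfo": {
            "creators": ["Tool: hand-written-sbom"],  # placeholder tool name
            "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        # The non-empty `packages` array is the part the minimal heredoc lacked.
        "packages": [
            {
                "name": name,
                "SPDXID": f"SPDXRef-Package-{name}",
                "versionInfo": version,
                "downloadLocation": download_url,
                "filesAnalyzed": False,
                "licenseConcluded": "NOASSERTION",
                "licenseDeclared": "NOASSERTION",
                "copyrightText": "NOASSERTION",
            }
        ],
        "relationships": [],
    }


if __name__ == "__main__":
    doc = minimal_spdx(
        "grpcurl",
        "1.9.1",
        "https://github.com/fullstorydev/grpcurl/releases/download/v1.9.1/"
        "grpcurl_1.9.1_linux_arm64.tar.gz",
    )
    print(json.dumps(doc, indent=1))
```

Redirecting the output to ./spdx.json gives a file that can be copied into the image with the same COPY line as in the Dockerfile above.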

As for what changed from the blog post, it looks like buildkit bumped Syft to a version where the SBOM cataloger is off by default, and so a PR is needed there to turn the SBOM cataloger on in their invocation of it.

Dev notes: the next step is to PR buildkit to change the config where Syft is invoked. Adding to "Ready" column.

Thank you! That worked. It would be useful to have this documented somewhere, as I had to read through the code base to figure parts of this out. It wasn't clear what the SBOM format needed to be: does this only support SPDX? (I assume it would support a CycloneDX record as well.)

Added issue to buildkit-plugin-syft for re-enabling SBOM cataloguer feature: docker/buildkit-syft-scanner#111