anchore/syft

SBOM cataloger silently discards CycloneDX components of types other than library/application


What happened:

Given a simple CycloneDX SBOM:

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "components": [
    {
      "type": "library",
      "name": "somelib"
    },
    {
      "type": "device-driver",
      "name": "somedriver"
    },
    {
      "type": "platform",
      "name": "someplatform"
    },
    {
      "type": "application",
      "name": "someapp"
    }
  ]
}

And running it through SBOM cataloger:

syft scan file:./test.cdx.json --output=json --select-catalogers "+sbom-cataloger" 

What you expected to happen:

I expected an SBOM containing all four components. However, the resulting SBOM contains only "someapp" and "somelib"; the other two ("someplatform" and "somedriver") are nowhere to be seen. It looks like this happens to every CycloneDX component type other than "application" and "library".

Steps to reproduce the issue:

Anything else we need to know?:

Environment:

  • Output of syft version:
Application: syft
Version:    1.16.0
BuildDate:  2024-11-04T22:29:33Z
GitCommit:  8a41d772509d37267a65e0b425808e883e4b9dce
GitDescription: v1.16.0
Platform:   darwin/arm64
GoVersion:  go1.22.8
Compiler:   gc

  • OS (e.g: cat /etc/os-release or similar): MacOS 14.7.1

Thanks for the issue @pasieronen! I added needs-discussion here since syft currently does not surface device-drivers or platform components.

Here is the code where we drop the components on decode:

func collectPackages(component *cyclonedx.Component, s *sbom.SBOM, idMap map[string]interface{}) {
	switch component.Type {
	case cyclonedx.ComponentTypeOS:
	case cyclonedx.ComponentTypeContainer:
	case cyclonedx.ComponentTypeApplication, cyclonedx.ComponentTypeFramework, cyclonedx.ComponentTypeLibrary:
		p := decodeComponent(component)
		idMap[component.BOMRef] = p
		syftID := extractSyftPacakgeID(component.BOMRef)
		if syftID != "" {
			idMap[syftID] = p
		}
		// TODO there must be a better way than needing to call this manually:
		p.SetID()
		s.Artifacts.Packages.Add(*p)
	}
	if component.Components != nil {
		for i := range *component.Components {
			collectPackages(&(*component.Components)[i], s, idMap)
		}
	}
}
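
To make the effect of that switch concrete, here is a minimal standalone sketch that applies the same type filter to the SBOM from the report. It does not use syft or cyclonedx-go; the `component` and `bom` structs and the `isCataloged`, `partition`, and `partitionJSON` helpers are hypothetical stand-ins written only for illustration, and the type strings are the CycloneDX values for application, framework, and library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// component is a hypothetical, minimal stand-in for a CycloneDX component.
// It is not the cyclonedx-go type that syft decodes into.
type component struct {
	Type       string      `json:"type"`
	Name       string      `json:"name"`
	Components []component `json:"components"`
}

// bom is a hypothetical stand-in holding only the top-level components.
type bom struct {
	Components []component `json:"components"`
}

// isCataloged mirrors the switch in collectPackages: only these three
// component types reach decodeComponent and become packages.
func isCataloged(componentType string) bool {
	switch componentType {
	case "application", "framework", "library":
		return true
	}
	return false
}

// partition walks components recursively, as collectPackages does, and
// splits their names into those kept and those dropped.
func partition(components []component) (kept, dropped []string) {
	for _, c := range components {
		if isCataloged(c.Type) {
			kept = append(kept, c.Name)
		} else {
			dropped = append(dropped, c.Name)
		}
		k, d := partition(c.Components)
		kept = append(kept, k...)
		dropped = append(dropped, d...)
	}
	return kept, dropped
}

// partitionJSON decodes a CycloneDX JSON document and partitions it.
func partitionJSON(data []byte) (kept, dropped []string, err error) {
	var b bom
	if err := json.Unmarshal(data, &b); err != nil {
		return nil, nil, err
	}
	kept, dropped = partition(b.Components)
	return kept, dropped, nil
}

// exampleBOM is the SBOM from the issue report.
const exampleBOM = `{
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "components": [
    {"type": "library", "name": "somelib"},
    {"type": "device-driver", "name": "somedriver"},
    {"type": "platform", "name": "someplatform"},
    {"type": "application", "name": "someapp"}
  ]
}`

func main() {
	kept, dropped, err := partitionJSON([]byte(exampleBOM))
	if err != nil {
		panic(err)
	}
	fmt.Println("kept:", kept)
	fmt.Println("dropped:", dropped)
}
```

For the SBOM in the report this puts "somelib" and "someapp" in the kept list and "somedriver" and "someplatform" in the dropped list, which matches the observed output.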

I think we'll take a look this week at whether we should allow these components to pass through, rather than being dropped, when using the SBOM cataloger or the convert functionality.

The core question is: "Can/should the syft json be able to represent these packages?"

syft convert (a different path than described here) also hints at being able to have the native spdx.* and cyclonedx.* lib core objects directly on sbom.SBOM. This way, say via the cataloger path, we could persist those "native" objects into something that the formatters can make direct use of.

I think this is one of many options here (hence the needs-investigation label).

I'm going to spike on this and see if we can design a way to adapt packages coming into the syft SBOM that we don't know how to catalog.

There are a couple of ways we discussed on the live-stream today, but they all revolve around adding some kind of sidecar or metadata, specific to the format models, onto the core syft package coming from the SBOM cataloger, so that we don't drop information on the floor when the SBOM cataloger is used.
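
As a rough illustration of the sidecar idea, and not a proposal for the actual syft data model, the sketch below attaches the original format-native object to a simplified package so that the component type survives a round trip. Every name here (`Component`, `Package`, `FormatSidecar`, `adapt`, `originalType`) is hypothetical and does not exist in syft.

```go
package main

import "fmt"

// Component is a hypothetical stand-in for a format-native CycloneDX
// component; it is not the cyclonedx-go type.
type Component struct {
	Type string
	Name string
}

// Package is a hypothetical, heavily simplified stand-in for the core
// syft package.
type Package struct {
	Name string
	// FormatSidecar holds the source-format object the package was decoded
	// from, so details the core model cannot express are not lost.
	FormatSidecar any
}

// adapt converts any component into a Package, including types the core
// model does not know how to catalog, and keeps the original as a sidecar.
func adapt(c Component) Package {
	return Package{Name: c.Name, FormatSidecar: c}
}

// originalType recovers the CycloneDX component type from the sidecar.
// The boolean is false when the package did not come from CycloneDX.
func originalType(p Package) (string, bool) {
	c, ok := p.FormatSidecar.(Component)
	if !ok {
		return "", false
	}
	return c.Type, true
}

func main() {
	p := adapt(Component{Type: "device-driver", Name: "somedriver"})
	t, ok := originalType(p)
	fmt.Println(p.Name, t, ok)
}
```

A formatter writing CycloneDX could then consult the sidecar to emit the original type, while formatters for other formats could ignore it. Whether the sidecar belongs on the package or on sbom.SBOM is exactly the open design question in this thread.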