Support varying schema versions of SBOM outputs
ryanmoran opened this issue · 3 comments
Problem
Relying upon the SBOM formatting code available in syft is creating inadvertent breaking changes in our API contract. For example, we cannot currently upgrade the version of syft that we are using without also changing the schema version of our CycloneDX SBOM output.
In order to stabilize the contract in the long run, we should introduce functionality in packit/sbom capable of generating SBOM outputs in multiple schema versions of each SBOM type. Doing so will allow us to keep the SBOM output, and our API, stable while still consuming the latest versions of the syft library for scanning and generating the SBOM data.
Proposal
Our current implementation leverages the syft.Encode method to convert a sbom.SBOM into a []byte that can be written to a file.
packit/sbom/formatted_reader.go
Lines 46 to 49 in ed29b63
The packit/sbom package does not currently implement any of the formatting itself. Instead, it leverages the existing formats made available by syft:
packit/sbom/formatted_reader.go
Lines 34 to 44 in ed29b63
Unfortunately, these formats are updated on a near-constant basis, and in a way that breaks backwards-compatibility in the SBOM output format contract.
Luckily, the syft library has recently undergone a refactoring that now allows us to define our own format types that can be used to encode SBOM output. To implement a format, you need to create a concrete type that conforms to the sbom.Format interface:
type Format interface {
	ID() FormatID
	Encode(io.Writer, SBOM) error
	Decode(io.Reader) (*SBOM, error)
	Validate(io.Reader) error
}
We should implement a package internal to packit/sbom that implements at least the following formats (all versions we have ever released support for through the existing sbom package):
- CycloneDX JSON 1.3
- CycloneDX JSON 1.4
- Syft 2.0.0
- Syft 2.0.1
- Syft 2.0.2
- Syft 3.0.0
- Syft 3.0.1
- Syft 3.1.0
- SPDX JSON 2.2
We should feel free to reuse the format implementations in syft for the cases they cover, but beware that these implementations could drift away from the designated schema version at any point.
Choosing a format at build-time
The format will be chosen according to the media types declared in the sbom-formats field of the buildpack.toml. For example, with the following declaration, I should see formatted SBOM outputs using the Syft 3.0.1 schema.
sbom-formats = [ "application/vnd.syft+json;version=3.0.1" ]
The IANA hosts the specifications for each of the following media types:
As can be seen in these specifications, the Syft and CycloneDX formats allow for an optional version parameter. We can parse this extra field to determine the specific schema version to use. SPDX does not currently outline a similar parameter, but there is an open issue requesting that feature: spdx/spdx-spec#642
If the version parameter is omitted, the latest schema version for that SBOM type should be chosen.
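The selection rule could be sketched roughly as follows, using mime.ParseMediaType from the Go standard library to split the media type from its version parameter. This is only an illustration: ResolveSchema and supportedVersions are hypothetical names, the version table simply mirrors the list in this issue, and "latest" is assumed to mean the last entry for each type. The CycloneDX and SPDX media type strings are also assumptions, since only the Syft one appears in this issue.

```go
package main

import (
	"fmt"
	"mime"
)

// supportedVersions is a hypothetical table mirroring the schema versions
// listed in this issue. Versions are ordered oldest to newest, so the last
// entry is treated as "latest". The CycloneDX and SPDX keys are assumed names.
var supportedVersions = map[string][]string{
	"application/vnd.cyclonedx+json": {"1.3", "1.4"},
	"application/vnd.syft+json":      {"2.0.0", "2.0.1", "2.0.2", "3.0.0", "3.0.1", "3.1.0"},
	"application/spdx+json":          {"2.2"},
}

// ResolveSchema takes a declared sbom-formats entry and returns the bare
// media type along with the schema version to encode with. When the version
// parameter is omitted, the latest known version for that type is chosen.
func ResolveSchema(declared string) (string, string, error) {
	mediaType, params, err := mime.ParseMediaType(declared)
	if err != nil {
		return "", "", fmt.Errorf("parsing media type %q: %w", declared, err)
	}

	versions, ok := supportedVersions[mediaType]
	if !ok {
		return "", "", fmt.Errorf("unsupported SBOM media type %q", mediaType)
	}

	requested, ok := params["version"]
	if !ok {
		// No version parameter: fall back to the latest schema version.
		return mediaType, versions[len(versions)-1], nil
	}

	for _, v := range versions {
		if v == requested {
			return mediaType, requested, nil
		}
	}

	return "", "", fmt.Errorf("unsupported version %q for media type %q", requested, mediaType)
}
```

Under these assumptions, a call with "application/vnd.syft+json;version=3.0.1" resolves to Syft 3.0.1, while a bare "application/vnd.cyclonedx+json" resolves to CycloneDX 1.4. An unknown version returns an error rather than silently falling back, which keeps a typo in buildpack.toml from changing the output schema.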
Integration
Currently, the sbom-formats field is limited in what it accepts as valid media types. Specifically, it won't allow us to specify the extra version parameters we wish to use. We can still implement and test this functionality in packit/sbom, but won't be able to use it in real buildpacks until we see a resolution on this issue: buildpacks/lifecycle#828
@ryanmoran Depends on whether we think it's worth adding more of the enumerated schema versions. My understanding was that syft 2.0.2, cyclonedx 1.3 and syft 3.0.1 represented a "good enough for now" set of supported schemas. Do you think we'll need to revisit which ones we support?