"bom-ref" field shows escaped characters for special chars
We are running syft to generate SBOM files with the cyclonedx-json@1.5 output option.
We are seeing escaped characters in some fields of the SBOM structure, such as "bom-ref".
When the SBOM JSON is created, the components and their bom-ref values show escaped characters.
Is this the expected behavior? I searched for open issues, updated my syft version, checked the charset, etc.
I was hoping to see the correct module/package name.
This is how it looks when the SBOM is created. The bom-ref, cpe and purl keys are escaped, each following a different "pattern", while the name key shows the package name properly.
"components": [
{
"bom-ref": "pkg:npm/%40aashutoshrathi/word-wrap@1.2.6?package-id=cf52c618c3862994",
"type": "library",
"name": "@aashutoshrathi/word-wrap",
"version": "1.2.6",
"cpe": "cpe:2.3:a:\\@aashutoshrathi\\/word-wrap:\\@aashutoshrathi\\/word-wrap:1.2.6:*:*:*:*:*:*:*",
"purl": "pkg:npm/%40aashutoshrathi/word-wrap@1.2.6",
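The doubled backslashes in the cpe value are worth separating from the other escapes. JSON itself writes a single backslash as `\\`, and the CPE 2.3 formatted string uses a backslash to quote special characters such as `@` and `/`. The sketch below parses a cut-down copy of the component above and undoes both layers. The helper names `split_cpe` and `unquote_cpe_value` are hypothetical, and the colon split is deliberately naive (it does not handle an escaped backslash directly before a colon).

```python
import json
import re

# Cut-down copy of the component from the SBOM above (raw string so the
# backslashes reach the JSON parser exactly as they appear in the file).
raw = (
    r'{"name": "@aashutoshrathi/word-wrap", '
    r'"cpe": "cpe:2.3:a:\\@aashutoshrathi\\/word-wrap:'
    r'\\@aashutoshrathi\\/word-wrap:1.2.6:*:*:*:*:*:*:*"}'
)

component = json.loads(raw)
cpe = component["cpe"]  # JSON parsing turns each "\\" into one backslash


def split_cpe(value: str) -> list[str]:
    """Split a CPE 2.3 formatted string on colons not preceded by a backslash."""
    return re.split(r"(?<!\\):", value)


def unquote_cpe_value(value: str) -> str:
    """Drop the CPE quoting backslash in front of each special character."""
    return re.sub(r"\\(.)", r"\1", value)


fields = split_cpe(cpe)
vendor = unquote_cpe_value(fields[3])
product = unquote_cpe_value(fields[4])

print(cpe)
print(vendor, product)
```

Under these assumptions the unquoted vendor and product fields match the `name` key, which suggests the cpe field carries the same package name in a different encoding.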
Steps to reproduce the issue:
Consider this package-lock.json and note the module name in this section:
"node_modules/@aashutoshrathi/word-wrap": {
"version": "1.2.6",
"resolved": "https://<REMOVED>/@aashutoshrathi/word-wrap/-/word-wrap-1.2.6.tgz",
"integrity": "sha512-1Yjs2SvM8TflER/OD3cOjhWWOZb58A2t7wpE2S9XfBYTiIl+XFhQG2bjy4Pu1I+EAlCNUzRDYDdFwFYUKvXcIA==",
"dev": true,
"engines": {
"node": ">=0.10.0"
}
},
Command:
cd <repository_folder>
./syft . -o cyclonedx-json@1.5=testsbom2.json
Output:
{
"$schema": "http://cyclonedx.org/schema/bom-1.5.schema.json",
"bomFormat": "CycloneDX",
"specVersion": "1.5",
"serialNumber": "urn:uuid:4894d768-3c79-4d7d-b06d-f8dfdc615a4b",
"version": 1,
"metadata": {
"timestamp": "2024-10-24T20:04:40Z",
"tools": {
"components": [
{
"type": "application",
"author": "anchore",
"name": "syft",
"version": "1.14.2"
}
]
},
"component": {
"bom-ref": "5054cfdceff1cb35",
"type": "file",
"name": "<repository_folder>"
}
},
"components": [
{
"bom-ref": "pkg:npm/%40aashutoshrathi/word-wrap@1.2.6?package-id=cf52c618c3862994",
"type": "library",
"name": "@aashutoshrathi/word-wrap",
"version": "1.2.6",
"cpe": "cpe:2.3:a:\\@aashutoshrathi\\/word-wrap:\\@aashutoshrathi\\/word-wrap:1.2.6:*:*:*:*:*:*:*",
"purl": "pkg:npm/%40aashutoshrathi/word-wrap@1.2.6",
"properties": [
Environment:
syft --version
syft 1.14.2
cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
Hi @dszortyka, it looks like the character @ is being replaced by %40 in the PURL, e.g. a package with the name @aashutoshrathi/word-wrap results in the PURL pkg:npm/%40aashutoshrathi/word-wrap@1.2.6, and a similar encoding is happening in the bom-ref, which uses the PURL. The PackageURL spec is pretty clear about @ being a special character that requires encoding as %40 when used in locations other than the version separator, explicitly stating: "the '@' version separator must be encoded as %40 elsewhere". I believe Syft is doing the correct thing here, but do you think there is something else Syft should do?
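For consumers who want the readable package name back, the encoding described above can be reversed with ordinary percent-decoding. The following is a minimal sketch using only Python's standard `urllib.parse`. The functions `npm_purl` and `npm_name_from_purl` are hypothetical helpers written for this illustration, not part of Syft or of any PURL library, and they assume an npm PURL that always carries a version and no subpath.

```python
from urllib.parse import quote, unquote

PREFIX = "pkg:npm/"


def npm_purl(name: str, version: str) -> str:
    """Build an npm PURL, percent-encoding each name segment separately."""
    namespace, _, package = name.rpartition("/")
    segments = [quote(part, safe="") for part in (namespace, package) if part]
    return PREFIX + "/".join(segments) + "@" + quote(version, safe="")


def npm_name_from_purl(purl: str) -> tuple[str, str]:
    """Recover (name, version) from an npm PURL or a PURL-based bom-ref."""
    body = purl.split("?", 1)[0]      # drop qualifiers such as package-id
    if not body.startswith(PREFIX):
        raise ValueError("not an npm PURL: " + purl)
    body = body[len(PREFIX):]
    path, _, version = body.rpartition("@")  # last '@' is the version separator
    return unquote(path), unquote(version)


purl = npm_purl("@aashutoshrathi/word-wrap", "1.2.6")
bom_ref = purl + "?package-id=cf52c618c3862994"

print(purl)
print(npm_name_from_purl(bom_ref))
```

Because the scope's `@` is encoded as `%40`, the only literal `@` left in the string is the version separator, which is what makes the split on the last `@` unambiguous in this sketch.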