Consider handling duplicate main fields in package.json
Opened this issue · 1 comments
Description
Processing a package that has multiple main
fields in package.json causes an error. As far as I know, packages are intended to have only one main
field, but this is not enforced by the package manager or registry. There are some packages with multiple main fields which vuln-reach does not process.
For example, browserify-zlib@0.1.4
has this package.json:
{
"name": "browserify-zlib",
"version": "0.1.4",
"description": "Full zlib module for browserify",
"keywords": ["zlib", "browserify"],
"main": "index.js", # <---- 1st main
"directories": {
"test": "test"
},
"dependencies": {
"pako": "~0.2.0"
},
"devDependencies": {
"tape": "^2.12.3",
"brfs": "^1.0.1"
},
"testling": {
"files": "test/*.js",
"browsers": [
"ie/6..latest",
"chrome/22..latest",
"firefox/16..latest",
"safari/latest",
"opera/11.0..latest",
"iphone/6",
"ipad/6",
"android-browser/latest"
]
},
"scripts": {
"test": "node_modules/tape/bin/tape test/*.js"
},
"main": "src/index.js", # <---- 2nd main
"author": "Devon Govett <devongovett@gmail.com>",
"license": "MIT",
"repository": {
"type": "git",
"url": "git://github.com/devongovett/browserify-zlib.git"
}
}
JS runtimes can import browserify-zlib@0.1.4
just fine. This particular case is easy to resolve, because the file referenced in the first main field does not exist. Maybe the CommonJS / esm spec define how to handle multiple main fields in a more general way. It might be worthwhile to implement the same resolution mechanism that JS runtimes use.
Expected Behavior
vuln-reach performs reachability analysis on projects that have browserify-zlib
has a dependency
Actual Behavior
Config: https://gitlab.com/-/snippets/3623569
$ vuln-reach-cli ./config.toml
Reachability for @gitlab/ui:70.0.0
Error: "./tarballs/browserify-zlib-0.1.4.tgz": Generic("package.json deserialization error: Error(\"duplicate field `main`\", line: 33, column: 8)")
Hi, thank you for reporting this!
The JSON standard explicitly refrains from providing guidance in this case:
The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs. These are all semantic considerations that may be defined by JSON processors or in specifications defining specific uses of JSON for data interchange.
It appears the implementation of serde_json
chose to error out, which I believe is reasonable: most data structures that could possibly deserialize from a JSON dictionary have unique keys, and no commonly accepted strategy for exceptions is defined, so I think the assumption that non-unique keys on the serialized side are an error makes sense.
Besides, this instance of key duplication was an error that npm
silently accepted at the moment the package was published.
That being said, I assume that those JSONs are parsed via JSON.parse
which seemingly keeps the last key. I can look into its spec to validate that, and then look into whether serde_json
allows workarounds for this case.