phylum-dev/vuln-reach

Consider handling duplicate main fields in package.json

Opened this issue · 1 comments

Description

Processing a package that has multiple main fields in package.json causes an error. As far as I know, packages are intended to have only one main field, but this is not enforced by the package manager or registry. There are some packages with multiple main fields which vuln-reach does not process.

For example, browserify-zlib@0.1.4 has this package.json:

{
  "name": "browserify-zlib",
  "version": "0.1.4",
  "description": "Full zlib module for browserify",
  "keywords": ["zlib", "browserify"],
  "main": "index.js",                          # <---- 1st main
  "directories": {
    "test": "test"
  },
  "dependencies": {
    "pako": "~0.2.0"
  },
  "devDependencies": {
    "tape": "^2.12.3",
    "brfs": "^1.0.1"
  },
  "testling": {
    "files": "test/*.js",
    "browsers": [
      "ie/6..latest",
      "chrome/22..latest",
      "firefox/16..latest",
      "safari/latest",
      "opera/11.0..latest",
      "iphone/6",
      "ipad/6",
      "android-browser/latest"
    ]
  },
  "scripts": {
    "test": "node_modules/tape/bin/tape test/*.js"
  },
  "main": "src/index.js",                     # <---- 2nd main
  "author": "Devon Govett <devongovett@gmail.com>",
  "license": "MIT",
  "repository": {
    "type": "git",
    "url": "git://github.com/devongovett/browserify-zlib.git"
  }
}

JS runtimes can import browserify-zlib@0.1.4 just fine. This particular case is easy to resolve, because the file referenced in the first main field does not exist. Maybe the CommonJS / esm spec define how to handle multiple main fields in a more general way. It might be worthwhile to implement the same resolution mechanism that JS runtimes use.

Expected Behavior

vuln-reach performs reachability analysis on projects that have browserify-zlib has a dependency

Actual Behavior

Config: https://gitlab.com/-/snippets/3623569

$ vuln-reach-cli ./config.toml

    Reachability for @gitlab/ui:70.0.0

Error: "./tarballs/browserify-zlib-0.1.4.tgz": Generic("package.json deserialization error: Error(\"duplicate field `main`\", line: 33, column: 8)")

Hi, thank you for reporting this!

The JSON standard explicitly refrains from providing guidance in this case:

The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs. These are all semantic considerations that may be defined by JSON processors or in specifications defining specific uses of JSON for data interchange.

It appears the implementation of serde_json chose to error out, which I believe is reasonable: most data structures that could possibly deserialize from a JSON dictionary have unique keys, and no commonly accepted strategy for exceptions is defined, so I think the assumption that non-unique keys on the serialized side are an error makes sense.

Besides, this instance of key duplication was an error that npm silently accepted at the moment the package was published.

That being said, I assume that those JSONs are parsed via JSON.parse which seemingly keeps the last key. I can look into its spec to validate that, and then look into whether serde_json allows workarounds for this case.