phylum-dev/vuln-reach

Reachability analysis failed for jest-environment-jsdom@28.1.3 and tough-cookie@4.0.0

Opened this issue · 8 comments

Description

I ran vuln-reach-cli for package jest-environment-jsdom@28.1.3 which depends on tough-cookie@4.0.0. This version has the following vulnerability: salesforce/tough-cookie@12d4747?diff=split

I specified in the config file a few of the code locations that were patched , but vuln-reach-cli reports an error for all locations that I tried.

Reproduction Steps

Create a config

# config.toml
[[projects]]
name = "jest-environment-jsdom:28.1.3"
tarballs = "./tarballs"
packages = [
  { name = "jest-environment-jsdom", version = "28.1.3" },
  { name = "jsdom", version = "19.0.0" },
  { name = "tough-cookie", version = "4.0.0"}
]
vuln = [
  { package = "tough-cookie", module = "lib/memstore.js", start_row = 111, start_column = 32, end_row = 34, end_column = 111 }
]

Compile vuln-reach-cli and call it with the config from the previous section.

./vuln-reach-cli config.toml

Expected Behavior

Reachability analysis does not fail.

Actual Behavior

(Note the line staring with Reachability failed)

    Reachability for jest-environment-jsdom:28.1.3

Package spec not found in project: psl
Package spec not found in project: universalify
Package spec not found in project: util
Package spec not found in project: punycode
Package spec not found in project: url
Package spec not found in project: whatwg-url
Package spec not found in project: stream
Package spec not found in project: acorn-globals
Package spec not found in project: saxes
Package spec not found in project: decimal.js
Package spec not found in project: nwsapi
Package spec not found in project: is-potential-custom-element-name
Package spec not found in project: whatwg-mimetype
Package spec not found in project: w3c-xmlserializer
Package spec not found in project: events
Package spec not found in project: escodegen
Package spec not found in project: path
Package spec not found in project: vm
Package spec not found in project: cssstyle
Package spec not found in project: acorn
Package spec not found in project: whatwg-encoding
Package spec not found in project: html-encoding-sniffer
Package spec not found in project: http
Package spec not found in project: https-proxy-agent
Package spec not found in project: https
Package spec not found in project: ws
Package spec not found in project: webidl-conversions
Reachability failed: Generic("All identifiers should have an access scope: {Node { (111, 32) - (111, 33)} {")
Package spec not found in project: fs
Package spec not found in project: cssom
Package spec not found in project: parse5
Package spec not found in project: child_process
Package spec not found in project: abab
Package spec not found in project: zlib
Package spec not found in project: http-proxy-agent
Package spec not found in project: canvas
Package spec not found in project: data-urls
Package spec not found in project: xml-name-validator
Package spec not found in project: form-data
Package spec not found in project: symbol-tree
Package spec not found in project: w3c-hr-time
Package spec not found in project: os
Package spec not found in project: domexception
Package spec not found in project: @jest/fake-timers
Package spec not found in project: jest-util
Package spec not found in project: jest-mock

  *** No paths to tough-cookie/lib/memstore.js:111:32 found.

Thank you for reporting this!

This uncovers a few issues on our end.

  • The .toml file should contain all transitive dependencies of the root package, i.e. those that would come out of a package.lock, rather than only the direct ones. This was not clear from our documentation, we should clarify that.

    You can use the Phylum CLI to retrieve the transitive dependencies from a lockfile:

    $ npm init -y && npm i --save jest-environment-jsdom@~28.1.3 tough-cookie@~4.0.0
    $ phylum parse package-lock.json
  • Due to how tree-sitter represents nodes, rows and columns are actually zero-indexed, so row 111 column 32 as shown in a text editor is actually represented as row 110 column 31 in vuln-reach. This is jarring; we should move to a 1-indexed format at least for the user-facing bits, as that's the most natural way of conceiving rows and columns.

  • The reported error is misleading. All identifiers should have an access scope: {Node { (111, 32) - (111, 33)} { points to a node that is actually an open bracket {; vuln-reach requires an identifier as a starting node, and the error should explicitly mention that instead.

  • In the configuration above, end_row and end_column are probably swapped:

    start_row = 111, start_column = 32, end_row = 34, end_column = 111
    

    we should validate these cases on our end and emit an error. To be valid, a node should have start_row <= end_row and start_column <= end_column.

@dappelt can you verify that this solved the issue for you?

@andreaphylum @louislang thanks for the quick response. There are still some problems.

I tested the new release with an adjusted configuration

[[projects]]
name = "jest-environment-jsdom:28.1.3"
tarballs = "./tarballs"
packages = [
  { name = "jest-environment-jsdom", version = "28.1.3" },
  { name = "jsdom", version = "19.0.0" },
  { name = "tough-cookie", version = "4.0.0" },
]
vuln = [
  # "vulnerable_package": "tough-cookie:4.0.0",
  # "advisory_url": "https://gitlab.com/gitlab-org/security-products/gemnasium-db/-/blob/master/npm/tough-cookie/CVE-2023-26136.yml"
  { package = "tough-cookie", module = "lib/memstore.js", start_row = 112, start_column = 33, end_row = 112, end_column = 35 },
]

Running vuln-reach with this config now returns the error:

Reachability failed: Generic("Node {Node { (111, 32) - (111, 33)} of kind { is not an identifier")

This seems to be the expected behavior with the changes that @andreaphylum described in his last comment:

The reported error is misleading. All identifiers should have an access scope: {Node { (111, 32) - (111, 33)} { points to a node that is actually an open bracket {; vuln-reach requires an identifier as a starting node, and the error should explicitly mention that instead.

I wonder if this behavior is correct for this vulnerability though.

In javascript the curly bracket has several uses; it indicates the start of a code block OR for object notation. The vulnerability location in the config is referencing the latter. The referenced vulnerable code is: this.idx[cookie.domain] = {};. To fix this prototype pollution vulnerability, the code was changed to this.idx[cookie.domain] = Object.create(null);, hence I defined the vulnerability location to reference {}, i.e. access to the object prototype.

Should this be a valid location? If not, how would you define the location for this example?

Should this be a valid location? If not, how would you define the location for this example?

It would not, because in the access graph that we use, vertices are identifier AST nodes and edges are a "A accesses B" relation between identifier AST nodes only.

For an arbitrary AST node kind to work as entry point, we should be able to define the "accesses" relation in a generalized fashion, which isn't really feasible due to the undecidable nature of runtime semantics. Maybe a workaround could be built for a few specific kind of nodes to be accepted as sources, but it would not be generalizable without significantly ramping up complexity and introducing hard to track edge cases.

It does make things counterintuitive, but I couldn't find a viable alternative unfortunately.


The referenced vulnerable code is: this.idx[cookie.domain] = {};. To fix this prototype pollution vulnerability, the code was changed to this.idx[cookie.domain] = Object.create(null);, hence I defined the vulnerability location to reference {}, i.e. access to the object prototype.

this and idx are both good starting node candidates, because they are in the same spot that is reached when the vulnerability is exposed, i.e. during the assignment expression.

Loosely, any identifier in that same scope would be a good candidate, as any identifier inside of the assignment's scope is marked reachable at the same time as the assignment is reached.

@andreaphylum Thanks for looking into this. I gave it another try with the identifiers that you suggested.

this and idx are both good starting node candidates

Config for identifier idx:

[[projects]]
name = "jest-environment-jsdom:28.1.3"
tarballs = "./tarballs"
packages = [
  { name = "jest-environment-jsdom", version = "28.1.3" },
  { name = "jsdom", version = "19.0.0" },
  { name = "tough-cookie", version = "4.0.0" },
]
vuln = [
  # "vulnerable_package": "tough-cookie:4.0.0",
  # "advisory_url": "https://gitlab.com/gitlab-org/security-products/gemnasium-db/-/blob/master/npm/tough-cookie/CVE-2023-26136.yml"
  { package = "tough-cookie", module = "lib/memstore.js", start_row = 112, start_column = 12, end_row = 112, end_column = 15 },
  # { ... }
]

Executing the config returns Reachability failed:

    Reachability for jest-environment-jsdom:28.1.3
[...]
Reachability failed: Generic("Node {Node property_identifier (111, 11) - (111, 14)} of kind property_identifier is not an identifier")`

I get a similar error for the identifier this. Any idea?

@dappelt my bad! I forgot that property identifiers don't actually belong to a scope and we don't have this semantics (they involve dynamic analysis) so neither one is actually the viable kind of identifier.

A good proxy would be the name of the function surrounding the statement, i.e. the foo node in the following code:

function foo() {
  this.idx[cookie.domain] = Object.create(null);
}

Even if the function definition could be not right next to the vulnerable node, the scope is what matters.

This could help in the future: you can paste JS code in the tree-sitter playground to determine which nodes have kind identifier and are thus viable. this and idx that I incorrectly recommended have, respectively, this and property_identifier kind.

immagine

Usability here could definitely be improved on our end, by having the first step of the algorithm find the closest identifier and uses that to kick off the rest of the process.

@andreaphylum Thanks for the clarifications. Checking for access to the function that contains vulnerability is a good compromise between the accuracy of the static analysis and the ease of defining vulnerability locations. But not all functions are of type identifier. In this example, putCookie is not defined with the function keyword but as a class member:

class MemoryCookieStore extends Store {
  putCookie(cookie, cb) {
    if (!this.idx[cookie.domain]) {
      this.idx[cookie.domain] = {};
    }
  }

This produces the following syntax tree:

image

putCookie is of type property_identifier. The only identifier near the vulnerable code is the function argument cookie. Since cookie is accessed in the same line as the vulnerable code {}, I used that as the vulnerability location which worked.

Yeah, unfortunately class methods aren't statically addressable as they end up as dynamic properties of class objects. cookie is a perfectly fine choice there, as it's going to target the same scope.