Reachability analysis failed for jest-environment-jsdom@28.1.3 and tough-cookie@4.0.0
Opened this issue · 8 comments
Description
I ran vuln-reach-cli
for package jest-environment-jsdom@28.1.3
which depends on tough-cookie@4.0.0
. This version has the following vulnerability: salesforce/tough-cookie@12d4747?diff=split
I specified in the config file a few of the code locations that were patched , but vuln-reach-cli
reports an error for all locations that I tried.
Reproduction Steps
Create a config
# config.toml
[[projects]]
name = "jest-environment-jsdom:28.1.3"
tarballs = "./tarballs"
packages = [
{ name = "jest-environment-jsdom", version = "28.1.3" },
{ name = "jsdom", version = "19.0.0" },
{ name = "tough-cookie", version = "4.0.0"}
]
vuln = [
{ package = "tough-cookie", module = "lib/memstore.js", start_row = 111, start_column = 32, end_row = 34, end_column = 111 }
]
Compile vuln-reach-cli
and call it with the config from the previous section.
./vuln-reach-cli config.toml
Expected Behavior
Reachability analysis does not fail.
Actual Behavior
(Note the line staring with Reachability failed
)
Reachability for jest-environment-jsdom:28.1.3
Package spec not found in project: psl
Package spec not found in project: universalify
Package spec not found in project: util
Package spec not found in project: punycode
Package spec not found in project: url
Package spec not found in project: whatwg-url
Package spec not found in project: stream
Package spec not found in project: acorn-globals
Package spec not found in project: saxes
Package spec not found in project: decimal.js
Package spec not found in project: nwsapi
Package spec not found in project: is-potential-custom-element-name
Package spec not found in project: whatwg-mimetype
Package spec not found in project: w3c-xmlserializer
Package spec not found in project: events
Package spec not found in project: escodegen
Package spec not found in project: path
Package spec not found in project: vm
Package spec not found in project: cssstyle
Package spec not found in project: acorn
Package spec not found in project: whatwg-encoding
Package spec not found in project: html-encoding-sniffer
Package spec not found in project: http
Package spec not found in project: https-proxy-agent
Package spec not found in project: https
Package spec not found in project: ws
Package spec not found in project: webidl-conversions
Reachability failed: Generic("All identifiers should have an access scope: {Node { (111, 32) - (111, 33)} {")
Package spec not found in project: fs
Package spec not found in project: cssom
Package spec not found in project: parse5
Package spec not found in project: child_process
Package spec not found in project: abab
Package spec not found in project: zlib
Package spec not found in project: http-proxy-agent
Package spec not found in project: canvas
Package spec not found in project: data-urls
Package spec not found in project: xml-name-validator
Package spec not found in project: form-data
Package spec not found in project: symbol-tree
Package spec not found in project: w3c-hr-time
Package spec not found in project: os
Package spec not found in project: domexception
Package spec not found in project: @jest/fake-timers
Package spec not found in project: jest-util
Package spec not found in project: jest-mock
*** No paths to tough-cookie/lib/memstore.js:111:32 found.
Thank you for reporting this!
This uncovers a few issues on our end.
-
The
.toml
file should contain all transitive dependencies of the root package, i.e. those that would come out of apackage.lock
, rather than only the direct ones. This was not clear from our documentation, we should clarify that.You can use the Phylum CLI to retrieve the transitive dependencies from a lockfile:
$ npm init -y && npm i --save jest-environment-jsdom@~28.1.3 tough-cookie@~4.0.0 $ phylum parse package-lock.json
-
Due to how
tree-sitter
represents nodes, rows and columns are actually zero-indexed, so row 111 column 32 as shown in a text editor is actually represented as row 110 column 31 invuln-reach
. This is jarring; we should move to a 1-indexed format at least for the user-facing bits, as that's the most natural way of conceiving rows and columns. -
The reported error is misleading.
All identifiers should have an access scope: {Node { (111, 32) - (111, 33)} {
points to a node that is actually an open bracket{
;vuln-reach
requires an identifier as a starting node, and the error should explicitly mention that instead. -
In the configuration above,
end_row
andend_column
are probably swapped:start_row = 111, start_column = 32, end_row = 34, end_column = 111
we should validate these cases on our end and emit an error. To be valid, a node should have
start_row <= end_row
andstart_column <= end_column
.
@andreaphylum @louislang thanks for the quick response. There are still some problems.
I tested the new release with an adjusted configuration
[[projects]]
name = "jest-environment-jsdom:28.1.3"
tarballs = "./tarballs"
packages = [
{ name = "jest-environment-jsdom", version = "28.1.3" },
{ name = "jsdom", version = "19.0.0" },
{ name = "tough-cookie", version = "4.0.0" },
]
vuln = [
# "vulnerable_package": "tough-cookie:4.0.0",
# "advisory_url": "https://gitlab.com/gitlab-org/security-products/gemnasium-db/-/blob/master/npm/tough-cookie/CVE-2023-26136.yml"
{ package = "tough-cookie", module = "lib/memstore.js", start_row = 112, start_column = 33, end_row = 112, end_column = 35 },
]
Running vuln-reach with this config now returns the error:
Reachability failed: Generic("Node {Node { (111, 32) - (111, 33)} of kind { is not an identifier")
This seems to be the expected behavior with the changes that @andreaphylum described in his last comment:
The reported error is misleading. All identifiers should have an access scope: {Node { (111, 32) - (111, 33)} { points to a node that is actually an open bracket {; vuln-reach requires an identifier as a starting node, and the error should explicitly mention that instead.
I wonder if this behavior is correct for this vulnerability though.
In javascript the curly bracket has several uses; it indicates the start of a code block OR for object notation. The vulnerability location in the config is referencing the latter. The referenced vulnerable code is: this.idx[cookie.domain] = {};
. To fix this prototype pollution vulnerability, the code was changed to this.idx[cookie.domain] = Object.create(null);
, hence I defined the vulnerability location to reference {}
, i.e. access to the object prototype.
Should this be a valid location? If not, how would you define the location for this example?
Should this be a valid location? If not, how would you define the location for this example?
It would not, because in the access graph that we use, vertices are identifier AST nodes and edges are a "A accesses B" relation between identifier AST nodes only.
For an arbitrary AST node kind to work as entry point, we should be able to define the "accesses" relation in a generalized fashion, which isn't really feasible due to the undecidable nature of runtime semantics. Maybe a workaround could be built for a few specific kind of nodes to be accepted as sources, but it would not be generalizable without significantly ramping up complexity and introducing hard to track edge cases.
It does make things counterintuitive, but I couldn't find a viable alternative unfortunately.
The referenced vulnerable code is:
this.idx[cookie.domain] = {};
. To fix this prototype pollution vulnerability, the code was changed tothis.idx[cookie.domain] = Object.create(null);
, hence I defined the vulnerability location to reference{}
, i.e. access to the object prototype.
this
and idx
are both good starting node candidates, because they are in the same spot that is reached when the vulnerability is exposed, i.e. during the assignment expression.
Loosely, any identifier in that same scope would be a good candidate, as any identifier inside of the assignment's scope is marked reachable at the same time as the assignment is reached.
@andreaphylum Thanks for looking into this. I gave it another try with the identifiers that you suggested.
this and idx are both good starting node candidates
Config for identifier idx
:
[[projects]]
name = "jest-environment-jsdom:28.1.3"
tarballs = "./tarballs"
packages = [
{ name = "jest-environment-jsdom", version = "28.1.3" },
{ name = "jsdom", version = "19.0.0" },
{ name = "tough-cookie", version = "4.0.0" },
]
vuln = [
# "vulnerable_package": "tough-cookie:4.0.0",
# "advisory_url": "https://gitlab.com/gitlab-org/security-products/gemnasium-db/-/blob/master/npm/tough-cookie/CVE-2023-26136.yml"
{ package = "tough-cookie", module = "lib/memstore.js", start_row = 112, start_column = 12, end_row = 112, end_column = 15 },
# { ... }
]
Executing the config returns Reachability failed
:
Reachability for jest-environment-jsdom:28.1.3
[...]
Reachability failed: Generic("Node {Node property_identifier (111, 11) - (111, 14)} of kind property_identifier is not an identifier")`
I get a similar error for the identifier this
. Any idea?
@dappelt my bad! I forgot that property identifiers don't actually belong to a scope and we don't have this
semantics (they involve dynamic analysis) so neither one is actually the viable kind of identifier.
A good proxy would be the name of the function surrounding the statement, i.e. the foo
node in the following code:
function foo() {
this.idx[cookie.domain] = Object.create(null);
}
Even if the function definition could be not right next to the vulnerable node, the scope is what matters.
This could help in the future: you can paste JS code in the tree-sitter playground to determine which nodes have kind identifier
and are thus viable. this
and idx
that I incorrectly recommended have, respectively, this
and property_identifier
kind.
![immagine](https://private-user-images.githubusercontent.com/68552656/286372354-c1730c93-3f00-40c0-9608-14e108dee8b7.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTk2NDEwNzEsIm5iZiI6MTcxOTY0MDc3MSwicGF0aCI6Ii82ODU1MjY1Ni8yODYzNzIzNTQtYzE3MzBjOTMtM2YwMC00MGMwLTk2MDgtMTRlMTA4ZGVlOGI3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjI5VDA1NTkzMVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTE5YTNmMjI1N2U1ZWVjYTU3MGY3MDFmZGVkYzQyZGM5ZDQyYWMwYzc0MWY0NTcyODBlMmMwMTkzZTEyZDM0NmEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.YjA2w2Ydn0mpObjkPNw5cfLBXS52OtKBmtcWQgSAi2w)
Usability here could definitely be improved on our end, by having the first step of the algorithm find the closest identifier and uses that to kick off the rest of the process.
@andreaphylum Thanks for the clarifications. Checking for access to the function that contains vulnerability is a good compromise between the accuracy of the static analysis and the ease of defining vulnerability locations. But not all functions are of type identifier
. In this example, putCookie
is not defined with the function
keyword but as a class member:
class MemoryCookieStore extends Store {
putCookie(cookie, cb) {
if (!this.idx[cookie.domain]) {
this.idx[cookie.domain] = {};
}
}
This produces the following syntax tree:
![image](https://private-user-images.githubusercontent.com/23212287/286651552-f92b6d55-e476-45c8-9fbe-b86011811b06.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTk2NDEwNzEsIm5iZiI6MTcxOTY0MDc3MSwicGF0aCI6Ii8yMzIxMjI4Ny8yODY2NTE1NTItZjkyYjZkNTUtZTQ3Ni00NWM4LTlmYmUtYjg2MDExODExYjA2LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjI5VDA1NTkzMVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTEwZTZiNWFlYWM2NzM3NzhkYjYwZDE0MDg3ODZlNWI1YzlhNTFiZTQ2YzZhYmU2ZTI3MGE5NGExMDQ0MzY3YTgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.oiLrH2ksDIMnR_tFZ5iDZCqnTdtRvhjFw44vSA25cMk)
putCookie
is of type property_identifier
. The only identifier
near the vulnerable code is the function argument cookie
. Since cookie
is accessed in the same line as the vulnerable code {}
, I used that as the vulnerability location which worked.
Yeah, unfortunately class methods aren't statically addressable as they end up as dynamic properties of class objects. cookie
is a perfectly fine choice there, as it's going to target the same scope.