Idea: check for overwritten values instead of using a blocklist
Opened this issue · 1 comments
The functions currently listed in dangerous_functions
are probably the most insidious. But in principle any (especially commonly used) predefined functions have the same issue, e.g. <-
, [
, etc. ls()
is also a good candidate: personally I basically never use load()
but when I do I immediately follow it by running ls()
, even though I know that the verbose = TRUE
flag would give me the same (it’s just a compulsion I guess).
An alternative approach to consider would be to check whether any loaded name shadows or overwrites an existing name (rather than hard-code the list of known-dangerous functions). I played around with this idea in a toy ‘safeload’ package. The basic logic is as follows:
new_names = load(args[1], envir = quarantine)
sentinel = new.env() # (1)
hook_funs = list2env(list(.Last = {}, .Last.sys = {}, parent = .GlobalEnv) # (2)
global_values = mget_uneval(new_names, hook_funs, ifnotfound = list(sentinel), inherits = TRUE) # (3)
ok = vapply(global_values, identical, logical(1L), sentinel)
if (any(! ok)) {…}
Notes:
new.env()
gives us a value that should be (!) non-identical to any existing value.- Adding this environment to the search ensures that
.Last()
and.Sys.last()
are found, even if the user hasn’t currently defined them. mget_uneval()
is likemget()
but without evaluating active bindings.- Another difference / potential limitation is the fact that this only checks for names predefined in standard packages, not for whatever else the user might have attached or defined in
.GlobalEnv
, since it’s run from a standalone, containerised script. A savvy attacker might make an educated guess about other things the user has already defined in their environment, or about packages they are likely to use. For instance, hijacking the namesselect()
,ggplot()
etc. has good chances of success.
This is a much better approach than my hastily crafted one.