Operation Mango [Paper PDF]
Common vulnerability discovery techniques all follow a top-down scheme: They start from the entry point of the target program, reach as deep as possible, and examine all encountered program states against a set of security violations. These vulnerability discovery techniques are all limited by the complexity of programs, and are all prone to the path/state explosion problem.
Alternatively, we can start from the location where vulnerabilities might occur (vulnerability sinks), trace back, and verify if the corresponding data flow may lead to a vulnerability. On top of this, we need “assumed execution”, which means when we are tracing back from a vulnerability sink to its sources, we do not faithfully execute or analyze every function on the path, instead we assume a data flow based on prior knowledge or some static analysis in advance and skip as many functions as possible during back tracing.
Checkout our experiment reproduction section to reproduce all the figures found in the paper.
There are several ways to run operation mango if you so choose.
Bypass all this non-sense and just use the container.
Tip
Don't forget to add volumes with -v for both the binary and result directory
docker run -it cl4sm/operation-mango:latest
I highly recommend you create a separate python virtualenv for this method.
source venv/bin/activate
git clone https://github.com/sefcom/operation-mango-public.git
cd operation-mango-public
pip install .
To build the docker container locally:
cd operation-mango-public
docker build -f docker/Dockerfile . -t mango-user
Once you install Operation Mango or use the docker container, you'll have access to two commands: mango
and env_resolve
.
mango
is your default command for running our basic taint analysis on binaries.
Tip
Using the --concise
will significantly shrink the output size and speed up analysis.
It will not print the entire analysis taint trace in your results, but normally you won't need that.
mango /path/to/bin --results your_res_dir
will run the basic command injection taint analysis, checkout the --help
flag for more options.
The output of this tool is fairly verbose, you'll be given the following:
Tip
Any values labeled as TOP are of unknown or unresolvable values
{category}_mango.out
- The entire stdout/stderr of the mango run.
{category}_results.json
- The json is as follows:
{
"closures": [
{
"trace": {} // Function trace starting from input down to sink
"sink": {} // Sink location
"depth": int
"inputs": {
"likely": [] //Functions that flow directly into the sink
"possibly": [] //Functions seen along the way that generally are used as inputs
}
"rank": int // How confident are we this function is a TruPoC
},
],
"cfg_time": float // time it took to generate the cfg in seconds
"vra_time": float // time it took to run the variable recovery in seconds
"mango_time": float // time it took for actual mango analysis in seconds
"path": str // path to analyzed binary
"name": str // binary name
"sha256": str // sha256 of the file
"error": str|None // If an error occured print it here
... // Other timing info
}
{category}_closures/
- The folder containing the results of individual flows to the sink, all of these are unresolvable by our tool.
{category}_closures/0.{rank}_{entry_func@addr}_{sink_func@addr}
- The individual closures printed with extra information about likely and possible input sources.
e.g. 0.70_main_0x403e70_system_0x40143c
:
|||system(
||| a0: <BV32 0x4455f8> -> "<BV32 TOP>",
||| ) @ 0x40143c -> <BV32 0x0>
INPUT SOURCES:
Likely:
NONE
Possibly:
----------
KEY: "accept(fd: 3)@0x403fb0_274_3"
Keywords: None
Binary Source - UNKNOWN
socket(AF_INET, SOCK_DGRAM, 0)_273_32
accept(fd: 3)@0x403fb0_274_32
recv(accept(fd: 3)@0x403fb0_274_32)@0x404088
----------
RANK: 0.700
execv.json
- This is mostly unused but should contain info about which other processes this binary tries to execute.
env_resolve
performs a taint analysis of a given binary to find all uses of env
and nvram
variables.
This is what enables our cross-binary bug finding.
env_resolve /path/to/bin --results your_res_dir
The output of this tool will be found at your_res_dir/env.json
.
To feed this info into mango
merge all of the env.json files together (even if there is only one) with
env_resolve /path/to/bin --results your_res_dir/env.json --merge
This will spit out the file your_res_dir/env.json
.
Then feed it into mango
.
mango /path/to/bin --env-dict your_res_dir/env.json --results your_res_dir
The env.json
output from env_resolve
follows the results.json
that mango
outputs e.g.
{
"results": {
"func_name": // i.e. nvram_get
{
"key_name": //key used to retrieve the value i.e. "http_passwd
{
"keywords": str // Any frontend keywords used to retrieve this value
"1": // position of the argument starting from "1" (i know...)
{
"arg_value": [ // arg value, in the case of getter funcs it's always the key name.
"0xaddr", // addr where the value is used
]
}
}
}
},
"cfg_time": float // time it took to generate the cfg in seconds
"vra_time": float // time it took to run the variable recovery in seconds
"analysis_time": float // time it took for actual mango analysis in seconds
"path": str // path to analyzed binary
"name": str // binary name
"sha256": str // sha256 of the file
"error": str|None // If an error occured print it here
... // Other timing info
}
If you're trying to find bugs in some firmware samples as described in our paper, then have a look at the mango_pipeline
Here
.
For further examples of how to use this checkout the Experiment Replication section.
# run all the tests for the developed features (isolated in the `package` module)
pip install pytest-cov
pytest
To ease testing, we crafted small binaries highlighting one (or several) case(s) we wanted to be able to handle properly.
It was particularly helpful to drive the development of the Handlers
.
They are located under the package/tests/binaries/
folder.
Binary | Description |
---|---|
after_values/program |
Contains multiple calls to a sink in a single function. |
layered/program |
Nested calls running more than the default 7-depth limit before reaching the sink. |
looper/program |
Runs a loop before reaching a sink. |
off_shoot/program |
Calls multiple functions that alter the input in sub functions before reaching the sink. |
recursive/program |
Contains direct and in-direct recursive calls (Highlights flaw of unresolvable call-depth). |
nested/program |
Nested calls and returns before reaching a sink. |
simple/program |
Contains call to external function puts . Run through nested functions, leading to different sinks (execve , system ). |
sprintf_resolved_and_unresolved/program |
Contains two calls to system : one with constant data, the other one that could be influenced by the program user. |
To ensure reproducibility of testing, the binaries have been added to the repository.
Although, if looking to add a new one, a Makefile
has been written for convenience.
# build some homemade light binaries
cd binaries/ && make && cd -