- Installation guide with OpenSSL example
- Adding support for a new implementation/protocol
- Case Studies
WolfSSL experiments:
results/wolf
Experiment to demonstrate impact of running StateInspector without tainting:
results/no_taint
^ This experiment was performed with a patched version of Statelearner, the patch is located at: experiments/results/no_taint/MemoryEquivalenceOracle.java.diff
.
This patch comments out the logic in the learning algorithm which decides to use taint analysis. If code with this patch is run, if 2 states differ in state memory they will not be merged, and the learner will continue exploring.
The impact of this is likely to be non-termination, since without the ability to determine if differing memory is in fact relevant, the algorithm will never merge states with counters, and will explore infinitely. The openssl experiment is an example of this: repeatedly sending applicationEmpty messages and detecting a difference in state memory due to a counter.
- Required
- Clone this respository with
git clone --recurse-submodules https://github.com/ChrisMcMStone/greyboxstatelearning
- Maven (e.g.
sudo apt install maven
) - Other dependencies:
sudo apt install openjdk-14-jdk-headless maven python2 nodejs graphviz
- Clone this respository with
- Optional (configurable in learner properties file)
- Taint analysis support - install manually with instructions here or via a Docker container
sudo docker build -tag=“taint:Dockerfile” greyboxstatelearning/statemem-taint/Dockerfile
. Launch shell in this container withsudo docker run -it taint:Dockerfile /bin/bash
. TODO - update this once dockerisation fully implemented - IDA-Pro - required by Triton for extracting binary basic blocks.
greyboxstatelearning/statemem-taint/res/idapro.tar.gz.enc
can be decrypted using instructions here. This should then be installed within your WINE set-up. - Broswer based dynamic model rendering - depends on
node
which can be installed withcurl -sL https://deb.nodesource.com/setup_13.x | sudo -E bash -
andsudo apt-get install -y nodejs
. Then unzip the node libraries in the directorycd greyboxstatelearning/js_mode_renderer/myapp
withunzip libs.zip
.
- Taint analysis support - install manually with instructions here or via a Docker container
From greyboxstatelearning root directory:
- Ptrace -
cd ptrace-statemem && make statemem/statemem && cd ..
- Learner -
cd statelearner && mvn package shade:shade && cd ..
From greyboxstatelearning root directory:
- Extract and compile:
cd openssl && unzip openssl-b2d951e.zip && cd openssl-b2d951e && ./config && make && cd ../..
- Launch shell with ASLR disabled: e.g.
setarch `uname -m` -R /bin/bash
- Start openssl server:
openssl/openssl-b2d951e/apps/openssl s_server -key openssl/server_key/diagserverCA.key -cert openssl/server_key/diagserverCA.pem -CAfile openssl/server_key/diagserverCA.pem -accept 4500 -www
Note: these instructions can be followed in-combination with this for testing the WiFi handshake of hostapd.
- Modify
statelearner/openssl-config.properties
to match filepaths of your system. Including,base_path
,output_dir
,bin_path
extra_binaries
,keystore_filename
,keystore_password
. If Triton & IDA-Pro dependencies are installed, theenable_taint_analysis=true
option can also be set, along with options for printing the source code at watchpoint hits withprint_watchpoint_hit_source_info=true
. You can set a custom python command usingpython_cmd
-- useful if you are using a virtual environment; an example is given here). - Build statemem:
cd ptrace-statemem && make statemem/statemem
- Start the learner
cd statelearner && ./run_learner.sh openssl-config.properties
To view the model of the learned state machine, we can use the node web-app which animates the contruction of the state machine in real time. If you have set the inc_model_drawing=true
in the properties, and have installed node as described aboe, you can open a web-browser at http://localhost:3000/index.html
.
Alternatively we can open the PDF model which is generated upon learning termination: evince /output_dir/full_model.pdf
. A more succint/readable version of the model can be generated with the sudo /statelearner/model_cleaner.py /output_dir/full_model.dot && dot -Tpdf -O /output_dir/full_modelstripped.dot && evince /output_dir/full_modelstripped.dot.pdf
You can modify the debugging output verbosity by editing the file statelearner/src/logging.properties
.
A test harness is required for each protocol and can be shared across implementations of the same protocol. There are two ways one can implement a test harness:
- Directly in the
statelearner
Java project, as done with the TLS test harness (seestatelearner/src/tls/TLSSUL.java
). Or, - Independently, with whichever langauge desired. Your test harness will communicate with the learner over a TCP connection, using the
statelearner/src/socket/SocketSUL.java
interface. The connection details are configurable in theproperties
file.- The test harness should be able to process all inputs specified by the
alphabet
properties parameter, and additionly theRESET
command to force the target to the start state. The test harness should return the stringEmpty
, if the target does not respond to an input within your specified timeout. - [Optional] By default, the learner will handle the timestamped I/O log generation (for tagging snapshots with the I/O metadata). However, it particularly time sensitive applications, it may be desireable to implement this in the test harness. We do exactly this with
hostap
. Here, we set theharness_side_logging=true
in the properties file, and then eachRESET
command at the start of a query will now take the formRESET:/file/path/to/io_log_for_sessionID
- this filepath informs the test harness where to write the I/O log too. - Examples of this type of test harness can be found with the hypothetical and hostapd implementations.
- The test harness should be able to process all inputs specified by the
The tool we use for to dump memory snapshots needs to be adapted to support the new implementation. In particular, we need to modify the C file ptrace-statemem/statemem/statemem.c
as follows:
- Implement a method of the form
static void post_call_dump_heap_APPNAME(trace *t, void *data)
. A template is provided on line 533. To automatically identify the syscall hooking parameters specified in the above method, we have provided a tool named iolearn. To use:- Navigate to
/greyboxstatelearning/ptrace-statemem
, runmake iolearn/iolearn
. Launch your target binary and then determine it's PID. Then runsudo ptrace-statemem/iolearn/iolearn -pid `pgrep appname` -session-id 1
. - Execute the happy flow of the protocol against your target (using your test harness), and let iolearn generate a syscall log for later processing. Kill iolearn, and the log
appname_1.log
will be written. - Then execute the python script
python3 greyboxstatelearning/iolearner/iolearn.py --log /path/to/appname_1.log
. Output will look something like :[{"fd": 6, "syscalls": [20, 45]}]
, wherefd
is the socket file descriptor, and thesyscalls
numbers are the read and write syscall IDs respectively.
- Navigate to
- Add a support for registering this newly implemented hooking method via a
ptrace-statemem
command line arg named-app
. See lines 805-823 for how this is implemented for openssl, hostapd etc..