ILF is an Imitation Learning based Fuzzer for smart contracts. The fuzzing policy, which is used to generate transactions, is represented by an ensemble of neural networks and is learned from thousands of high-quality sequences of transactions generated using symbolic execution. ILF can be used to fuzz any Ethereum smart contract and outputs the coverage and a vulnerability report.
ILF is developed at SRI Lab, Department of Computer Science, ETH Zurich as part of the Machine Learning for Programming and Blockchain Security projects. For more details, please refer to the ILF CCS'19 paper and slides.
We provide a Dockerfile, which we recommend as a starting point. To build and run:
$ docker build -t ilf .
$ docker run -it ilf
The following steps set up ILF locally (tested on Ubuntu 18.04).
Install golang, for example:
$ wget https://dl.google.com/go/go1.10.4.linux-amd64.tar.gz
$ tar -xvf go1.10.4.linux-amd64.tar.gz
$ sudo mv go /usr/lib/go-1.10
$ echo 'export GOPATH=$HOME/go' >> ~/.bashrc
$ echo 'export GOROOT=/usr/lib/go-1.10' >> ~/.bashrc
$ echo 'export PATH=$PATH:$GOPATH/bin' >> ~/.bashrc
$ echo 'export PATH=$PATH:$GOROOT/bin' >> ~/.bashrc
$ source ~/.bashrc
Install z3:
$ git clone https://github.com/Z3Prover/z3.git
$ cd z3
$ git checkout z3-4.8.6
$ python3 scripts/mk_make.py --python
$ cd build
$ make -j7
$ sudo make install
Clone this repo:
$ mkdir -p $GOPATH/src
$ cd $GOPATH/src
$ git clone https://github.com/eth-sri/ilf.git
Clone go-ethereum and apply our patch:
$ mkdir -p $GOPATH/src/github.com/ethereum
$ cd $GOPATH/src/github.com/ethereum
$ git clone https://github.com/ethereum/go-ethereum.git
$ cd go-ethereum
$ git checkout 86be91b3e2dff5df28ee53c59df1ecfe9f97e007
$ git apply $GOPATH/src/ilf/script/patch.geth
Install python dependencies:
$ cd $GOPATH/src/ilf
$ pip3 install -r requirements.txt
Install execution backend:
$ go build -o execution.so -buildmode=c-shared export/execution.go
The following steps are necessary only when you want to use ILF to fuzz new contracts other than our example. Install nodejs, Truffle, web3.js and Ganache-CLI:
$ curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
$ sudo apt-get install nodejs
$ mkdir ~/.npm-global
$ npm config set prefix '~/.npm-global'
$ echo 'export PATH=~/.npm-global/bin:$PATH' >> ~/.bashrc
$ source ~/.bashrc
$ npm install -g truffle web3 ganache-cli
Install solc 0.4.25:
$ wget https://github.com/ethereum/solidity/releases/download/v0.4.25/solc-static-linux
$ chmod +x solc-static-linux
$ sudo mv solc-static-linux /usr/bin/solc
To fuzz the example provided in the repo with ILF (the imitation fuzzing policy) using our pre-trained model in the model directory:
$ python3 -m ilf --proj ./example/crowdsale/ --contract Crowdsale --fuzzer imitation --model ./model/ --limit 2000
The --fuzzer argument can be replaced by:
- random: a uniformly random fuzzing policy.
- symbolic: a symbolic execution fuzzing policy based on depth first search of block states. This is used for generating training sequences.
- sym_plus: an augmentation of symbolic which can revisit encountered block states.
- mix: a fuzzing policy that randomly chooses imitation or symbolic for generating each transaction.
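The per-transaction choice made by mix can be illustrated with a minimal Python sketch. This is not ILF's implementation: the function name mix_policy, the callables imitation_step and symbolic_step, and the 0.5 selection probability are all assumptions introduced for illustration only.

```python
import random


def mix_policy(imitation_step, symbolic_step, p_imitation=0.5, rng=random):
    """Build a policy that picks one of two sub-policies per transaction.

    imitation_step and symbolic_step are hypothetical callables that map a
    fuzzing state to the next transaction. p_imitation is an assumed
    probability; the value ILF actually uses is not stated here.
    """
    def step(state):
        # Draw once per transaction and delegate to the chosen sub-policy.
        if rng.random() < p_imitation:
            return imitation_step(state)
        return symbolic_step(state)

    return step


if __name__ == "__main__":
    # Stand-in sub-policies that only label which one produced the transaction.
    policy = mix_policy(lambda s: ("imitation", s), lambda s: ("symbolic", s))
    print([policy(i)[0] for i in range(5)])
```

Because the draw happens inside step, a single fuzzing run can interleave transactions from both sub-policies rather than committing to one of them for the whole sequence.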
For fuzzing new contracts, one needs to provide a Truffle project (formatted as the example in example/crowdsale). Then the script script/extract.py should be called to extract deployment transactions of the contracts. For the example contract, the script runs as follows:
$ rm example/crowdsale/transactions.json
$ python3 script/extract.py --proj example/crowdsale/ --port 8545
Note that you need to kill existing ganache-cli processes listening on the same port before calling this script.
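One way to check beforehand whether something is still listening on the port is a small standard-library probe. The sketch below is not part of ILF; the helper name port_in_use and the default host 127.0.0.1 are assumptions, and the check only reports whether a TCP connection succeeds, not which process owns the port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    port = 8545  # the port passed to script/extract.py in the example above
    if port_in_use(port):
        print(f"Port {port} is busy; stop the process using it first.")
    else:
        print(f"Port {port} is free.")
```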
For training, one needs to run symbolic on a set of training contracts to produce a dataset in a training directory. Usually tens of thousands of contracts are used for training. For demonstration purposes, we show how to produce a small training dataset from our example contract to the train_data directory:
$ mkdir train_data
$ python3 -m ilf --proj ./example/crowdsale/ --contract Crowdsale --limit 2000 --fuzzer symbolic --dataset_dump_path ./train_data/crowdsale.data
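When more than one contract is used, the same command has to be repeated per project. The following sketch only builds the argument list shown above for each project; the helper name symbolic_dump_command, the project list, and the convention of naming each dump file after its project directory are assumptions, not something ILF prescribes.

```python
from pathlib import Path


def symbolic_dump_command(proj_dir, contract, train_dir="./train_data", limit=2000):
    """Build the symbolic-fuzzer command that dumps one project's dataset.

    The dump file is named after the project directory, which is an
    assumption made for this sketch.
    """
    dump_path = Path(train_dir) / f"{Path(proj_dir).name}.data"
    return [
        "python3", "-m", "ilf",
        "--proj", str(proj_dir),
        "--contract", contract,
        "--limit", str(limit),
        "--fuzzer", "symbolic",
        "--dataset_dump_path", str(dump_path),
    ]


if __name__ == "__main__":
    # Hypothetical list of (Truffle project directory, contract name) pairs.
    projects = [("./example/crowdsale/", "Crowdsale")]
    for proj_dir, contract in projects:
        # Print the command; pass the list to subprocess.run to execute it.
        print(" ".join(symbolic_dump_command(proj_dir, contract)))
```

Returning a list rather than a single string keeps the command safe to hand to subprocess.run without shell quoting.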
Run the scripts to select seed integer values and amount values from the training dataset, and put them into ilf/fuzzers/imitation/int_values.py and ilf/fuzzers/imitation/amounts.py, respectively:
$ python3 script/get_int_values.py --train_dir ./train_data
$ python3 script/get_amounts.py --train_dir ./train_data
Then the following command performs neural network training and outputs the trained networks in the new_model directory:
$ mkdir new_model
$ python3 -m ilf --fuzzer imitation --train_dir ./train_data --model ./new_model
@inproceedings{He:2019:LFS:3319535.3363230,
author = {He, Jingxuan and Balunovi\'{c}, Mislav and Ambroladze, Nodar and Tsankov, Petar and Vechev, Martin},
title = {Learning to Fuzz from Symbolic Execution with Application to Smart Contracts},
booktitle = {Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security},
series = {CCS '19},
year = {2019},
isbn = {978-1-4503-6747-9},
location = {London, United Kingdom},
pages = {531--548},
numpages = {18},
url = {http://doi.acm.org/10.1145/3319535.3363230},
doi = {10.1145/3319535.3363230},
acmid = {3363230},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {fuzzing, imitation learning, smart contracts, symbolic execution},
}
- Jingxuan He
- Mislav Balunović
- Nodar Ambroladze
- Petar Tsankov
- Martin Vechev
- Anton Permenev
- Copyright (c) 2019 Secure, Reliable, and Intelligent Systems Lab (SRI), ETH Zurich
- Licensed under the Apache 2.0 License