This is an open project to construct a benchmark of Solidity smart contracts for evaluating and comparing verification tools.
The repo contains a set of use cases: each use case is associated with a set of Solidity implementations (possibly containing bugs) and a set of properties against which verification tools are assessed. We are also interested in properties that go beyond the capabilities of current tools, in the hope that they inspire more precise verification techniques.
We are currently updating the contracts to the standard described in the "extending the benchmark" section. Up-to-date contracts are marked with ✅.
The benchmark currently comprises several versions (correct or bugged) of the following use cases:
- Call Wrapper ✅
- Bet (tokenless) ✅
- Deposit (ETH) ✅
- Deposit (ERC20) ✅
- Bank (tokenless)
- Bank (ETH) ✅
- Escrow ✅
- Vault ✅
- Crowdfund ✅
- Hash Timed Locked Contract ✅
- Vesting Wallet ✅
- Lottery
- Constant-product AMM
- Payment Splitter
- Social Recovery Wallet
Currently, the benchmark supports the following verification tools:
- SolCMC
- Certora
For each use case, we summarize the performance of a verification tool as a matrix, where columns represent different contract properties and rows represent different implementations of the use case. Each entry of the matrix classifies the output of the tool as follows:
Symbol | Meaning |
---|---|
TP | True Positive (property holds, verification succeeds) |
TN | True Negative (property does not hold, verification fails) |
FP | False Positive (property does not hold, verification succeeds) |
FN | False Negative (property holds, verification fails) |
UNK | Timeout / Memory exhaustion |
ND | Property not definable with the tool |
Additionally, we mark the classifications TP, TN, FP, FN with ! when the verification tool guarantees the correctness of its output. Following our methodological notes, we map the outputs of the verification tools according to the following table:
Suffix | SolCMC output | Certora output |
---|---|---|
P | | Satisfy green |
P! | Property is valid | Assert green |
N | Property might be false | Assert red |
N! | Property is false | Satisfy red |
To compute the scores for each verification tool, navigate to the `contracts/` directory and execute the following command in your terminal:

    $ make

or

    $ make scores
This command generates a `.csv` file where each tool is represented by a row, displaying the count of the different outcomes and the total score of the tool. Please ensure that you have previously run the experiments in the corresponding use case directories, as this process relies on the experiment results.
In the `contracts/` directory, run the following command to initialize a new use case:

    $ make init name=<usecase-name>

This command creates a new directory and provides the template to start your work.
Each use case directory must include the following files:
- `skeleton.json`
- `ground-truth.csv`
- `versions/` directory
- `Makefile`
- A directory for every verification tool used

Find a minimal example in the `contracts/template/` directory.
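For illustration, the directory of the hypothetical Simple Transfer use case referenced below could be laid out as follows (the tool directories are those of the currently supported tools):

```
simple_transfer/
├── skeleton.json
├── ground-truth.csv
├── Makefile
├── versions/
├── solcmc/
└── certora/
```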
The `skeleton.json` file stores the use case's name, credits, specification, and the properties defined in natural language:

    {
        "name": "Simple Transfer",
        "credits": "[Author](https://www.author.com/simple_transfer)", // optional
        "specification": "The contract has an initial balance...",
        "properties": {
            "sent_a": "the overall sent amount does not exceed the initial deposit.",
            ...
        }
    }
You can store the specification in a separate file; use the following syntax to indicate its path:

    "specification": "file:<relative_path>",

Here, the file path is relative to the path of the use case (e.g. `simple_transfer/spec.md` would be `file:spec.md`).
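For example, a `skeleton.json` that keeps its specification in a separate `spec.md` (names taken from the example above) could look like:

```json
{
    "name": "Simple Transfer",
    "specification": "file:spec.md",
    "properties": {
        "sent_a": "the overall sent amount does not exceed the initial deposit."
    }
}
```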
The `ground-truth.csv` file defines the ground truth for the corresponding use case. Lines of the CSV have the form:

    property,version,sat,footnote

where `sat` is 1 when the property holds on the given version, and 0 when it does not. The ground truth is established manually and, in some cases, confirmed by the verification tools. Optionally, a footnote can be appended; it will be displayed in the README of the use case.
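As an illustration (the version names and the footnote are hypothetical), a `ground-truth.csv` could contain rows such as:

```csv
sent_a,v1,1,
sent_a,v2,0,the bound on the sent amount is not enforced
```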
The `versions/` directory contains various Solidity variants of the use case contract, with version definitions written in natural language using the NatSpec format and the `@custom:version` tag:

    /// @custom:version <version definition>.
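For instance, a hypothetical version file in `versions/` could start with a header like the following (the contract body is omitted):

```solidity
// SPDX-License-Identifier: GPL-3.0-only
pragma solidity >=0.8.2;

/// @custom:version conformant to specification.
contract SimpleTransfer {
    // contract code ...
}
```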
The Makefile defines the following commands (a short usage example follows the list):
- `make plain`: generates the README without experiment results. It uses `skeleton.json`, `ground-truth.csv` and the version files from `versions/`.
- `make solcmc`: runs the SolCMC experiments. By default, the timeout is set to 10 minutes; use `make solcmc to=<int>` to set a different timeout (in seconds) for each query.
- `make certora`: runs the Certora experiments; results are written to the README.
- `make all`: runs the experiments with all verification tools and generates the complete README.
- `make clean`: removes the build directories from the verification tool directories.
- `make clean-solcmc`: removes the SolCMC build directories.
- `make clean-certora`: removes the Certora build directories.
- `make cleanr`: removes the README.
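For example, from a use case directory one could run:

```
$ make solcmc to=300   # SolCMC experiments with a 5-minute timeout per query
$ make certora         # Certora experiments
$ make all             # all tools, complete README
```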
SolCMC directories contain:
- `Makefile`: to set up and run the SolCMC experiments.
- Property files.
Four types of instrumentation are available for SolCMC verification: ghost state, function preghosts, function postghosts, and invariants. To apply these, use the following tags within your property files:
- `/// @custom:ghost`: defines ghost contract variables.
- `/// @custom:preghost function <function name>`: defines ghost code to be executed before the body of a specific method.
- `/// @custom:postghost function <function name>`: defines ghost code to be executed after the body of a specific method.
- `/// @custom:invariant`: declares conditions that must remain valid throughout the entire contract execution, expressed through functions.
Example of a SolCMC property file:

    /// @custom:ghost
    uint _x = 0;
    uint _y;

    /// @custom:preghost constructor
    require(_x > 0);

    /// @custom:postghost constructor
    assert(_x < 10);

    /// @custom:postghost function f1
    assert(_y == 1);

    /// @custom:invariant
    function invariant(uint x, uint y) public {
        require(x != y);
        f1(x, y);
        f2();
        assert(x != y);
    }
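Conceptually, the ghost state and the pre/postghost code are woven into the contract under verification before SolCMC is run. For the `f1` postghost above, the instrumented contract would roughly take the following shape (a sketch; the contract name and the body of `f1` are illustrative):

```solidity
contract SimpleTransfer {
    // ghost state injected from the @custom:ghost block
    uint _x = 0;
    uint _y;

    function f1(uint x, uint y) public {
        // ... original body of f1 ...

        // postghost code injected from @custom:postghost function f1
        assert(_y == 1);
    }
}
```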
Certora directories contain:
- `Makefile`: to set up and run the Certora experiments.
- `getters.sol`: a collection of getters for contract state variables, useful for writing Certora specifications (see the sketch after this list).
- `methods.spec`: method declarations to be used in Certora specifications.
- Specification files.
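As a sketch, a getter for a hypothetical `initial_deposit` state variable would look like the function below (shown inside a contract only so that the snippet is self-contained; the actual getters depend on the use case contract):

```solidity
contract SimpleTransfer {
    uint initial_deposit;

    // getter exposing the state variable to the Certora specification
    function getInitialDeposit() public view returns (uint) {
        return initial_deposit;
    }
}
```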
An example of a specification file:

    rule P1 {
        env e;
        mathint y;
        mathint x = getX();
        require x != y;
        f1(x, y);
        f2();
        assert x != y;
    }
Property files must follow these naming conventions:
- For general properties, the file should be named `p<property_number>.sol`.
- If the property is associated with a specific contract version, use the format `p<property_number>_v<version_number>.sol`.
The tool manages the matching of properties and versions, prioritizing version-specific properties: if a version-specific definition of a property exists, the tool uses it; otherwise, it defaults to the generic one.
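For example, a `solcmc/` directory with the property files below would use `p1_v2.sol` to check property 1 against version 2, and `p1.sol` to check property 1 against every other version:

```
solcmc/
├── p1.sol
├── p1_v2.sol
└── p2.sol
```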
Incorporate the following tags into your property files to enable specific features of the benchmark:
- `/// @custom:nondef <note>`: indicates the reason why a property is deemed non-definable. When this tag is set, the associated experiment for the property is not executed; additionally, the specified note appears in the automatically generated README as a footnote (an example follows this list).
- `/// @custom:negate`: inverts the result provided by the tool.
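For instance, a property that cannot be expressed with a given tool could be marked with a hypothetical note such as:

```solidity
/// @custom:nondef the property quantifies over all reachable states, which cannot be expressed in this specification language.
```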