RustBugDetector

Statically detects bugs in Rust programs. This tool is experimental. I hope to migrate it to MIR in the future to get more accurate results.

Description

Currently, we implement an interprocedural double-lock detector that works on the LLVM bitcode (bc) generated by the Rust compiler. The tool has been tested on Ubuntu 16.04 only; other Linux distributions may also work. More detectors will be added in the future.

Implementing a use-after-free checker on LLVM bc is non-trivial, and I have given up on it. Miri provides a way to detect memory bugs by interpreting MIR, but I found it hard to get familiar with the Rust compiler interface. I hope the compiler will eventually offer a stable API and a more user-friendly hello-world tutorial on MIR parsing.

Install

1. Install Rust

https://www.rust-lang.org/tools/install

2. Install LLVM 9.0

http://releases.llvm.org/download.html#9.0.0

Select the pre-built binaries that match your OS version. After extracting them to your target directory, say $HOME/Env/llvm, add the following lines to your shell startup file (on my system, $HOME/.bashrc).

LLVM_INSTALL_DIR=$HOME/Env/llvm
export PATH=${LLVM_INSTALL_DIR}/bin:$PATH
export LLVM_DIR=${LLVM_INSTALL_DIR}/lib/cmake
export CMAKE_PREFIX_PATH=${LLVM_INSTALL_DIR}/lib/cmake
export LD_LIBRARY_PATH=${LLVM_INSTALL_DIR}/lib:$LD_LIBRARY_PATH

3. Clone the repository and run

cd RustBugDetector/cmake-build-debug
cmake ..
make

If everything is set up correctly, the build produces a shared library: lib/NewDoubleLockDetector/libNewDoubleLockDetector.so

4. Test

We will test our tool on an older version of parity-ethereum.

git clone git@github.com:parity-ethereum/parity-ethereum.git
cd parity-ethereum
git checkout 93fbbb9aaf161f21471050a2a3257f820c029a73

Now we are on a buggy revision of parity-ethereum. Next, we generate bc for detection. Find every Cargo.toml and append the following section to it. If a [profile.dev] section already exists, replace it with the following.

[profile.dev]
opt-level = 0
debug = true
lto = false
debug-assertions = true
panic = 'unwind'
incremental = false
overflow-checks = true

Then, run the following command in each directory where Cargo.toml resides.

cargo rustc -- --emit=llvm-bc

If cargo complains, use cargo rustc --lib -- --emit=llvm-bc or cargo rustc --bin XXX -- --emit=llvm-bc instead.

The bc files are now in target/debug/deps. Do not use the bc files in the incremental directory!

Then execute the following commands. Change the file names and paths accordingly.

opt -mem2reg ethcore-XXX.bc > ethcore-XXX.m2r.bc
opt -load libNewDoubleLockDetector.so -detect ethcore-XXX.m2r.bc > /dev/null 2> double_lock_result.txt

The results are in double_lock_result.txt. Each location is printed as the project dir, the file path, and the line number, separated by spaces. The long name is the mangled name of the function that contains the second lock.

...
Double Lock Happens! First Lock:
 /XXXX/parity-ethereum ethcore/src/client/client.rs 386
_ZN7ethcore6client6client6Client17build_last_hashes17h70161e43ec016505E
 /XXX/parity-ethereum ethcore/src/client/client.rs 411
Second Lock(s):
 /XXX/parity-ethereum ethcore/src/client/client.rs 931
...

We can read from the results that the first lock is acquired on L386, and that fn build_last_hashes() is then called on L411. The second lock is acquired on L931, inside this function.
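
For post-processing the report, the location lines can be split into their three fields. The following is a minimal sketch, assuming that neither the project dir nor the file path contains spaces; the Location struct and the parse_location function are hypothetical names for illustration and are not part of the tool.

```rust
/// One source location from the detector output (illustrative type, not part of the tool).
#[derive(Debug, PartialEq)]
struct Location {
    project_dir: String,
    file_path: String,
    line: u32,
}

/// Parse a line such as " /XXX/parity-ethereum ethcore/src/client/client.rs 931".
/// Returns None for any line that does not consist of exactly three
/// whitespace-separated fields ending in a number (headers, mangled names, "...").
fn parse_location(line: &str) -> Option<Location> {
    let mut fields = line.split_whitespace();
    let project_dir = fields.next()?.to_string();
    let file_path = fields.next()?.to_string();
    let line_no = fields.next()?.parse().ok()?;
    if fields.next().is_some() {
        return None; // more than three fields: not a location line
    }
    Some(Location { project_dir, file_path, line: line_no })
}

fn main() {
    let loc = parse_location(" /XXX/parity-ethereum ethcore/src/client/client.rs 931");
    println!("{:?}", loc);
}
```

Header lines such as "Second Lock(s):" and the mangled function name yield None, so a caller can use them to group the locations that follow.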

Note that parking_lot::RwLock is not the RwLock in std Rust. A deadlock may happen if a write() in thread A interleaves between two read() calls in thread B, so we need to avoid calling read() twice in one thread. This pattern is therefore covered by the double-lock check. The details and the fixes are in openethereum/parity-ethereum#11172, openethereum/parity-ethereum#11175, and openethereum/parity-ethereum#11176.
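
The shape of the pattern can be sketched in a few lines of Rust. This is an illustration only: it uses std::sync::RwLock as a stand-in so that the block is self-contained, the Client struct and its methods are hypothetical, and the fixed variant shows just one possible way to avoid the nested read(), not necessarily the fix applied in the linked pull requests.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for a struct that guards shared state with an RwLock.
struct Client {
    hashes: RwLock<Vec<u64>>,
}

impl Client {
    /// Acquires its own read lock, like the callee in the report above.
    fn last_hash(&self) -> Option<u64> {
        self.hashes.read().unwrap().last().copied()
    }

    /// Buggy shape: last_hash() takes a second read lock while the first
    /// read guard is still alive. Deliberately never called in this sketch.
    #[allow(dead_code)]
    fn summary_buggy(&self) -> (usize, Option<u64>) {
        let guard = self.hashes.read().unwrap(); // first read lock
        let last = self.last_hash(); // second read lock, first guard still held
        (guard.len(), last)
    }

    /// One possible fix: release the first guard before calling the callee.
    fn summary_fixed(&self) -> (usize, Option<u64>) {
        let len = {
            let guard = self.hashes.read().unwrap();
            guard.len()
        }; // guard is dropped here, so no lock is held during the call
        (len, self.last_hash())
    }
}

fn main() {
    let client = Client { hashes: RwLock::new(vec![1, 2, 3]) };
    println!("{:?}", client.summary_fixed());
}
```

In the fixed variant the two reads are no longer taken under one lock, so a writer may run between them; whether that is acceptable depends on the calling code.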

5. Workflow

To find a double-lock, we first need to identify which locks operate on the same mutex/rwlock, and then find the corresponding drop instruction of each lock, where unlock is called. Here is our current implementation:

Traverse the bc for lock functions on the same lock type. This is possible because the lock function in Rust carries the type information of the protected data. We match by type because it covers more cases and is easy to implement. It may cause false positives (FP), such as client.rs 2230 and client.rs 46, where two different locks share the same type.
Find the drop instruction of each lock by tracking the return value of the lock function.
Start from one lock function and traverse the successor basic blocks. If a lock of the same type is met, report a double-lock; if the drop of the same lock is met, stop propagation on this basic block; otherwise, recursively traverse the successor basic blocks. This is implemented with a worklist. When a function call other than lock/drop is met, we recursively check whether there is any lock of the same type in the callees.
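
The traversal can be sketched on a toy control-flow graph. This is not the detector's code (the detector is an LLVM pass); the Inst, Block and Program types and the function names are hypothetical, locks are matched purely by the type name of the protected data, and the callee check simply scans every block of the callee. The sketch also stops following a path after reporting on it, which is an assumption of this example.

```rust
use std::collections::{HashMap, HashSet};

/// Toy instruction set standing in for LLVM instructions (hypothetical model).
enum Inst {
    Lock { id: u32, ty: &'static str }, // lock call; `ty` is the protected data type
    Drop { id: u32 },                   // drop of the guard returned by lock `id`
    Call(&'static str),                 // call to any other function
}

struct Block {
    insts: Vec<Inst>,
    succs: Vec<usize>, // indices of successor blocks in the same function
}

/// Function name -> basic blocks; block 0 is the entry block.
type Program = HashMap<&'static str, Vec<Block>>;

/// True if `func`, or any function it calls, contains a lock on type `ty`.
fn callee_locks(prog: &Program, func: &'static str, ty: &str, seen: &mut HashSet<&'static str>) -> bool {
    if !seen.insert(func) {
        return false; // already inspected; also guards against recursion
    }
    let Some(blocks) = prog.get(func) else { return false };
    blocks.iter().flat_map(|b| b.insts.iter()).any(|inst| match inst {
        Inst::Lock { ty: t, .. } => *t == ty,
        Inst::Call(callee) => callee_locks(prog, *callee, ty, seen),
        Inst::Drop { .. } => false,
    })
}

/// Start right after lock `id` (instruction `idx` of block `block` in `func`)
/// and walk the successor blocks with a worklist, collecting reports.
fn check_lock(prog: &Program, func: &'static str, block: usize, idx: usize, id: u32, ty: &str) -> Vec<String> {
    let blocks = &prog[func];
    let mut reports = Vec::new();
    let mut visited = HashSet::new();
    let mut worklist = vec![(block, idx + 1)]; // (block, first instruction to inspect)
    while let Some((b, start)) = worklist.pop() {
        let mut stop = false;
        for inst in &blocks[b].insts[start..] {
            match inst {
                Inst::Lock { id: other, ty: t } if *t == ty => {
                    reports.push(format!("second lock {other} in {func}"));
                    stop = true;
                }
                Inst::Call(callee) if callee_locks(prog, *callee, ty, &mut HashSet::new()) => {
                    reports.push(format!("second lock via call to {callee}"));
                    stop = true;
                }
                Inst::Drop { id: d } if *d == id => stop = true, // guard released
                _ => {}
            }
            if stop {
                break;
            }
        }
        if !stop {
            for &s in &blocks[b].succs {
                if visited.insert(s) {
                    worklist.push((s, 0));
                }
            }
        }
    }
    reports
}

/// Small example program: one caller that holds the lock across a call, one that does not.
fn sample_program() -> Program {
    let mut prog = Program::new();
    prog.insert("caller_buggy", vec![Block {
        insts: vec![Inst::Lock { id: 1, ty: "State" }, Inst::Call("callee"), Inst::Drop { id: 1 }],
        succs: vec![],
    }]);
    prog.insert("caller_ok", vec![
        Block { insts: vec![Inst::Lock { id: 2, ty: "State" }, Inst::Drop { id: 2 }], succs: vec![1] },
        Block { insts: vec![Inst::Call("callee")], succs: vec![] },
    ]);
    prog.insert("callee", vec![Block {
        insts: vec![Inst::Lock { id: 3, ty: "State" }, Inst::Drop { id: 3 }],
        succs: vec![],
    }]);
    prog
}

fn main() {
    let prog = sample_program();
    println!("{:?}", check_lock(&prog, "caller_buggy", 0, 0, 1, "State"));
    println!("{:?}", check_lock(&prog, "caller_ok", 0, 0, 2, "State"));
}
```

Because locks are matched by type name only, two distinct locks that protect the same type would also be reported here, which mirrors the false positives mentioned above.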

Joseph220591/RustBugDetector