leak not found in llvm generated with rust
StamesJames opened this issue · 6 comments
- I have searched open and closed issues for duplicates
- I made sure that I am not using an old project version (DO: pull Phasar, update git submodules, rebuild the project and check if the bug is still there)
Bug description
I am trying to use PhASAR to check LLVM IR generated from Rust. For this I wrote some simple test programs (https://github.com/sse-labs/PhASARust), but I haven't managed to analyze them properly. I use rustc versions below 1.61.0 because they use LLVM 14.0.0 or lower. The PhASAR ifds-solvertest accepts the generated LLVM code. I tried to find a leak in the following Rust code:
#[inline(never)]
#[no_mangle]
fn source() -> i32 {1029384756}
#[inline(never)]
#[no_mangle]
fn sink(source: i32) -> i32 {source}
#[inline(never)]
#[no_mangle]
fn sanitize(source: i32) -> i32 {source}
fn main() {
    let unsanitized = source();
    let source = source();
    let sanitized = sanitize(source);
    let sink_unsanitized = sink(unsanitized);
    let sink_sanitized = sink(sanitized);
    println!("{sink_unsanitized}");
    println!("{sink_sanitized}");
}
As I understand it, this should be a leak because the variable unsanitized gets into the sink function without passing through the sanitize function first. I use the following analysis-config.json:
{
  "name": "simple sql injection",
  "version": 1,
  "functions": [
    {
      "name": "source",
      "ret": "source",
      "params": {
      }
    },
    {
      "name": "sink",
      "params": {
        "source": [1]
      }
    },
    {
      "name": "sanitize",
      "ret": "sanitizer",
      "params": {
        "source": [1]
      }
    }
  ],
  "variables": []
}
And the following compiler flags for Rust:
[build]
rustflags = [
"--emit=llvm-ir",
"-Cno-prepopulate-passes",
"-Cdebuginfo=0",
"-Copt-level=0",
]
The main function (without the print statement) in LLVM looks like this:
; sql_injection_02_simple_requests::main
; Function Attrs: uwtable
define internal void @_ZN32sql_injection_02_simple_requests4main17hb222746dbaa73089E() unnamed_addr #1 {
start:
  %_29 = alloca [1 x { i8*, i64* }], align 8
  %_22 = alloca %"core::fmt::Arguments", align 8
  %_17 = alloca [1 x { i8*, i64* }], align 8
  %_10 = alloca %"core::fmt::Arguments", align 8
  %sink_sanitized = alloca i32, align 4
  %sink_unsanitized = alloca i32, align 4
  %unsanitized = call i32 @source()
  br label %bb1
bb1: ; preds = %start
  %source = call i32 @source()
  br label %bb2
bb2: ; preds = %bb1
  %sanitized = call i32 @sanitize(i32 %source)
  br label %bb3
bb3: ; preds = %bb2
  %0 = call i32 @sink(i32 %unsanitized)
  store i32 %0, i32* %sink_unsanitized, align 4
  br label %bb4
bb4: ; preds = %bb3
  %1 = call i32 @sink(i32 %sanitized)
  store i32 %1, i32* %sink_sanitized, align 4
  br label %bb5
.
.
.
Rust generates bloated LLVM code, so I posted only the main function without the print statements.
I attach all relevant files below.
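The symbol name of main in the listing above appears to follow Rust's legacy mangling scheme: the prefix `_ZN`, then length-prefixed path segments, then a hash segment, then a closing `E`. The following is a minimal sketch for splitting such a name into its segments, which can help to identify the source-level main among the generated symbols. The helper `split_legacy_symbol` is a hypothetical name introduced here for illustration; it is not part of PhASAR or rustc, and it assumes plain ASCII segments without any escape sequences.

```rust
/// Splits a legacy-mangled Rust symbol of the shape `_ZN<len><segment>...E`
/// into its path segments. Returns `None` if the input does not have that shape.
/// Hypothetical helper for illustration; escape sequences are not decoded.
fn split_legacy_symbol(symbol: &str) -> Option<Vec<String>> {
    let mut rest = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut segments = Vec::new();
    while !rest.is_empty() {
        // Every segment is preceded by its length in decimal digits.
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        if digits == 0 {
            return None;
        }
        let len: usize = rest[..digits].parse().ok()?;
        let end = digits.checked_add(len)?;
        // `get` returns `None` if the declared length runs past the input.
        let body = rest.get(digits..end)?;
        segments.push(body.to_string());
        rest = &rest[end..];
    }
    Some(segments)
}

/// Returns true if the symbol looks like `<crate>::main` followed by a hash segment.
fn looks_like_source_main(symbol: &str) -> bool {
    match split_legacy_symbol(symbol) {
        Some(segments) => segments.len() == 3 && segments[1] == "main",
        None => false,
    }
}
```

For the symbol shown in the listing, `split_legacy_symbol` yields the crate name, `main`, and the hash segment, while an unmangled name such as `source` is rejected.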
Steps to reproduce
- create a cargo project
- place config.toml in a .cargo folder in the project
- place rust-toolchain.toml in root of project
- use attached main.rs
- compile with cargo b (should use the correct rustc version and compiler-flags)
- try to analyze the resulting LLVM code with a PhASAR taint analysis
Actual result: phasar doesn't find the leak
Expected result: phasar should find a leak
Context (Environment)
- phasar: [2a941ee]
Operating System:
- Linux
- Windows
- macOS
Build Type:
- cmake
- custom build
Example files
Files:
examplefiles.zip
Hi @StamesJames,
thanks for reporting this issue. Reproducing it on my system I found errors both in PhASAR and in your use of PhASAR:
- The indices in the analysis config are zero-based, so the first parameter has index 0. Further, the "source" attribute within "params" indicates that the parameters at the given indices should be treated as sources (other options are "sink" and "sanitizer").
A corrected config may look like this:
{
"name": "simple sql injection",
"version": 1,
"functions": [
{
"name": "source",
"ret": "source",
"params": {
}
},
{
"name": "sink",
"params": {
"sink": [0]
}
},
{
"name": "sanitize",
"ret": "sanitizer"
}
],
"variables": []
}
- Rust has some indirection layers between the real main function and the source-code main (_ZN4main4main17h06bd8598508590c8E). Specifically, PhASAR's call-graph resolver currently cannot infer that _ZN3std2rt19lang_start_internal17h2ba92edce36c035eE indeed calls the main function.
As a workaround, you can use the --entry-points CLI flag to tell phasar-cli to start at _ZN4main4main17h06bd8598508590c8E instead.
- PhASAR currently analyzes functions annotated in the taint config. Therefore, the sanitizer does not work and two leaks instead of just one are reported (if it were just an external function declaration, it would work, though). This should be fixed by https://github.com/secure-software-engineering/phasar/tree/f-FixTaintAnalysis.
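To illustrate the first point, here is a minimal sketch in Rust of how zero-based sink indices interact with a single-parameter function. It is a simplified model and not PhASAR's implementation: the types `FunctionSpec` and `Call` and the function `find_leaks` are hypothetical names introduced here, and the model assumes the intended behavior in which the sanitizer's return value is clean.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified mirror of one "functions" entry in the taint config.
struct FunctionSpec {
    name: &'static str,
    returns_source: bool,
    returns_sanitizer: bool,
    sink_params: Vec<usize>, // zero-based parameter indices
}

/// One call in a straight-line trace: callee, argument variables, result variable.
struct Call {
    callee: &'static str,
    args: Vec<&'static str>,
    result: &'static str,
}

/// Walks the trace once and reports every tainted value that reaches a sink parameter.
fn find_leaks(specs: &[FunctionSpec], trace: &[Call]) -> Vec<String> {
    let mut tainted: HashSet<&'static str> = HashSet::new();
    let mut leaks = Vec::new();
    for call in trace {
        let spec = match specs.iter().find(|s| s.name == call.callee) {
            Some(spec) => spec,
            None => continue, // functions without an annotation are ignored in this model
        };
        for &index in &spec.sink_params {
            // An index past the last argument never matches anything.
            if let Some(arg) = call.args.get(index) {
                if tainted.contains(arg) {
                    leaks.push(format!("{}({})", call.callee, arg));
                }
            }
        }
        if spec.returns_source {
            tainted.insert(call.result);
        } else if spec.returns_sanitizer {
            tainted.remove(call.result);
        }
    }
    leaks
}

/// The three annotated functions from the config, with a configurable sink index.
fn example_specs(sink_index: usize) -> Vec<FunctionSpec> {
    vec![
        FunctionSpec { name: "source", returns_source: true, returns_sanitizer: false, sink_params: vec![] },
        FunctionSpec { name: "sink", returns_source: false, returns_sanitizer: false, sink_params: vec![sink_index] },
        FunctionSpec { name: "sanitize", returns_source: false, returns_sanitizer: true, sink_params: vec![] },
    ]
}

/// The call sequence of main from the LLVM listing in the report.
fn example_trace() -> Vec<Call> {
    vec![
        Call { callee: "source", args: vec![], result: "unsanitized" },
        Call { callee: "source", args: vec![], result: "source" },
        Call { callee: "sanitize", args: vec!["source"], result: "sanitized" },
        Call { callee: "sink", args: vec!["unsanitized"], result: "%0" },
        Call { callee: "sink", args: vec!["sanitized"], result: "%1" },
    ]
}
```

In this model, `find_leaks(&example_specs(0), &example_trace())` reports the single call `sink(unsanitized)`, whereas index 1 reports nothing because `sink` has no second parameter.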
Does this help with your issue?
Hi @fabianbs96,
thanks a lot. It now finds the leak, and I understand much better how to use PhASAR. I will try more complex examples next.
I tried to change the function definitions to:
declare external i32 @source()
declare external i32 @sink(i32 %source)
declare external i32 @sanitize(i32 %source)
and it correctly detected just the one leak.
Do I understand your third point right, that once the f-FixTaintAnalysis branch is merged it should also work with the function definitions?
> Do I understand your third point right, that once the f-FixTaintAnalysis branch is merged it should also work with the function definitions?
Yes, exactly
Hi @StamesJames, can we assume this issue to be resolved?
Hi @MMory, yes, sorry. I'm not sure about the procedure. Should I close the issues I opened when I think the question is answered? Sorry, this is the first issue I have contributed to someone's GitHub repo.
Hi @StamesJames, I'm gonna close this issue now. Feel free to close your own issues when you consider them resolved :)