IDA 7.5 can see through obfstr 0.2 obfuscation

Question

Closed this issue 4 years ago · 4 comments

When decompiling a binary obfuscated with obfstr 0.2 using IDA/Hexrays 7.5, the obfuscation is immediately undone, and the xrefs immediately give away the string location. See below for the source code, and the decompiler and asm output. This was tested with Rust 1.48.0.

obfstr 0.1 does not suffer from this: the strings are properly obfuscated, and IDA fails to find any xrefs for them. I believe the difference comes from obfstr 0.1 using a random offset into the string, which confuses IDA and leaves its xref analysis and automatic deobfuscators inoperable. Another issue is that in 0.2 the one-time pad used to deobfuscate the string is kept in .rodata (because it is passed as an array to the deobfuscate method), while obfstr 0.1 would only get passed the initial round key.

I imagine the fix would go like this: Change the deobfuscate method to take a random offset like obfstr 0.1 in order to break the xrefs, and generate the XOR round keys at runtime instead of compile time.
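A minimal sketch of the "round keys at runtime" half of that idea, assuming a toy xorshift32 keystream. Only the ciphertext and a seed are stored, and the keystream is regenerated while decoding. The names `next_round`, `xor_stream`, `SEED` and `OBFUSCATED` are made up for illustration and are not obfstr's API. An optimizing compiler may still fold the whole computation back into the plaintext unless the loads are made opaque to it (for example with volatile reads).

```rust
// Hypothetical sketch, not obfstr's implementation.

/// One step of a xorshift32 generator, used here as a toy keystream.
const fn next_round(mut x: u32) -> u32 {
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    x
}

/// XOR `input` with a keystream derived from `seed`.
/// XOR is its own inverse, so the same function obfuscates (at compile
/// time) and deobfuscates (at run time).
const fn xor_stream<const N: usize>(input: &[u8; N], seed: u32) -> [u8; N] {
    let mut out = [0u8; N];
    let mut key = seed;
    let mut i = 0;
    while i < N {
        key = next_round(key);
        out[i] = input[i] ^ (key as u8); // low byte of the round key
        i += 1;
    }
    out
}

/// Assumed seed; a real macro would pick a fresh one per string.
const SEED: u32 = 0x9E37_79B9;

/// Computed at compile time. No keystream table is stored next to it;
/// the round keys are recomputed from SEED when decoding.
static OBFUSCATED: [u8; 12] = xor_stream(b"bbbbbbbbbbbb", SEED);
```

Calling `xor_stream(&OBFUSCATED, SEED)` at run time yields the original bytes again, which can then be passed to `std::str::from_utf8`.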

Source code:

use obfstr::obfstr;
fn main() {
    println!("aaaaaaaaaaaa");
    println!("{}", obfstr!("bbbbbbbbbbbb"));
    println!("{}", obfstr!("cccccccccccc"));
}

0.2 Hexrays decompiler view

0.1 Hexrays Decompiler View

0.2 Assembly view

0.2 Data X-Refs

0.1 Data X-Refs

Answer 1 · 2021-01-07T13:36:53.000Z

Thanks for the analysis!

My original goal was to obstruct automated analysis, e.g. someone running 'strings' on the binary to find strings they don't already know about. Here it seems that the HexRays decompiler can see through the obfuscation once you've already found the function you're looking for. The intent is to make it harder to use automated tools to detect the presence of specific blacklisted strings in the binary.

You can always manually analyze the specific place where the string obfuscation is employed and decode the original string. This is why it's just obfuscation, not actual encryption.

That said, having HexRays trivially see through the obfuscation is perhaps a bit too easy, and the obfuscation should be improved to require a little more effort on the reverse engineering side. Perhaps putting the obfuscated part in the r/w section would make IDA more careful about assuming the data is constant. Another idea is to add extra fixed XOR data in the r/w section.

About xref breaking: for this to be fully effective, the pointer should be offset in one function and fixed up in another. If both operations happen inside the same function, any analysis should trivially be able to see through it. However, making a separate function may aid the detection of the obfuscation and the creation of tools to automate the decryption. For v0.2 I specifically removed the xref breaking, relying more on the fact that the compiler can mix the deobfuscation code with the surrounding code, making analysis harder.
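A rough sketch of that split, assuming a fixed illustrative offset: the caller only ever forms a skewed pointer, and a separate non-inlined function undoes the skew. `DATA`, `OFFSET`, `unskew` and `load` are hypothetical names, not obfstr's API, and `#[inline(never)]` is only a hint, so whether the emitted xref really lands outside the string depends on the compiler and optimization level.

```rust
// Hypothetical sketch, not obfstr's implementation.

/// Stands in for the obfuscated bytes of a string.
static DATA: [u8; 12] = *b"cccccccccccc";

/// Assumed skew; a real macro would pick a random value per use site.
const OFFSET: usize = 0x4321;

/// Undo the skew in a separate function, so the function that
/// references DATA never holds the real address itself.
#[inline(never)]
fn unskew(p: *const u8, offset: usize) -> *const u8 {
    p.wrapping_sub(offset)
}

fn load() -> &'static [u8] {
    // The only address formed here is DATA + OFFSET. If the compiler
    // folds this into one constant, the xref points away from DATA.
    let skewed = DATA.as_ptr().wrapping_add(OFFSET);
    let p = unskew(skewed, OFFSET);
    // SAFETY: adding and then subtracting the same offset yields the
    // start of DATA again, which is valid for DATA.len() bytes.
    unsafe { std::slice::from_raw_parts(p, DATA.len()) }
}
```

The trade-off mentioned above shows up directly: `unskew` is a small, recognizable function, so a tool could hook or pattern-match it to recover every real address.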

In the end, any kind of manual analysis of the obfuscation will be able to reveal the obfuscated strings. As stated at the start, the goal should be to make it harder to create tools which automatically discover all obfuscated strings in a binary. I don't expect this to be impossible, but I'll try to make it harder.

Answer 2 · 2021-01-30T22:00:00.000Z

Hey, I changed the obfuscation to put part of the obfuscated string in the r/w section (through static mut). The latest 0.2.2 on crates.io has this change; let me know how well it works.
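To illustrate the general idea (this is a sketch under assumptions, not the actual 0.2.2 code): one half of the XOR material is kept in a `static mut`, which typically ends up in a writable data section, so a decompiler cannot safely treat it as constant. `KEY_RW`, `CIPHER` and `decode` are hypothetical names, and the key bytes are arbitrary.

```rust
use std::ptr;

// Hypothetical sketch, not obfstr's implementation.

/// Writable half of the XOR material. It is never actually written,
/// but being `static mut` the tools must assume it could be.
static mut KEY_RW: [u8; 4] = [0x5A, 0xC3, 0x17, 0x88];

/// Read-only half: the plaintext "obfs" XORed with KEY_RW.
static CIPHER: [u8; 4] = [b'o' ^ 0x5A, b'b' ^ 0xC3, b'f' ^ 0x17, b's' ^ 0x88];

fn decode() -> [u8; 4] {
    let mut out = [0u8; 4];
    for i in 0..4 {
        // SAFETY: KEY_RW is never mutated, the read goes through a raw
        // pointer (no reference to the static mut), and i is in bounds.
        let k = unsafe {
            ptr::read_volatile(ptr::addr_of!(KEY_RW).cast::<u8>().add(i))
        };
        out[i] = CIPHER[i] ^ k;
    }
    out
}
```

The volatile read keeps the Rust optimizer from folding the key bytes into the code. The section placement itself is decided by the compiler and linker.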

Answer 3 · 2021-02-01T23:08:43.000Z

Much better, at least IDA doesn't directly show the strings anymore:

It's still a lot easier to reverse manually than 0.1.x, mostly thanks to the xrefs, but automated tools based on IDA shouldn't be able to recover the strings anymore.

Answer 4 · 2021-02-07T03:42:57.000Z

I've introduced a generic xref obfuscation and applied it to the string obfuscation.

Example:

Is this what you had in mind?

Additionally, a generic API (obfstr::xref) is exposed to apply this obfuscation to other content in your application.