sdasgup3/validating-binary-decompilation

Thesis

sdasgup3 opened this issue · 5 comments

Thesis

Really cool project Sandeep!

For reference, have you come across this paper? I think they are trying to solve a similar problem as you, so may give some inspiration and insights.

Edit2: I just noticed you were already aware of this paper (per prelim-presentation) :)

Edit: You may also be interested in this paper (which happens to be written by my supervisor Roberto Guanciale, @guancio):

@mewmew Thanks for sharing such pretty good reads. I was well aware of the first paper. The second one is pretty enlightening.

@mewmew Thanks for sharing such pretty good reads. I was well aware of the first paper. The second one is pretty enlightening.

More than happy to.

From the README:

Immediate Goal

Extending the methodology to other lifters like fcd/rev.ng

How far along is this work? I'd love to stress test the generality of the approach taken by single_instruction_translation_validation as I am currently implementing a binary lifter from x86 to LLVM IR.

@mewmew Awesome!!
I have not started it yet, but aiming for it. I have a decent idea of where exactly the tool needs to change to accommodate other lifters (fcd or your-lifter). Happy to discuss more (sdasgup3@illinois.edu).

The single instruction validation uses the individual liftings generated by lifter and validates X86 Vs LLVM IR by comparing the corresponding symbolic summaries. Having said that, it would be better to have a look at the LLVM IR that you generate, to begin with. That is mainly because the LLVM semantics that we are currently borrowing has some limitations on FP/vector/intrinsic support. For the general-purpose instruction, your lifter may not generate them anyways (except llvm-popcnt to compute parity flag, which the LLVM semantics do have the support for).

@mewmew Awesome!!
I have not started it yet, but aiming for it. I have a decent idea of where exactly the tool needs to change to accommodate other lifters (fcd or your-lifter). Happy to discuss more (sdasgup3@illinois.edu).

The single instruction validation uses the individual liftings generated by lifter and validates X86 Vs LLVM IR by comparing the corresponding symbolic summaries. Having said that, it would be better to have a look at the LLVM IR that you generate, to begin with. That is mainly because the LLVM semantics that we are currently borrowing has some limitations on FP/vector/intrinsic support. For the general-purpose instruction, your lifter may not generate them anyways (except llvm-popcnt to compute parity flag, which the LLVM semantics do have the support for).

Wonderful! I'd be happy to discuss this in more detail. I'll throw you an e-mail and we can continue discussions there.

Cheers,
Robin