the difference between Microcode Explorer output and optblock_t::func callback dump
TakahiroHaruyama opened this issue · 4 comments
I'm implementing control flow unflattening at a later maturity level, related to #7.
I like to debug the code using the Microcode Explorer graph, but sometimes (especially at MMAT_GLBOPT1) the output generated by Microcode Explorer differs from the optblock_t::func callback dump at the same maturity level (e.g. dumpBefore-MMAT_GLBOPT1-0.txt), so I can't rely on the graph when debugging.
Do you know the reason?
No, I don't know what you're talking about.
Sorry, I've attached an example graph of a function at MMAT_GLBOPT1 generated by Microcode Explorer.
As we can see, the control dispatcher block ID is 14.
On the other hand, according to the information dumped by the optblock_t::func callback at the same maturity level, the dispatcher ID is 9:
9. 0 ; 2WAY-BLOCK 9 INBOUNDS: 1 6 7 2 8 12 13 4 5 OUTBOUNDS: 10 14 [START=73F4211A END=73F42122] MINREFS: STK=24/ARG=128, MAXBSP: 0
9. 0 ; USE: edx.4
9. 0 ; VALRANGES: edx.4:(==251E6FCF|==6A786FA9|==A39DE200|==B8230B61|==D5FFDD16|==E0408B29|==E41FBF89|==F5AE3BEE)
9. 0 jle edx.4, #0xF5AE3BED.4, @14 ; 73F42120 u=edx.4
9. 0
So I'd like to know why there is a difference between them.
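Since block numbers evidently differ between the two views, one way to line them up is to compare edges instead of IDs. The following is only a rough sketch of a parser for the block header line in the dump above. `BlockHeader`, `ints_after`, and `parse_block_header` are hypothetical helper names, not part of the Hex-Rays SDK, and the line layout is assumed from this single sample.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the fields of one block header line.
struct BlockHeader {
    int id = -1;                 // block number, -1 if not found
    std::vector<int> inbounds;   // predecessor block numbers
    std::vector<int> outbounds;  // successor block numbers
};

// Collects the decimal integers that follow `key`, stopping at the
// first token that is not an integer (e.g. "OUTBOUNDS:" or "[START=").
static std::vector<int> ints_after(const std::string& line,
                                   const std::string& key) {
    std::vector<int> out;
    std::size_t pos = line.find(key);
    if (pos == std::string::npos) return out;
    std::istringstream in(line.substr(pos + key.size()));
    int v;
    while (in >> v) out.push_back(v);
    return out;
}

// Parses a line such as
//   "9. 0 ; 2WAY-BLOCK 9 INBOUNDS: 1 6 ... OUTBOUNDS: 10 14 [START=..."
// The format is assumed from the dump in this thread only.
BlockHeader parse_block_header(const std::string& line) {
    BlockHeader h;
    std::vector<int> id = ints_after(line, "BLOCK ");
    if (!id.empty()) h.id = id.front();
    h.inbounds = ints_after(line, "INBOUNDS:");
    h.outbounds = ints_after(line, "OUTBOUNDS:");
    return h;
}
```

For the header line above, this yields block 9 with nine predecessors and two successors (10 and 14), which can then be matched against the edges drawn in the graph.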
As it happens, I can sort of answer this question, but only because I've been reverse engineering Hex-Rays. This question isn't really related to an issue with the code I released for this project. It's pretty much just a generic question about Hex-Rays internals. You'll probably get better answers from Hex-Rays support.
Basically, the microcode explorer works by calling gen_microcode to produce an mbl_array_t at the specified maturity level, and then it stops immediately once that maturity level is reached. However, in ordinary operation of the decompiler, once the mbl_array_t has reached MMAT_GLBOPT1, Hex-Rays continues to optimize and transform the mbl_array_t before it reaches the next maturity level, MMAT_GLBOPT2.
In particular, in Hex-Rays 7.1, after reaching MMAT_GLBOPT1, the decompiler resolves stack variable addresses, refines the sizes of the input arguments, triggers block combination, performs common subexpression elimination, preallocates local variables, and does some other stuff that I haven't reverse engineered yet. Only after all of this is done does the decompiler update the maturity level of the mbl_array_t to MMAT_GLBOPT2. Your optblock_t handler is getting called somewhere after the mbl_array_t has reached MMAT_GLBOPT1, but before it has reached MMAT_GLBOPT2.
So, the reason the microcode explorer shows different results than something you dumped in an optblock_t handler is that, by the time your optblock_t handler is called, the mbl_array_t is not in the same state as it was when it originally reached MMAT_GLBOPT1; further transformations have taken place since reaching MMAT_GLBOPT1.
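To make the timing concrete, here is a toy model of the situation described above. It does not use the Hex-Rays SDK: `ToyMba`, `Maturity`, `snapshot_at_glbopt1`, and `run_to_glbopt2` are all made-up names, the pass list is copied from the description above, and the order of the passes and the exact point at which the handler fires are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins only; these are NOT Hex-Rays SDK types.
enum class Maturity { GlbOpt1, GlbOpt2 };

struct ToyMba {
    Maturity maturity = Maturity::GlbOpt1;
    std::vector<std::string> applied;  // passes run since reaching GlbOpt1
};

// Passes named in the explanation above; their order here is an assumption.
inline const std::vector<std::string>& passes_after_glbopt1() {
    static const std::vector<std::string> passes = {
        "resolve stack variable addresses",
        "refine input argument sizes",
        "combine blocks",
        "common subexpression elimination",
        "preallocate local variables",
    };
    return passes;
}

// Models the microcode explorer: stop as soon as GlbOpt1 is reached,
// before any of the later passes have run.
ToyMba snapshot_at_glbopt1() { return ToyMba{}; }

// Models ordinary decompilation: the later passes run while the maturity
// still reads GlbOpt1, and the handler fires after `callback_after` of them.
template <class Callback>
ToyMba run_to_glbopt2(std::size_t callback_after, Callback on_block_opt) {
    ToyMba mba;
    const std::vector<std::string>& passes = passes_after_glbopt1();
    for (std::size_t i = 0; i < passes.size(); ++i) {
        mba.applied.push_back(passes[i]);
        if (i + 1 == callback_after) on_block_opt(mba);
    }
    mba.maturity = Maturity::GlbOpt2;  // updated only after all passes
    return mba;
}
```

If the handler fires after, say, three passes, it observes a `ToyMba` whose maturity still reads `GlbOpt1` but whose `applied` list is non-empty, while `snapshot_at_glbopt1()` returns one with no passes applied. Both report the same maturity, yet the state differs.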
Thank you so much!
I understand now that I should ask Hex-Rays support about internal issues like this.