angr/pyvex

Possibly incorrect lifting of ldrsw to VEX on arm64

vadimkotov opened this issue · 1 comment

Hi folks,

I think I've come across an inaccuracy when lifting the following piece of code from arm64 to VEX (unless I'm using it wrong):

00114d4c       ldrsw      x15, DAT_00114d70 

The ldrsw instruction is supposed to load a word (a 32-bit value) from the given address and sign-extend it to 64 bits before writing it to x15 (see the Arm user guide).
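To make the expected semantics concrete, here is a minimal pure-Python sketch comparing a plain 64-bit load with a 32-bit load followed by sign extension. The memory contents and the helper names (`load_le`, `sign_extend`) are made up for illustration; they are not taken from the binary or from any library.

```python
def load_le(memory: bytes, offset: int, size: int) -> int:
    """Read `size` bytes at `offset` as an unsigned little-endian integer."""
    return int.from_bytes(memory[offset:offset + size], "little")


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend an unsigned `from_bits`-wide value to `to_bits` bits."""
    sign_bit = 1 << (from_bits - 1)
    extended = (value ^ sign_bit) - sign_bit  # signed interpretation
    return extended & ((1 << to_bits) - 1)    # back to an unsigned bit pattern


# Hypothetical bytes at the literal address: a negative 32-bit word (-16)
# followed by four unrelated bytes.
memory = bytes.fromhex("f0ffffff78563412")

as_64bit_load = load_le(memory, 0, 8)                # what a 64-bit load yields
as_ldrsw = sign_extend(load_le(memory, 0, 4), 32, 64)  # what ldrsw should yield

print(hex(as_64bit_load))  # 0x12345678fffffff0
print(hex(as_ldrsw))       # 0xfffffffffffffff0
```

The two results differ in the upper 32 bits whenever the neighbouring bytes are not the sign extension of the word, which is why a computed branch target built from this value ends up wrong.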

However, this is what it is getting lifted to:

IRSB {
   t0:Ity_I64 t1:Ity_I64

   00 | ------ IMark(0x14d4c, 4, 0) ------
   01 | t0 = LDle:I64(0x0000000000014d70)
   02 | PUT(x15) = t0
   NEXT: PUT(pc) = 0x0000000000014d50; Ijk_Boring
}

When executed by SimulationManager, a full 64-bit value gets loaded into x15 instead of a sign-extended 32-bit one.

The binary I'm reverse engineering right now uses this instruction to dynamically calculate the branch address (as an obfuscation technique), so the wrong value in x15 breaks all analyses.

For comparison, here is Ghidra's pcode which gets it right:

$U5490:4 = LOAD ram(0x114d70:8)
x15 = INT_SEXT $U5490:4

Here's the Python code used to reproduce the VEX output:

import pyvex
import archinfo
import capstone

code = b'\x2f\x01\x00\x98'
addr = 0x14d4c

irsb = pyvex.lift(code, addr, archinfo.ArchAArch64())
irsb.pp()

md = capstone.Cs(capstone.CS_ARCH_ARM64, capstone.CS_MODE_ARM)
for (address, size, mnemonic, op_str) in md.disasm_lite(code, addr):
    print("0x%x:\t%s\t%s" %(address, mnemonic, op_str))
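As a sanity check that the four bytes above really are an ldrsw (literal) targeting 0x14d70, the encoding can also be decoded by hand. This is a small sketch based on the A64 "load register (literal)" layout (top byte 0x98, a signed 19-bit word offset in bits 23:5, Rt in bits 4:0); `decode_ldrsw_literal` is a hypothetical helper, not part of pyvex or capstone.

```python
def decode_ldrsw_literal(code: bytes, addr: int) -> tuple[int, int]:
    """Return (Rt, literal address) for an A64 LDRSW (literal) instruction."""
    insn = int.from_bytes(code, "little")
    if insn & 0xFF000000 != 0x98000000:
        raise ValueError("not an LDRSW (literal) encoding")
    rt = insn & 0x1F                  # destination register number
    imm19 = (insn >> 5) & 0x7FFFF     # word offset relative to this instruction
    if imm19 & (1 << 18):             # the 19-bit offset is signed
        imm19 -= 1 << 19
    return rt, addr + imm19 * 4


rt, target = decode_ldrsw_literal(b'\x2f\x01\x00\x98', 0x14d4c)
print(f"ldrsw x{rt}, {target:#x}")  # ldrsw x15, 0x14d70
```

The decoded register and address match both the disassembly and the address in the lifted IRSB, so only the width and extension of the load are in question.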

Cheers,
Vadim

PS. I'm not sure if I'm at liberty to share the full binary, but aside from that I'll be happy to provide any additional information on the matter.

Hey! Sorry it took so long to get back to you. I've fixed libvex to lift the instruction correctly:

In [3]: code = b'\x2f\x01\x00\x98'

In [4]: addr = 0x14d4c

In [5]: irsb = pyvex.lift(code, addr, archinfo.ArchAArch64())

In [6]: irsb.pp()
IRSB {
   t0:Ity_I64 t1:Ity_I32 t2:Ity_I64

   00 | ------ IMark(0x14d4c, 4, 0) ------
   01 | t1 = LDle:I32(0x0000000000014d70)
   02 | t0 = 32Sto64(t1)
   03 | PUT(x15) = t0
   NEXT: PUT(pc) = 0x0000000000014d50; Ijk_Boring
}