ida-subdumper
[Deprecated] Please refer to [Get pseudo code/project from binary with DWARF debug info] for a better solution.
Dump subroutine pseudocode from the LST file produced by IDA Pro.
- Open IDA Pro and load the binary file, choose import file names/line numbers (DWARF info found -> Import file names/line numbers), and analyze the binary.
- Wait for the analysis to finish, and generate the LST file (File -> Produce file -> Create LST file...).
- Run `go run . {target}` with the target binary file (the LST file is expected at `{target}.lst`), and `{target}.sources.json` will be generated.
- Edit the script `subdumper.py` (replace `{target}` with the target binary file, and define your custom filter function), then load it into IDA Pro (File -> Script file...). The pseudocode, grouped by source file, will be dumped into the `{target}.sources/` directory.
Get pseudo code/project from binary with DWARF debug info
Requirements
- A binary with DWARF debug information.
- A tool or library for parsing DWARF debug information. (e.g. go-dwarf)
- A tool or library for disassembling binary and generating pseudocode. (e.g. IDA Pro, Ghidra, retdec)
Step by step
- Parse DWARF debug info and get line entries with the following information: address, file, line, column, discriminator. (e.g. line_test.go)
- Group line entries by file and sort them by address.
- Load binary into the disassembler (no need to analyze or produce LST file).
- Parse addresses and locate function boundaries.
- Generate pseudocode for each function and save it into files in line order.
- Don't forget to produce the C header file for the structures.
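The grouping and boundary-lookup steps above can be sketched in plain Python, independent of any particular DWARF parser or disassembler. This is a minimal sketch under stated assumptions: the `LineEntry` type, the function names, and the `(start, end)` boundary list are illustrative and not part of this project. In practice the entries would come from the DWARF line table and the boundaries from the disassembler.

```python
from bisect import bisect_right
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LineEntry:
    # Hypothetical record for one row of the DWARF line table.
    address: int
    file: str
    line: int
    column: int = 0
    discriminator: int = 0


def group_by_file(entries):
    """Group line entries by source file, each group sorted by address."""
    groups = defaultdict(list)
    for e in entries:
        groups[e.file].append(e)
    return {f: sorted(es, key=lambda e: e.address) for f, es in groups.items()}


def find_function(boundaries, address):
    """Return the (start, end) range containing address, or None.

    boundaries must be sorted by start and non-overlapping; end is exclusive.
    """
    starts = [s for s, _ in boundaries]
    i = bisect_right(starts, address) - 1
    if i >= 0 and boundaries[i][0] <= address < boundaries[i][1]:
        return boundaries[i]
    return None


def functions_in_line_order(entries, boundaries):
    """Map each file to its function start addresses, ordered by first source line."""
    first_line = {}  # (file, function start) -> smallest line number seen
    for e in entries:
        func = find_function(boundaries, e.address)
        if func is None:
            continue  # address lies outside every known function
        key = (e.file, func[0])
        first_line[key] = min(first_line.get(key, e.line), e.line)
    result = defaultdict(list)
    ordered = sorted(first_line.items(), key=lambda kv: (kv[0][0], kv[1], kv[0][1]))
    for (file, start), _line in ordered:
        result[file].append(start)
    return dict(result)
```

A function that contains inlined code may show up under more than one file with this approach. A real implementation would probably want to pick one, for example the file of the entry at the function's start address.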
Cheatsheet for IDAPython
Get a function
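import ida_funcs
import idautils

# Look up the function that contains address ea (get_func returns None if there is none).
f = ida_funcs.get_func(ea)
print("%x %x" % (f.start_ea, f.end_ea))
print(ida_funcs.get_func_name(ea))  # not necessarily the start ea
# Print the start address of every function in the database.
for ea in idautils.Functions():
    print("%x" % ea)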
Decompile and get pseudocode
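import idaapi

def decompile(addr):
    """Return the pseudocode of the function at addr as plain text, or None on failure."""
    try:
        cfunc = idaapi.decompile(addr)
    except idaapi.DecompilationFailure:
        cfunc = None
    if cfunc is None:
        return None
    lines = cfunc.get_pseudocode()
    retlines = []
    for lnnum in range(len(lines)):
        # Strip IDA's color tags from each pseudocode line.
        retlines.append(idaapi.tag_remove(lines[lnnum].line))
    return '\n'.join(retlines)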
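One way to wire a helper like `decompile` into the final step (saving pseudocode per source file, in line order) is sketched below. The name `dump_sources`, the shape of the `funcs_by_file` mapping, and the output layout are assumptions made for illustration. The decompiler is passed in as a callable so the sketch can also be exercised outside IDA.

```python
import os


def dump_sources(out_dir, funcs_by_file, decompile):
    """Write one pseudocode file per source file and return the paths written.

    funcs_by_file maps a source path to function addresses, already in line order.
    decompile is a callable that takes an address and returns text or None.
    """
    written = []
    for src, addrs in funcs_by_file.items():
        # Skip functions the decompiler could not handle.
        chunks = [text for text in map(decompile, addrs) if text is not None]
        if not chunks:
            continue
        # Mirror the source path under out_dir; drop empty, "." and ".." parts
        # so absolute or relative DWARF paths cannot escape out_dir.
        parts = [p for p in src.replace("\\", "/").split("/") if p not in ("", ".", "..")]
        if not parts:
            continue
        path = os.path.join(out_dir, *parts)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("\n\n".join(chunks) + "\n")
        written.append(path)
    return written
```

Inside IDA this could be called as `dump_sources(out_dir, funcs_by_file, decompile)`, where `funcs_by_file` is built from the DWARF line entries and the function boundaries.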