
[Deprecated, read README.md for a better solution] Dump subroutines from LST file produced by IDA Pro.

Primary language: Go. License: Apache License 2.0 (Apache-2.0).

ida-subdumper

[Deprecated] Please refer to [Get pseudo code/project from binary with DWARF debug info] for a better solution.

Dump subroutine pseudocode from an LST file produced by IDA Pro.

  1. Open IDA Pro and load the binary. When IDA reports that DWARF info was found, choose to import file names and line numbers (DWARF info found -> Import file names/line numbers), then let it analyze the binary.
  2. Wait for the analysis to finish, then generate the LST file (File -> Produce file -> Create LST file...).
  3. Run go run . {target}, where {target} is the path to the binary. The LST file is expected at {target}.lst, and {target}.sources.json will be generated.
  4. Edit the script subdumper.py: replace {target} with the path to the binary and define your custom filter function. Then load the script into IDA Pro (File -> Script file...). The pseudocode, grouped by source file, will be dumped into the {target}.sources/ directory.

Get pseudo code/project from binary with DWARF debug info

Requirements

  • A binary with DWARF debug information.
  • A tool or library for parsing DWARF debug information. (e.g. go-dwarf)
  • A tool or library for disassembling binary and generating pseudocode. (e.g. IDA Pro, Ghidra, retdec)

Step by step

  1. Parse DWARF debug info and get line entries with the following information: address, file, line, column, discriminator. (e.g. line_test.go)
  2. Group line entries by file and sort them by address.
  3. Load the binary into the disassembler (no need to run a full analysis or produce an LST file).
  4. For each address from the line entries, locate the boundaries of the function that contains it.
  5. Generate pseudocode for each function and save it into files, ordered by source line.
  6. Don't forget to produce the C header file for structures.
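
Steps 2, 4 and 5 can be sketched in plain Python, outside any disassembler. This is only a minimal sketch under stated assumptions: LineEntry and Func are hypothetical record types (not part of any library), the function boundaries are assumed to come from the disassembler as (start, end, name) with an exclusive end, and functions are ordered by the smallest source line that maps into them.

```python
from bisect import bisect_right
from collections import defaultdict, namedtuple

# Hypothetical record types; the field names are illustrative only.
LineEntry = namedtuple("LineEntry", "address file line")
Func = namedtuple("Func", "start end name")  # end is exclusive


def group_by_file(entries):
    """Step 2: group line entries by source file, each group sorted by address."""
    groups = defaultdict(list)
    for e in entries:
        groups[e.file].append(e)
    return {f: sorted(es, key=lambda e: e.address) for f, es in groups.items()}


def functions_in_line_order(file_entries, funcs):
    """Steps 4-5: map entries to functions, then order functions by first line."""
    funcs = sorted(funcs, key=lambda f: f.start)
    starts = [f.start for f in funcs]
    first_line = {}
    for e in file_entries:
        if e.line <= 0:
            continue  # DWARF uses line 0 for code with no source line
        i = bisect_right(starts, e.address) - 1
        if i < 0 or e.address >= funcs[i].end:
            continue  # address is not inside any known function
        f = funcs[i]
        first_line[f] = min(first_line.get(f, e.line), e.line)
    return [f.name for f in sorted(first_line, key=first_line.get)]
```

Ordering by the first line keeps the dumped pseudocode roughly in the order of the original source file. Inlined code can attribute lines from other files to a function, so the same function may show up under more than one file.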

Cheatsheet for IDAPython

Get a function

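import ida_funcs
import idautils

# ea may be any address inside the function, not necessarily its start.
f = ida_funcs.get_func(ea)
if f is not None:  # get_func() returns None when ea is not inside a function
    print("%x %x" % (f.start_ea, f.end_ea))  # end_ea is exclusive
    print(ida_funcs.get_func_name(ea))

# Iterate over the start addresses of all functions.
for ea in idautils.Functions():
    print("%x" % ea)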

Decompile and get pseudocode

import idaapi

def decompile(addr):
    """Return the pseudocode of the function containing addr, or None on failure."""
    try:
        cfunc = idaapi.decompile(addr)
    except idaapi.DecompilationFailure:
        cfunc = None
    if cfunc is None:
        return None
    # Each pseudocode line carries color tags; tag_remove() strips them.
    lines = cfunc.get_pseudocode()
    return '\n'.join(idaapi.tag_remove(sline.line) for sline in lines)
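
To save the result grouped by source file, something like the following could work. It is a sketch, not the project's actual dumper: dump_sources and its plan argument are hypothetical, where plan maps a source path to function start addresses already in line order. The decompiler is passed in as a callable, so the helper can be tried outside IDA with a stub.

```python
import os


def dump_sources(plan, decompile, out_dir):
    """Write one pseudocode file per source file and return the written paths.

    plan: hypothetical dict mapping a source path to function start
          addresses, already in the desired order.
    decompile: callable returning pseudocode text, or None on failure.
    """
    written = []
    for src, addrs in plan.items():
        # Drop drive letters and leading separators so the file lands under out_dir.
        rel = os.path.splitdrive(src)[1].lstrip("/\\")
        path = os.path.join(out_dir, rel)
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        chunks = []
        for ea in addrs:
            text = decompile(ea)
            if text is None:
                # Leave a marker so missing functions are visible in the output.
                text = "// decompilation failed at 0x%x" % ea
            chunks.append(text)
        with open(path, "w") as fh:
            fh.write("\n\n".join(chunks) + "\n")
        written.append(path)
    return written
```

Inside IDA, the decompile function from this cheatsheet can be passed as the callable. Source paths containing ".." are not sanitized here and would need extra handling.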