Disassembling fails in xasm format
Vaipex opened this issue · 6 comments
Hi,
I'm currently trying to extract the bytecode, edit a few strings and assemble it back to a .pyc file.
Running pydisasm without any flags works just fine, but as soon as I try to disassemble the file with pydisasm -F xasm ./file.pyc
it fails with the following traceback:
Traceback (most recent call last):
  File "/usr/local/bin/pydisasm", line 33, in <module>
    sys.exit(load_entry_point('xdis', 'console_scripts', 'pydisasm')())
  File "/usr/lib/python3/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/root/python-xdis/xdis/bin/pydisasm.py", line 72, in main
    disassemble_file(path, sys.stdout, format)
  File "/root/python-xdis/xdis/disasm.py", line 329, in disassemble_file
    disco(
  File "/root/python-xdis/xdis/disasm.py", line 160, in disco
    disco_loop_asm_format(opc, version_tuple, co, real_out, {}, set([]))
  File "/root/python-xdis/xdis/disasm.py", line 220, in disco_loop_asm_format
    disco_loop_asm_format(
  File "/root/python-xdis/xdis/disasm.py", line 220, in disco_loop_asm_format
    disco_loop_asm_format(
  File "/root/python-xdis/xdis/disasm.py", line 249, in disco_loop_asm_format
    assert mapped_name not in fn_name_map
AssertionError
I also printed out the vars from the assert:
mapped_name='listcomp_0x7f3d301932f0'
fn_name_map={'listcomp_0x7f3d30192ff0': 'listcomp', 'listcomp_0x7f3d301932f0': 'listcomp'}
In order for me to work on this, I'd need a complete short example with the pyc you started out with, the disassembly of that, the change to the assembly, and finally the resulting pyc. The shortest example that shows the problem is desirable.
I never got to the point of successfully disassembling the .pyc, so it's not a newly assembled file, but here is one of the failing files.
Ah - I see what's up. If I or someone else doesn't answer this in a week or so, remind me.
alright, thank you!
Here is my understanding of the situation.
Some background first.
For each list comprehension that appears in Python code, a code object is created for the "body" of the comprehension. For example, if you write:
[x + 1 for x in collection]
Parts of the disassembly will look like:
# Source code size mod 2**32: 26 bytes
# Method Name: <module>
...
# Stack size: 2
# Flags: 0x00000040 (NOFREE)
# First Line: 1
# Constants:
# 0: <code object <listcomp> at 0x7fe0b8beb9f0, file "lc.py", line 1>
# 1: '<listcomp>'
# 2: None
# Names:
# 0: __file__
1: 0 LOAD_CONST (<code object <listcomp> at 0x7fe0b8beb9f0, file "lc.py", line 1>)
...
# Method Name: <listcomp>
...
1: 0 BUILD_LIST 0
2 LOAD_FAST (.0)
The function or method named <listcomp> is created for the part of the source code x + 1.
If there is another list comprehension, another code object with the same method name, <listcomp>, is created.
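To see the duplicate names directly, here is a minimal sketch using only the standard library. It compiles a small module and lists the names of the nested code objects. Two assumptions to flag: the helper name nested_code_names is made up for illustration, and the sketch uses generator expressions rather than list comprehensions, because from Python 3.12 on list comprehensions are inlined and no longer get a code object of their own. On older interpreters the same thing happens with <listcomp>.

```python
import types


def nested_code_names(code):
    """Yield co_name for every code object nested inside `code` (hypothetical helper)."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield const.co_name
            # Recurse, since comprehensions can be nested inside functions.
            yield from nested_code_names(const)


# Two separate generator expressions in one module.
source = (
    "a = list(x + 1 for x in range(3))\n"
    "b = list(x * 2 for x in range(3))\n"
)
module_code = compile(source, "lc.py", "exec")
names = list(nested_code_names(module_code))
print(names)
```

Both nested code objects report the same co_name, so the name alone cannot tell them apart.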
The way the disassembler disambiguates the different <listcomp> methods is to append the hex address, e.g. 0x7fe0b8beb9f0, to the end of the name.
Apparently there are two listcomp methods with the same name, including the hex address.
I don't understand how that is possible, but apparently it is.
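To make the naming scheme concrete, here is a rough sketch of "name plus hex address" labelling that tolerates a repeated label instead of asserting. This is not xdis's code: the function disambiguated_name, its parameters, and the numeric-suffix fallback are all assumptions made for illustration. The addresses are the ones from the fn_name_map printed earlier in this thread.

```python
def disambiguated_name(name, address, seen):
    """Return a unique label for a code object (hypothetical helper, not xdis API).

    The label is the name with angle brackets stripped, followed by the hex
    address. If that label was already handed out, a numeric suffix is added
    rather than raising an error.
    """
    plain = name.strip("<>")
    base = f"{plain}_0x{address:x}"
    label = base
    suffix = 1
    while label in seen:
        label = f"{base}_{suffix}"
        suffix += 1
    seen[label] = plain
    return label


seen = {}
first = disambiguated_name("<listcomp>", 0x7F3D30192FF0, seen)
second = disambiguated_name("<listcomp>", 0x7F3D301932F0, seen)
# Same name and same address again: the case that trips the assert.
third = disambiguated_name("<listcomp>", 0x7F3D301932F0, seen)
print(first, second, third)
```

Whether a suffix like this is acceptable depends on how the assembler later resolves those labels, which this sketch does not address.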
I believe a simple workaround is to run the disassembler with a Python interpreter whose version matches that of the bytecode in the .pyc file.
When that is done, the "native" structure of the code object is used instead of xdis' own structure for a code object, and I think no name mapping is needed in that case.
I could be wrong here though.
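Before trying the workaround, it may help to check whether the interpreter at hand matches the .pyc. A small sketch, assuming a CPython interpreter and using only the standard library: it compares the first four bytes of the file (the magic number) with importlib.util.MAGIC_NUMBER. The function name pyc_matches_interpreter is made up for this example.

```python
import importlib.util


def pyc_matches_interpreter(pyc_path):
    """Return True if the .pyc magic number equals the running interpreter's.

    Hypothetical helper: it only compares the 4-byte magic number and does
    not validate the rest of the file.
    """
    with open(pyc_path, "rb") as f:
        magic = f.read(4)
    return magic == importlib.util.MAGIC_NUMBER
```

If this returns False for ./file.pyc, running pydisasm under a different Python version that matches the file would be the thing to try.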