MachO fails to parse 10.8 64-bit Python3 modules
ronaldoussoren opened this issue · 4 comments
Original report by Andrew Barnert (Bitbucket: abarnert, GitHub: abarnert).
The simplest repeatable test case I've got:
On a 10.8.2 machine, use Homebrew to install python3 version 3.2.3, then pip-3.2 install appscript, and build the latest macholib. Then:
#!python
>>> import macholib.MachO
>>> import aem
>>> macholib.MachO.MachO(aem.ae.__file__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/macholib/MachO.py", line 69, in __init__
    self.load(fp)
  File "/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/macholib/MachO.py", line 84, in load
    self.load_header(fh, 0, size)
  File "/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/macholib/MachO.py", line 114, in load_header
    hdr = MachOHeader(self, fh, offset, size, magic, hdr, endian)
  File "/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/macholib/MachO.py", line 154, in __init__
    self.load(fh)
  File "/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/macholib/MachO.py", line 233, in load
    cmd_data = fh.read(data_size)
  File "/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/macholib/util.py", line 100, in read
    raise ValueError("Invalid size %s while reading from %s", size, self._fileobj)
ValueError: ('Invalid size %s while reading from %s', -8, <_io.BufferedReader name='/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/aeosa/aem/ae.so'>)
The same problem happens in Python 2, as long as you're inspecting a Python 3 .so:
#!python
>>> import macholib.MachO
>>> macholib.MachO.MachO('/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/aeosa/aem/ae.so')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/macholib/MachO.py", line 69, in __init__
    self.load(fp)
  File "/Library/Python/2.7/site-packages/macholib/MachO.py", line 84, in load
    self.load_header(fh, 0, size)
  File "/Library/Python/2.7/site-packages/macholib/MachO.py", line 114, in load_header
    hdr = MachOHeader(self, fh, offset, size, magic, hdr, endian)
  File "/Library/Python/2.7/site-packages/macholib/MachO.py", line 154, in __init__
    self.load(fh)
  File "/Library/Python/2.7/site-packages/macholib/MachO.py", line 233, in load
    cmd_data = fh.read(data_size)
  File "/Library/Python/2.7/site-packages/macholib/util.py", line 100, in read
    raise ValueError("Invalid size %s while reading from %s", size, self._fileobj)
ValueError: ('Invalid size %s while reading from %s', -8L, <closed file '/usr/local/Cellar/python3/3.2.3/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/aeosa/aem/ae.so', mode 'rb' at 0x108c438a0>)
Doing the same thing, in either Python 2 or 3, against the Python 2 aem.ae.so seems to work fine.
I've seen the same thing with one of my three custom modules, and at least one other third-party module, but others work fine.
It doesn't seem to matter whether macholib is built or run on 10.7 vs. 10.8, but it may matter whether the .so it's inspecting was built on 10.7 vs. 10.8 (I haven't tested enough to be sure).
Original comment by Andrew Barnert (Bitbucket: abarnert, GitHub: abarnert).
The problem seems to be that there's an LC_DATA_IN_CODE command whose cmdsize is only 16 bytes, which isn't enough room for a load_command (8 bytes) plus a dylib_command (16 bytes). So we end up trying to read 16 - 8 - 16 = -8 bytes, which is what fails.
And even if we skip that, we've apparently already read 8 bytes too far, so we're out of sync.
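The arithmetic behind the -8 can be checked with a short sketch. The struct layouts below are a hand transcription of the 8-byte load_command header and the 16-byte dylib_command body mentioned above (little-endian, four-byte fields assumed); they are not code taken from macholib:

```python
import struct

# Assumed layouts: load_command is (cmd, cmdsize); the dylib_command body
# is (name offset, timestamp, current_version, compatibility_version).
LOAD_COMMAND = struct.Struct("<II")
DYLIB_COMMAND = struct.Struct("<IIII")

cmdsize = 16  # total size recorded in the LC_DATA_IN_CODE command

# Bytes left for trailing data once the header and body have been consumed.
data_size = cmdsize - LOAD_COMMAND.size - DYLIB_COMMAND.size
print(LOAD_COMMAND.size, DYLIB_COMMAND.size, data_size)
```

A negative remainder like this also means the parser has already consumed bytes belonging to the next command, which is why simply skipping the read is not enough.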
I can hack around this by, at MachO.py:228, adding this:
#!python
elif (cmd_load.cmd == LC_DATA_IN_CODE and
      cmd_load.cmdsize == sizeof(klass)):
    cmd_data = []
    fh.seek(-8, 1)
I'm sure this isn't the right solution; I assume we actually need to avoid trying to read a load command. I don't really understand what's supposed to be in LC_DATA_IN_CODE; the only thing I can find anywhere is "table of non-instructions in __text". Every example I can find has just type 0x29, size 16, and then right on to the next command.
According to http://prod.lists.apple.com/archives/darwin-kernel/2012/Sep/msg00025.html this is actually related to building with Xcode 4.5, not to building on Mountain Lion. (And it sounds like it should be related to building for Mountain Lion, but there's a bug, at least for his 32-bit kext.)
I can't find any documentation on the LC_DATA_IN_CODE command besides the comment in the xnu headers "table of non-instructions in __text".
As a side note, util.py:100 (in fileview.read) should be:
#!python
raise ValueError("Invalid size %s while reading from %s" % (size, self._fileobj))
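This explains the tuple-style message in the tracebacks above: passing several arguments to ValueError stores them as a tuple and never applies the format string. A minimal sketch, with a placeholder string standing in for the real file object:

```python
size = -8
fileobj = "<file object>"  # placeholder standing in for self._fileobj

# Original form: the arguments are stored as a tuple, nothing is interpolated.
unformatted = ValueError("Invalid size %s while reading from %s", size, fileobj)

# Corrected form: the % operator builds the message before the exception is created.
formatted = ValueError("Invalid size %s while reading from %s" % (size, fileobj))

print(unformatted.args)
print(str(formatted))
```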
Original comment by Andrew Barnert (Bitbucket: abarnert, GitHub: abarnert).
This is probably a better fix. It's still blind duct-taping, but at least it avoids seeking backward in the file, and (I think) would properly handle cases where an LC_DATA_IN_CODE isn't empty.
At MachO.py:183:
#!python
if cmd_load.cmd == LC_DATA_IN_CODE:
    # data is not a load command; it's a "table of
    # non-instructions in __text"; just skip everything
    data_size = cmd_load.cmdsize - sizeof(cmd_load)
    if data_size != 0:
        cmd_data = fh.read(data_size)
    cmd.append((cmd_load, None, []))
    read_bytes += cmd_load.cmdsize
    continue
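The underlying idea, using cmdsize alone to stay in sync whatever the payload holds, can be sketched outside macholib. The walker below is hypothetical: `iter_load_commands` and the sample bytes are made up for illustration, and little-endian data is assumed.

```python
import io
import struct

LOAD_COMMAND = struct.Struct("<II")  # cmd, cmdsize (little-endian assumed)


def iter_load_commands(fh, ncmds):
    """Yield (cmd, payload) pairs, advancing by exactly cmdsize each time."""
    for _ in range(ncmds):
        header = fh.read(LOAD_COMMAND.size)
        if len(header) != LOAD_COMMAND.size:
            raise ValueError("truncated load command header")
        cmd, cmdsize = LOAD_COMMAND.unpack(header)
        data_size = cmdsize - LOAD_COMMAND.size
        if data_size < 0:
            raise ValueError("cmdsize %d is smaller than the header" % cmdsize)
        payload = fh.read(data_size)
        if len(payload) != data_size:
            raise ValueError("truncated load command payload")
        yield cmd, payload


# Two fabricated commands: a 16-byte 0x29 command followed by an 8-byte one.
sample = struct.pack("<IIII", 0x29, 16, 0x1000, 0) + struct.pack("<II", 0x1, 8)
commands = list(iter_load_commands(io.BytesIO(sample), 2))
print(commands)
```

Because the payload is read as an opaque block of cmdsize minus 8 bytes, the reader never needs to seek backward, and a non-empty payload is skipped just as cleanly as an empty one.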
Original comment by Ronald Oussoren (Bitbucket: ronaldoussoren, GitHub: ronaldoussoren).
Changeset c3e730437083 should fix this issue. The command definition for LC_DATA_IN_CODE in mach_o.py was wrong; I've fixed that definition.
See also https://bitbucket.org/ronaldoussoren/py2app/issue/56/assertion-error-with-latest-dev-build
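For context, Apple's mach-o/loader.h describes LC_DATA_IN_CODE with linkedit_data_command, which carries only a file offset and a size after the usual header. That matches the 16-byte commands observed earlier in this thread. A rough sketch of decoding one with the struct module follows; the function name and sample values are invented for illustration, little-endian data is assumed, and this is not the code from the changeset:

```python
import struct
from collections import namedtuple

LC_DATA_IN_CODE = 0x29

# Fields of linkedit_data_command: four unsigned 32-bit integers.
LinkeditDataCommand = namedtuple(
    "LinkeditDataCommand", "cmd cmdsize dataoff datasize"
)
LINKEDIT_DATA_COMMAND = struct.Struct("<IIII")


def parse_linkedit_data_command(raw):
    """Decode one 16-byte linkedit_data_command from raw bytes."""
    return LinkeditDataCommand(*LINKEDIT_DATA_COMMAND.unpack(raw))


# Fabricated example: an empty data-in-code table located at offset 0x2000.
raw = struct.pack("<IIII", LC_DATA_IN_CODE, 16, 0x2000, 0)
command = parse_linkedit_data_command(raw)
print(command)
```

With this layout the whole command is consumed in exactly cmdsize bytes, so nothing is left over and no negative read size arises.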
Original comment by Andrew Barnert (Bitbucket: abarnert, GitHub: abarnert).
Ah, I see what was going on now. That's what I get for not reading enough of the code...
Anyway, thanks for the quick fix.