indygreg/PyOxidizer

Panic signing a particular executable file

sfackler opened this issue · 8 comments

It seems like this particular binary causes apple-codesign to break for some reason:

signing cross-x86_64-centos-7-0.9.0-macos-aarch64/libexec/gcc/x86_64-linux-gnu/11.2.0/cc1plus in place
signing cross-x86_64-centos-7-0.9.0-macos-aarch64/libexec/gcc/x86_64-linux-gnu/11.2.0/cc1plus as a Mach-O binary
inferring default signing settings from Mach-O binary
preserving existing binary identifier in Mach-O
preserving code signature flags in existing Mach-O signature
setting binary identifier to cc1plus
parsing Mach-O
signing Mach-O binary at index 0
thread 'main' panicked at 'assertion failed: segment.fileoff == 0 || segment.fileoff == cursor.position()', /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/apple-codesign-0.16.0/src/macho_signing.rs:166:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Here's the offending binary in a tarball: cc1plus.tar.gz

Thanks for the reproduce case.

The assertion is triggered when writing the __LINKEDIT segment, which is the final segment in the binary. Its original file offsets are 0x1a88000-0x1c21c44. However, when the writer gets to the segment, it is only at offset 0x1a84000. This discrepancy of exactly 16,384 bytes triggers the assertion.

Using rcodesign extract --data macho-segments cc1plus:

segments count: 5
segment #0; __PAGEZERO; offsets=0x0-0x0; vm/file size 4294967296/0; section count 0
segment #1; __TEXT; offsets=0x0-0x18d4000; vm/file size 26034176/26034176; section count 8
segment #1; section #0: __text; segment offsets=0xae0-0x13122c0 size 19994592
segment #1; section #1: __stubs; segment offsets=0x13122c0-0x1312bcc size 2316
segment #1; section #2: __stub_helper; segment offsets=0x1312bcc-0x131340c size 2112
segment #1; section #3: __cstring; segment offsets=0x131340c-0x148fe8f size 1559171
segment #1; section #4: __const; segment offsets=0x148fe90-0x18bbaa5 size 4373525
segment #1; section #5: __ustring; segment offsets=0x18bbaa6-0x18bbaf4 size 78
segment #1; section #6: __unwind_info; segment offsets=0x18bbaf4-0x18d3e64 size 99184
segment #1; section #7: __eh_frame; segment offsets=0x18d3e68-0x18d3fec size 388
segment #2; __DATA_CONST; offsets=0x18d4000-0x1a74000; vm/file size 1703936/1703936; section count 3
segment #2; section #0: __got; segment offsets=0x18d4000-0x18d58d0 size 6352
segment #2; section #1: __mod_init_func; segment offsets=0x18d58d0-0x18d59e0 size 272
segment #2; section #2: __const; segment offsets=0x18d59e0-0x1a72fa8 size 1693128
segment #3; __DATA; offsets=0x1a74000-0x1a84000; vm/file size 1075249152/65536; section count 4
segment #3; section #0: __la_symbol_ptr; segment offsets=0x1a74000-0x1a74608 size 1544
segment #3; section #1: __data; segment offsets=0x1a74608-0x1a81390 size 52616
segment #3; section #2: __bss; segment offsets=0x0-0x40048658 size 1074038360
segment #3; section #3: __common; segment offsets=0x0-0x115198 size 1135000
segment #4; __LINKEDIT; offsets=0x1a88000-0x1c21c44; vm/file size 1678404/1678404; section count 0

We see that the __DATA segment ends at 0x1a84000 and the __LINKEDIT segment begins at 0x1a88000. So whatever wrote this file (the load commands say ld from macOS SDK 11.3 - so Apple's build of llvm lld I think) left a 16kb gap between segments.

All the Mach-O binaries that I've seen so far don't leave a gap between segments in the file! Yet here we are with a Mach-O binary that has 16kb of 0s between the segments. The zero padding of segment / section data is using done within the declared segment boundaries, not by leaving a gap between the in-file segment offsets. Why the linker did this, I'm not sure. Probably a linker script or some linker argument forcing the padding.

Anyway, leaving a gap between segments is valid. So we should catch this and write out the padding bytes to preserve the original file.

Did you want to have a go at the fix? If not, I'll code it up.

Actually, I'll just code up a patch.

@sfackler how does commit 37a6da5 look to you?

The code signing process appears to succeed:

signing cc1plus in place
signing cc1plus as a Mach-O binary
inferring default signing settings from Mach-O binary
preserving existing binary identifier in Mach-O
preserving code signature flags in existing Mach-O signature
setting binary identifier to cc1plus
parsing Mach-O
signing Mach-O binary at index 0
padding 16384 bytes before segment __LINKEDIT
binary targets macOS >= 10.16.0 with SDK 11.3.0
adding code signature flags from signing settings: ADHOC | LINKER_SIGNED
creating ad-hoc signature
removing linker signed flag from code signature (we're not a linker)
code directory version: 132096
total signature size: 228244 bytes
padding 16384 bytes before segment __LINKEDIT
writing Mach-O to cc1plus

But the binary still gets sigkilled when I run it. Relevant logs from Console.app:

CODE SIGNING: cs_invalid_page(0x1460e8000): p=23503[cc1plus] final status 0x23000200, denying page sending SIGKILL
CODE SIGNING: process 23503[cc1plus]: rejecting invalid page at address 0x1460e8000 from offset 0x1aa8000 in file "/Users/sfackler/cc1plus" (cs_mtime:1659445790.0 == mtime:1659445790.0) (signed:1 validated:1 tainted:1 nx:0 wpmapped:0 dirty:0 depth:0)

Signing with Apple's codesign does work (via codesign -s - cc1plus) so I think the padding logic may not be quite what the verification logic is expecting.

If it's relevant, the binary was created by Clang 11 on Linux via osxcross.

That's what I get for authoring the change on Linux and not testing :)

The code signature contains content digests over chunks/windows of the Mach-O. These chunks stop prematurely at segment boundaries (presumably so the loader can easily map / verify entire segments).

The bug here is likely that our digesting code isn't digesting the empty space between the segments. We'll need to teach it to do so. (Although I need to verify this by comparing signatures with Apple's tooling.)

Looks like my theory about not including space between segments in the digests is correct. That's the good news.

The bad news is the API around computing the segment windows to digest operates against a goblin::MachO instance and this type doesn't allow direct access to the underlying Mach-O data to get the actual data between the segments. (You only get a reference to data within segments.) So it will be a bit of a yak shave to refactor the code to thread the original file data through.

Maybe this is the opportunity I was looking for to migrate to the object crate...

Something else wonky is going on here.

There's a field in the embedded code signature that effectively states the total number of bytes being encapsulated by paged content digests. My belief is that this is the number of on-file bytes between the beginning of the file and the code signature. Essentially, the file-level offset of the code signature. Every Mach-O I've seen so far is this way.

However, the binary you provided is 29,498,436 bytes but this offset reported in the code signature is 36,327,264, past the end of the file.

Moreover, your binary is reporting 8,869 4096 byte digests being recorded. Multiply those numbers together and you get 36,327,424, which is near the reported size of the signed code size.

If I run codesign -v cc1plus it reports cc1plus: invalid signature (code or signature have been modified). So it looks like the file uploaded to this issue (sha256:757c942b6989e6dbd9f5ddfcfef856ecf91f43036d57c018cb2ad899c9224f8a) is incomplete somehow. It kinda feels like maybe an adhoc signature was created by the linker and the file was subsequently stripped/rewritten? Whatever did the rewriting was smart enough to update the Mach-O load commands to reflect the new offset of the __LINKEDIT segment but not smart enough to update the code signature within __LINKEDIT. And I further postulate this tool was the one inserting the space between segments, as I've never seen lld do it. (But I'm pretty sure you can coerce it to do it.)

Anyway, I think I'm close to a fix here. I just wanted to report the wonkiness with the sample binary.

Interesting - I can see a couple of possibilities there:

  1. The uploaded binary is from after the failed rcodesign run. I'm not sure when the assert fired in comparison to any in-place edits but it could in theory have gotten partway through modifying the binary.
  2. While the macOS LLD will automatically codesign binaries, the LLD from "normal clang" that osxcross uses will not. I wonder if some but not all of the signing code is still running to produce the weird state?
  3. This binary comes from building a GCC using Clang - the build process could be using some weird/obscure flags? Not sure why it'd only affect cc1plus and not gcc, g++ etc, though.