patchelf breaks dylibs from recent Firefox Nightly builds
K900 opened this issue · 11 comments
Describe the bug
After using `patchelf --set-rpath` on a library from a recent Firefox Nightly build, the library can no longer be loaded because it segfaults the linker.
Steps To Reproduce
- Download and unpack https://download-installer.cdn.mozilla.net/pub/firefox/nightly/latest-mozilla-central/firefox-119.0a1.en-US.linux-x86_64.tar.bz2
- Attempt to dlopen any of the .so files (`libmozsqlite3.so` was my test target), e.g. with `python -c 'import ctypes; ctypes.cdll.LoadLibrary("./libmozsqlite3.so")'`
- Observe success
- `patchelf --set-rpath "test" ./libmozsqlite3.so`
- Attempt `dlopen` again
- Segfault
Expected behavior
No segfault.
`patchelf --version` output
Attempted both default nixpkgs 0.15.0 and current nixpkgs `patchelfUnstable` (c401289).
Additional context
This seems to have been caused by upstream enabling some kind of advanced linker wizardry called "relrhack": https://hg.mozilla.org/mozilla-central/rev/032b87ff55061bcbdc7a85d9e18fde814797073a
The last build before that commit works fine.
The problem is that the source code uses the _DYNAMIC symbol, which translates to the binary code accessing the .dynamic section at a fixed address. But patchelf moves it, and puts something else where it used to be, so the code reads garbage.
Well this is fun. So I guess we need a custom fixup for this...
Smaller (independent) reproducer:
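#include <stdio.h>
#include <elf.h>

/* Defined by the linker: start of this object's .dynamic section. */
extern Elf64_Dyn _DYNAMIC[];

int main() {
    /* Walk the dynamic table until the DT_NULL terminator. */
    for (Elf64_Dyn* dyn = _DYNAMIC; dyn->d_tag != DT_NULL; dyn++) {
        /* Casts make the arguments match the %lx and %p conversions. */
        printf("%lx %p\n", (unsigned long)dyn->d_tag, (void*)dyn->d_un.d_ptr);
    }
    return 0;
}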
- compile with `gcc -o test test.c`
- run `./test`
- `patchelf --set-rpath foo test`
- run again. It will show garbage.
Yeah sounds like we just have to special case that symbol. Not that it's not already special cased by the linker...
The symbol is not used in Firefox's case. It uses the address directly.
Actually, even in the small reproducer, the symbol is not used at runtime.
So I guess we have two issues here - we still need to handle _DYNAMIC correctly AND we need to figure out what to do about Firefox...
Actually, removing https://github.com/NixOS/patchelf/blob/master/src/patchelf.cc#L674 makes it work, because patchelf doesn't actually put another section where .dynamic used to be. It only overwrites its content with garbage.
Oh, I actually thought we moved the sections properly and was going to try this as a workaround tomorrow.
What is the original reasoning behind overwriting the old sections with `Z`s? Just to reduce confusion?
The no-clobber workaround seems less than ideal, since it means that code referencing `_DYNAMIC` is using an old copy of the dynamic table, which is likely to be different from the dynamic table in the new `PT_DYNAMIC` segment.
The relrhack is in https://github.com/mozilla/gecko-dev/blob/58c532751054863dbb9d277051d63e1e7e77929e/build/unix/elfhack/inject.c#L184. This could be changed to use `__ehdr_start` and `e_phoff` to find the `PT_DYNAMIC` program header (the same function already does this to find `PT_GNU_RELRO`).
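
For illustration, a rough sketch of that suggestion as a standalone C function. This is not the code in inject.c: `find_pt_dynamic` is a hypothetical name, and the sketch assumes a 64-bit ELF target, a linker that defines `__ehdr_start` (GNU ld and lld do), and program headers that are mapped at `&__ehdr_start + e_phoff`. That last assumption has not been checked against every layout patchelf can produce.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Linker-defined symbol: the ELF header of the module containing this code.
 * Hidden visibility keeps the reference local to this module. */
extern const Elf64_Ehdr __ehdr_start __attribute__((visibility("hidden")));

/* Hypothetical helper: locate the dynamic table through the program
 * headers instead of through the link-time address of _DYNAMIC. */
static const Elf64_Dyn *find_pt_dynamic(void) {
    const char *base = (const char *)&__ehdr_start;
    /* Assumption: the program headers are mapped at base + e_phoff. */
    const Elf64_Phdr *phdr = (const Elf64_Phdr *)(base + __ehdr_start.e_phoff);

    /* Load bias: runtime address of the ELF header minus the p_vaddr of
     * the PT_LOAD segment that maps file offset 0. This is the base
     * address for PIE and shared objects, and 0 for fixed-address
     * executables. */
    uintptr_t bias = 0;
    int have_bias = 0;
    for (unsigned i = 0; i < __ehdr_start.e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && phdr[i].p_offset == 0) {
            bias = (uintptr_t)base - (uintptr_t)phdr[i].p_vaddr;
            have_bias = 1;
            break;
        }
    }
    if (!have_bias)
        return NULL;

    /* PT_DYNAMIC describes where the dynamic table currently lives. */
    for (unsigned i = 0; i < __ehdr_start.e_phnum; i++) {
        if (phdr[i].p_type == PT_DYNAMIC)
            return (const Elf64_Dyn *)(bias + (uintptr_t)phdr[i].p_vaddr);
    }
    return NULL; /* e.g. a statically linked, non-PIE executable */
}
```

The loop in the small reproducer above could then start from `find_pt_dynamic()` rather than `_DYNAMIC`, so that it follows whatever `PT_DYNAMIC` points at after patching.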