Don't leak non-exported symbols from staticlibs
bjorn3 opened this issue · 33 comments
When compiling a cdylib only #[no_mangle] symbols are exported. #[rustc_std_internal_symbol] and mangled symbols are not exported. This prevents symbol conflicts and avoids overriding symbols in ways that causes UB when using a rust cdylib in a rust program. For staticlibs however all symbols leak out of the staticlib. Causing symbol overrides that are potentially UB and symbols conflicts. For example when statically linking spidermonkey, you will get symbol conflicts if the rust version you compile your program with doesn't match the one the rust parts of spidermonkey were compiled with.
cc rust-lang/wg-allocators#108 (comment)
cc https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/.E2.9C.94.20spidermonkey-wasm-rs/near/309604553
cc @chorman0773
FTR, I'm unsure there is a way to limit exported symbols from a staticlib, without both limiting the number of CGUs within the crate to 1 and preventing any upstream crates from using the #[global_allocator]. There is further limited ways to prevent upstream crate symbols from being exported. My Original Post was made with that in mind.
Maybe partial linking would work? If not we will at least need to version all #[rustc_std_internal_symbol]s for each rustc version to prevent UB if different rustc versions implement them in backwards incompatible ways. And maybe we can edit every object file to rename symbols to be unique? By the way note that for cdylib and staticlib all rust crate dependencies are included directly in the cdylib/staticlib. No dynamic linking supported.
Maybe partial linking would work?
IDK whether all linkers support partial linking (in particular, I don't know about microsoft's link.exe). Though, personally, I don't particularily want to touch ld -r with a 10' pole anyways if I don't have to. The versioning would work, if done within the mangling (for mangled symbols).
By the way note that for cdylib and staticlib all rust crate dependencies are included directly in the cdylib/staticlib. No dynamic linking supported.
I plan to do it slightly differently when possible - for staticlibs resolve the dependencies and produce a links.o "object" that is just a linker script. For cdylibs, just link as normal and when dynamically linking, add DT_RPATH (to the stdlib directory)+DT_NEEDED as needed.
for staticlibs resolve the dependencies and produce a links.o "object" that is just a linker script.
That is a backward incompatible change for rustc
For cdylibs, just link as normal and when dynamically linking, add DT_RPATH (to the stdlib directory)+DT_NEEDED as needed.
That will share and thus leak the global state of libstd across the cdylib boundary. This among other things will break the mitigation of #102721 to prevent catching foreign rust panics.
rustc currently just emits the library directly into the archive, right?
I'm not particularly sure what the difference here is, except in terms of
file size, and the fact that using libstd.so.0.1 is an option in addition
to libstd.rlib.0.1.
The difference is that currently you can ship a staticlib as a standalone file and expect linking to succeed, but with your proposal you also need all dependencies to be available at exactly the same place.
That is true, though I'm unsure how it is avoidable - even when statically
linking the symbols from libstd et. al need to have default visibility so
that when linking into a dylib, the symbols can be used via the dylib. It
would not make a difference on ELF.
When linking a cdylib, libstd is statically linked and none of it's symbols are exported from the cdylib. When linking a rust dylib, sharing state is just fine. In fact you aren't allowed to duplicate crates in that case.
The difference is that currently you can ship a staticlib as a standalone file and expect linking to succeed, but with your proposal you also need all dependencies to be available at exactly the same place.
Fair enough, though the result is potentially shipping GiB for an API surface that should take MiB.
When linking a cdylib, libstd is statically linked and none of it's symbols are exported from the cdylib. When linking a rust dylib, sharing state is just fine. In fact you aren't allowed to duplicate crates in that case.
On ELF I'm unsure how this would be achieved. Symbol visibilty is controlled when the symbol is defined (in the object file), and I just send everything on to ld (post processing? What is that? I only know "add head and tail libraries to link line"). ELF shared objects don't have an "Export List", the dynamic symbol list is just built from the static symbol list usually excluding internal and hidden symbols. Every GLOBAL or WEAK symbol in the list with PROTECTED or DEFAULT visibility can be imported from the cdylib and every symbol with DEFAULT visibilty can additonally be replaced. These are functions of the link editor producing the files and the dynamic linker-loader resolving runtime relocations, and are far from under the control of rust as a language or any particular implementation.
Rustc passes a version script to the linker specifying exactly which symbols to export and making it to hide everything else. This is but one of the reasons rustc is in change of invoking the linker.
Rustc passes a version script to the linker specifying exactly which symbols to export and making it to hide everything else. This is but one of the reasons rustc is in change of invoking the linker.
Ah. My problem is that I have to support link editors that are like "Version Script? What is this? Expected PHDRS, MEMORY, or SECTIONS"
Linker scripts and versions scripts are different. Linker scripts tell what should be put where in the linked artifact. Version scripts only list which symbols are exported and which aren't amd optionally provide a version for the purpose of symbol versioning. The format of version scripts is trivial in comparison to linker scripts. See
rust/compiler/rustc_codegen_ssa/src/back/linker.rs
Lines 666 to 724 in b7463e8
I am aware of version scripts. The simplicity is not the problem. The problem is if I'm faced with a link editor that doesn't support them, which I cannot always assume.
Which linker doesn't support it? AFAIK every platform targeted by rustc has a linker supporting them or some other way to hide symbols.
Hmm... I'm actually not sure. Some quick research on autoconf says only GNU ld + solaris LD (and lld supports it, as will lcld). I guess older platforms may have none of the above, but IDK how old you have to get. I'm sure given enough time I could find a counterexample, but I don't want to look rn.
Wrt. staticlibs, with the versioned symbols would it be permissible to have same/compatible versions of a rust compiler share things like the global_allocator between compiled staticlibs and final link targets (rather than rejecting with multidef errors)?
Won't that risk mixing alloc for one allocator with free for another allocator?
Symbol resolution may choose the __rust_alloc symbol from one allocator shim and the __rust_dealloc symbol from another allocator shim, right? Even if current linkers are likely to choose both from the same allocator shim, there is no guarantee that this will always be the case AFAIK.
I just checked if COMDAT is supported for Mach-O. It isn't: https://godbolt.org/z/f5o9Pj6rY
LLVM ERROR: MachO doesn't support COMDATs, 'foo' cannot be lowered.
I looked a while ago and they should be?
It should be possible the way C++ handles replacing operator new. I can't check how it works though on CE, though, and I Don't have a mac.
I couldn't find any references to COMDAT + Mach-O other than that error in the LLVM source code. For operator new is it possible that a weak symbol was used instead? Or does C++ not allow that?
Maybe, though vtables would also be found in COMDATs.
Mach-O apparently calls it a COALESCED section (https://opensource.apple.com/source/xnu/xnu-4903.221.2/EXTERNAL_HEADERS/mach-o/loader.h.auto.html Ctrl+F for S_COALESCED, https://github.com/aidansteele/osx-abi-macho-file-format-reference#fields-6 under flags, https://github.com/aidansteele/osx-abi-macho-file-format-reference#discussion-1).
Mach-O only supports 252 sections (1 through 253, 0 is for the current image, 254 is for undefined symbols, 255 for the executable image).
https://github.com/aidansteele/osx-abi-macho-file-format-reference
If this file is a two-level namespace image (that is, if the MH_TWOLEVEL flag of the mach_header structure is set), the high 8 bits of n_desc specify the number of the library in which this undefined symbol is defined. Use the macro GET_LIBRARY_ORDINAL to obtain this value and the macro SET_LIBRARY_ORDINAL to set it. Zero specifies the current image. 1 through 253 specify the library number according to the order of LC_LOAD_DYLIB commands in the file. The value 254 is used for undefined symbols that are to be dynamically looked up (supported only in OS X v10.3 and later). For plug–ins that load symbols from the executable program they are linked against, 255 specifies the executable image. For flat namespace images, the high 8 bits must be 0.
Lld seems to ignore S_COALESCED other than in a check if the section is a code section. If S_COALESCED behaved like COMDAT lld ignoring it would be incorrect. What I think is the case is that all weak symbols are put together in a single S_COALESCED section and then the linker coalesces every function individually rather than in a group like with COMDAT. I can't test my theory. If S_COALESCED is treated as COMDAT though, I still don't think it is very realistic to convince LLVM to support it as it would mean you can't have much more than 200 COMDAT groups in a single object file.
Mach-O only supports 252 sections (1 through 253, 0 is for the current image, 254 is for undefined symbols, 255 for the executable image).
Isn't this per-segment?
What I think is the case is that all weak symbols are put together in a single S_COALESCED section and then the linker coalesces every function individually rather than in a group like with COMDAT. I can't test my theory
In that case, that would mean that the definition of C++ vtables w/o a key function (All virtual functions are inline or inherited) would be invalid, at least under the Itanium C++ ABI which AFAIK apple clang (and clang on darwin) follows (RTTI and VTable definitions provided by the same TU).
In any case, coalesced sections with weak definitions seem to apply here: my proposal would be that the global_allocator (and other key symbols) defined by the binary/FLT are canonical, and the one in the staticlib goes goodbye.
In the alternative, it could be limited to having only one #[global_allocator] between them, and having only the default be overridden. This is the behaviour of lccc's current standard library, which makes only the stdlib definition of std::alloc::__global_allocator weak, and making the one provided by #[global_allocator] a normal definition.
Mach-O only supports 252 sections (1 through 253, 0 is for the current image, 254 is for undefined symbols, 255 for the executable image).
Isn't this per-segment?
I thought Mach-O object files (before linking) only allowed a single segment.
Given that text and data are separate segments pre-link, I'd doubt that.
https://github.com/aidansteele/osx-abi-macho-file-format-reference
The MH_OBJECT file type is the format used for intermediate object files. It is a very compact format containing all its sections in one segment. The compiler and assembler usually create one MH_OBJECT file for each source code file. By convention, the file name extension for this format is .o.
Yeah, I just saw that.
Side note, I would like to do the same thing with cdylib, though I can settle on just panic stuff, since my abi already requires me to support crossing any abi boundery that allows unwinding and treat it as native, even across abi versions - but that should have zero problem because of how panic unwinding is handled.
TBH, other than overriden global_allocator, I shouldn't have any problems with just treating cdylib like dylib entirely, stdlib stuff uses abi_tag in mangling except for fixed stuff which is superstable (if I need to change its behaviour in a new version, it gets a new name), and DT_SONAME is a powerful tool, at least on ELF (and I have other solutions for dll targets, and IDK Mach-O solution yet, but I assume it has some equivalent).
To plug the leak for 1.58.0, we're trying an ld -r based solution for Ruby's YJIT. It seems to do everything we want on Linux with the GNU linker. For macOS, hiding rust_eh_personality seems to cause an undefined symbol error even though the object file does define it. LLD links fine, though. We hide everything except rust_eh_personality with ld64. We don't support Windows at the moment.
I wrote a post which has more details in case anyone cares.
Great writeup @XrXr!
For macOS, hiding rust_eh_personality seems cause an undefined symbol error even though the object file does define it.
Interesting
We don't support Windows at the moment.
That makes solving this issue a lot easier 😆
By the way rustc uses version scripts to select which symbols should be exported when linking rather than objcopy --keep-global-symbol. I don't know if this works for partial linking and if it supports wildcards, but if it does, this should save some linking time and may allow the linker to do a better job at omitting unused functions if --gc-sections is used.