Regression: build failure on i686-pc-windows-gnu due to multiple definition of `_alloca`
jeremyd2019 opened this issue ยท 15 comments
The changes in #575 seem to have broken MSYS2's build of i686-pc-windows-gnu:
= note: D:/M/msys64/mingw32/bin/../lib/gcc/i686-w64-mingw32/13.2.0/../../../../i686-w64-mingw32/bin/ld.exe: D:/M/msys64/mingw32/bin/../lib/gcc/i686-w64-mingw32/13.2.0/libgcc.a(_chkstk.o):(.text+0x0): multiple definition of `_alloca'; C:\_\B\src\MINGW32\build\i686-pc-windows-gnu\stage0-sysroot\lib\rustlib\i686-pc-windows-gnu\lib/libstd-394a669d59370585.dll.a(std_394a669d59370585_dll_d001695.o):(.text+0x0): first defined here
Specifically, the latest update renaming __alloca to _alloca seems to have caused this error. The version of the changes from commit 67c1c0a (before the last force-push on that PR) work in both i686-pc-windows-gnu and i686-pc-windows-gnullvm.
I'm opening this as a new issue in hopes that that increases visibility. Hopefully somebody else can also verify that this breaks the 'offical' build of i686-pc-windows-gnu as distributed by rust-lang.org.
Originally posted by @jeremyd2019 in #575 (comment)
Failed also on Rust's CI: rust-lang/rust#123426
= note: C:/a/rust/rust/mingw32/bin/../lib/gcc/i686-w64-mingw32/12.2.0/../../../../i686-w64-mingw32/bin/ld.exe: C:/a/rust/rust/mingw32/bin/../lib/gcc/i686-w64-mingw32/12.2.0/libgcc.a(_chkstk.o):C:/buildroot/src/gcc-12.2.0/libgcc/config/i386/cygwin.S:81: multiple definition of `_alloca'; C:\a\rust\rust\build\i686-pc-windows-gnu\stage0-sysroot\lib\rustlib\i686-pc-windows-gnu\lib/libstd-3e85246dfcced244.dll.a(std_3e85246dfcced244_dll_d001834.o):(.text+0x0): first defined here
"good", in that it's readily reproducible.
Given that the previous __alloca symbol was basically not used and weak linking does not help, I think I'm going to open a PR to guard the _alloca symbol only for target_abi = "llvm". Does anyone know a better solution?
Alternative (untested) solutions:
- Add
_allocatoLIB1ASMFUNCS
Disadvantage: requires a new/patched GCC. - Remove
-lgccfromlate_link_args
Disadvantage: issue might reoccur if someone statically links a Rust library to another library that was linked tolibgcc.a.
I'll try invoking the expert: @mstorsjo, any thoughts?
I am still not entirely clear on the use case for this (at least on windows-gnu*). It seems libgcc is always linked in for -gnu, which obviously contains _alloca. compiler-rt is linked in for windows-gnullvm by virtue of using clang as the linker, right? Or if this crate is the only source of these symbols there, how did it ever work with the mis-named __alloca?
- Remove
-lgccfromlate_link_args
Disadvantage: issue might reoccur if someone statically links a Rust library to another library that was linked tolibgcc.a.
GCC always adds -lgcc, so wouldn't that mean linking static Rust libraries to anything with GCC would break?
guard the
_allocasymbol only fortarget_abi = "llvm"
Does this work now? I know we have issues with checking target_abi in stable rust, but that was in Cargo.toml so may be a different situation. I believe mati865 said that target_abi is supposed to be stabilized in 1.78?
I think I previously said that this wouldn't work for us (in msys2) because we are still patching rust to turn i686-pc-windows-gnu into something that acts like -gnullvm but is still called -gnu. Thinking about this more, now I think this would be OK, it would exclude _alloca but we know everything works without it because there was no _alloca provided by this crate previously (only __alloca).
I'll try invoking the expert: @mstorsjo, any thoughts?
Sorry, I don't quite know the whole picture of what's going on here, but I'll try to add some comments/observations.
I am still not entirely clear on the use case for this (at least on windows-gnu*). It seems libgcc is always linked in for -gnu, which obviously contains
_alloca. compiler-rt is linked in for windows-gnullvm by virtue of using clang as the linker, right?
In the C world, when linking with GCC, it always links in libgcc. When linking with clang (in mingw mode), it either links in libgcc or compiler-rt builtins. So the object files providing this symbol should always be included. If linked with -nodefaultlibs or -nostdlib or similar, it is omitted though.
Having both may or may not end up in a conflict. If the user provided object (where I guess the rust files are?) files contains _alloca, this should satisfy most dependencies for that symbol (I think, at least based on how lld resolves symbols), so nothing would pull in that symbol from libgcc/compiler-rt at all, and the conflict might be averted. If the object file from libgcc/compiler-rt gets pulled in for another reason (where the reason can be linking with --whole-archive or similar, or that this object file provides some other symbol, that is needed), then the conflict appears again.
I think it was determined that the object file in libgcc where _alloca is defined is being pulled in by virtue of defining another needed symbol. At least that's the assumption I've been under.
$ nm -B _chkstk.o
00000000 b .bss
00000000 d .data
00000000 t .text
00000000 T ___chkstk
00000000 T __alloca
Also, another possible complicating factor is that in rust's case, this crate seems to end up provided by a shared library (DLL), so we're actually dealing with symbols coming from an import library (libstd-394a669d59370585.dll.a) vs those coming from a static library (libgcc.a)
What if a __chkstk function was brought back, that is an alias to (or jmps to) _alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.
For fun, trying to see where the ___chkstk reference is coming from:
/d/M/mingw-w64-rust/src/MINGW32/build/i686-pc-windows-gnu/stage0-rustc/i686-pc-windows-gnu/release/deps
$ for f in *.rlib; do nm -B $f | grep ___chkstk && echo $f; done
00000000 r .rdata$.refptr.___chkstk
00000000 r .rdata$.refptr.___chkstk_ms
00000000 R .refptr.___chkstk
00000000 R .refptr.___chkstk_ms
U ___chkstk
U ___chkstk_ms
librustc_llvm-49f9d8b394cf6fe3.rlib
$ mkdir foo
$ cd foo
$ ar xo ../librustc_llvm-49f9d8b394cf6fe3.rlib
$ for f in *.obj; do nm -B $f | grep ___chkstk && echo $f; done
00000000 r .rdata$.refptr.___chkstk
00000000 r .rdata$.refptr.___chkstk_ms
00000000 R .refptr.___chkstk
00000000 R .refptr.___chkstk_ms
U ___chkstk
U ___chkstk_ms
DynamicLibrary.cpp.obj
This seems to be coming from the static libLLVMSupport.a
What if a
__chkstkfunction was brought back, that is an alias to (or jmps to)_alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.
Indeed, I added the following to src/x86.rs (at the start of the intrinsics! { block), and it is going further. I will see if the full rust toolchain build succeeds with this, and if so if it also works on gnullvm.
#[naked]
#[cfg(all(
windows,
target_env = "gnu",
not(feature = "no-asm")
))]
pub unsafe extern "C" fn __chkstk() {
core::arch::asm!(
"jmp __alloca", // Jump to __alloca since fallthrough may be unreliable"
options(noreturn, att_syntax)
);
}Yes both i686-pc-windows-gnu and gnullvm build all the way through with that addition.
I've tracked down where the reference to ___chkstk is coming from... https://github.com/llvm/llvm-project/blob/ccc02563f4d620d4d29a1cbd2c463871cc54745b/llvm/lib/Support/DynamicLibrary.cpp#L153-L157 calls https://github.com/llvm/llvm-project/blob/ccc02563f4d620d4d29a1cbd2c463871cc54745b/llvm/lib/Support/Windows/DynamicLibrary.inc#L174-L200 which references __chkstk due to https://github.com/llvm/llvm-project/blob/ccc02563f4d620d4d29a1cbd2c463871cc54745b/llvm/lib/Support/Windows/explicit_symbols.inc#L16-L18
What if a
__chkstkfunction was brought back, that is an alias to (or jmps to)_alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.
This does sound like a reasonable solution, and it sounds like it works well in practice.
Overall, having a toolchain library object file supply more than one symbol, has been a cause for such conflicts many times. See mstorsjo/llvm-mingw#397 for a very similar issue recently, where we had two __chkstk style symbols in compiler-rt, wine provided one of them themselves, but needed to link in the other one. We fixed that by removing the extra symbol from compiler-rt in llvm/llvm-project@885d7b759b5c166c07c07f. If weโd really need it, we should provide it in a separate object file, so the linker can take one but skip the other.
edit: Thatโs not the commit I was thinking about. I actually meant llvm/llvm-project@248aeac.
What if a
__chkstkfunction was brought back, that is an alias to (or jmps to)_alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.
This also seems like a reasonable solution to me. Feel free to open a PR for that. ๐