rust-lang/compiler-builtins

Regression: build failure on i686-pc-windows-gnu due to multiple definition of `_alloca`

jeremyd2019 opened this issue ยท 15 comments

The changes in #575 seem to have broken MSYS2's build of i686-pc-windows-gnu:

    = note: D:/M/msys64/mingw32/bin/../lib/gcc/i686-w64-mingw32/13.2.0/../../../../i686-w64-mingw32/bin/ld.exe: D:/M/msys64/mingw32/bin/../lib/gcc/i686-w64-mingw32/13.2.0/libgcc.a(_chkstk.o):(.text+0x0): multiple definition of `_alloca'; C:\_\B\src\MINGW32\build\i686-pc-windows-gnu\stage0-sysroot\lib\rustlib\i686-pc-windows-gnu\lib/libstd-394a669d59370585.dll.a(std_394a669d59370585_dll_d001695.o):(.text+0x0): first defined here

Specifically, the latest update renaming __alloca to _alloca seems to have caused this error. The version of the changes from commit 67c1c0a (before the last force-push on that PR) work in both i686-pc-windows-gnu and i686-pc-windows-gnullvm.

I'm opening this as a new issue in hopes that that increases visibility. Hopefully somebody else can also verify that this breaks the 'offical' build of i686-pc-windows-gnu as distributed by rust-lang.org.

Originally posted by @jeremyd2019 in #575 (comment)

Failed also on Rust's CI: rust-lang/rust#123426

    = note: C:/a/rust/rust/mingw32/bin/../lib/gcc/i686-w64-mingw32/12.2.0/../../../../i686-w64-mingw32/bin/ld.exe: C:/a/rust/rust/mingw32/bin/../lib/gcc/i686-w64-mingw32/12.2.0/libgcc.a(_chkstk.o):C:/buildroot/src/gcc-12.2.0/libgcc/config/i386/cygwin.S:81: multiple definition of `_alloca'; C:\a\rust\rust\build\i686-pc-windows-gnu\stage0-sysroot\lib\rustlib\i686-pc-windows-gnu\lib/libstd-3e85246dfcced244.dll.a(std_3e85246dfcced244_dll_d001834.o):(.text+0x0): first defined here

"good", in that it's readily reproducible.

Given that the previous __alloca symbol was basically not used and weak linking does not help, I think I'm going to open a PR to guard the _alloca symbol only for target_abi = "llvm". Does anyone know a better solution?

Alternative (untested) solutions:

I'll try invoking the expert: @mstorsjo, any thoughts?

I am still not entirely clear on the use case for this (at least on windows-gnu*). It seems libgcc is always linked in for -gnu, which obviously contains _alloca. compiler-rt is linked in for windows-gnullvm by virtue of using clang as the linker, right? Or if this crate is the only source of these symbols there, how did it ever work with the mis-named __alloca?

GCC always adds -lgcc, so wouldn't that mean linking static Rust libraries to anything with GCC would break?

guard the _alloca symbol only for target_abi = "llvm"

Does this work now? I know we have issues with checking target_abi in stable rust, but that was in Cargo.toml so may be a different situation. I believe mati865 said that target_abi is supposed to be stabilized in 1.78?

I think I previously said that this wouldn't work for us (in msys2) because we are still patching rust to turn i686-pc-windows-gnu into something that acts like -gnullvm but is still called -gnu. Thinking about this more, now I think this would be OK, it would exclude _alloca but we know everything works without it because there was no _alloca provided by this crate previously (only __alloca).

I'll try invoking the expert: @mstorsjo, any thoughts?

Sorry, I don't quite know the whole picture of what's going on here, but I'll try to add some comments/observations.

I am still not entirely clear on the use case for this (at least on windows-gnu*). It seems libgcc is always linked in for -gnu, which obviously contains _alloca. compiler-rt is linked in for windows-gnullvm by virtue of using clang as the linker, right?

In the C world, when linking with GCC, it always links in libgcc. When linking with clang (in mingw mode), it either links in libgcc or compiler-rt builtins. So the object files providing this symbol should always be included. If linked with -nodefaultlibs or -nostdlib or similar, it is omitted though.

Having both may or may not end up in a conflict. If the user provided object (where I guess the rust files are?) files contains _alloca, this should satisfy most dependencies for that symbol (I think, at least based on how lld resolves symbols), so nothing would pull in that symbol from libgcc/compiler-rt at all, and the conflict might be averted. If the object file from libgcc/compiler-rt gets pulled in for another reason (where the reason can be linking with --whole-archive or similar, or that this object file provides some other symbol, that is needed), then the conflict appears again.

I think it was determined that the object file in libgcc where _alloca is defined is being pulled in by virtue of defining another needed symbol. At least that's the assumption I've been under.

$ nm -B _chkstk.o
00000000 b .bss
00000000 d .data
00000000 t .text
00000000 T ___chkstk
00000000 T __alloca

Also, another possible complicating factor is that in rust's case, this crate seems to end up provided by a shared library (DLL), so we're actually dealing with symbols coming from an import library (libstd-394a669d59370585.dll.a) vs those coming from a static library (libgcc.a)

What if a __chkstk function was brought back, that is an alias to (or jmps to) _alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.

For fun, trying to see where the ___chkstk reference is coming from:

/d/M/mingw-w64-rust/src/MINGW32/build/i686-pc-windows-gnu/stage0-rustc/i686-pc-windows-gnu/release/deps
$ for f in *.rlib; do nm -B $f | grep ___chkstk  && echo $f; done
00000000 r .rdata$.refptr.___chkstk
00000000 r .rdata$.refptr.___chkstk_ms
00000000 R .refptr.___chkstk
00000000 R .refptr.___chkstk_ms
         U ___chkstk
         U ___chkstk_ms
librustc_llvm-49f9d8b394cf6fe3.rlib

$ mkdir foo
$ cd foo
$ ar xo ../librustc_llvm-49f9d8b394cf6fe3.rlib
$ for f in *.obj; do nm -B $f | grep ___chkstk  && echo $f; done
00000000 r .rdata$.refptr.___chkstk
00000000 r .rdata$.refptr.___chkstk_ms
00000000 R .refptr.___chkstk
00000000 R .refptr.___chkstk_ms
         U ___chkstk
         U ___chkstk_ms
DynamicLibrary.cpp.obj

This seems to be coming from the static libLLVMSupport.a

What if a __chkstk function was brought back, that is an alias to (or jmps to) _alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.

Indeed, I added the following to src/x86.rs (at the start of the intrinsics! { block), and it is going further. I will see if the full rust toolchain build succeeds with this, and if so if it also works on gnullvm.

    #[naked]
    #[cfg(all(
        windows,
        target_env = "gnu",
        not(feature = "no-asm")
    ))]
    pub unsafe extern "C" fn __chkstk() {
        core::arch::asm!(
            "jmp __alloca", // Jump to __alloca since fallthrough may be unreliable"
            options(noreturn, att_syntax)
        );
    }

Yes both i686-pc-windows-gnu and gnullvm build all the way through with that addition.

What if a __chkstk function was brought back, that is an alias to (or jmps to) _alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.

This does sound like a reasonable solution, and it sounds like it works well in practice.

Overall, having a toolchain library object file supply more than one symbol, has been a cause for such conflicts many times. See mstorsjo/llvm-mingw#397 for a very similar issue recently, where we had two __chkstk style symbols in compiler-rt, wine provided one of them themselves, but needed to link in the other one. We fixed that by removing the extra symbol from compiler-rt in llvm/llvm-project@885d7b759b5c166c07c07f. If weโ€™d really need it, we should provide it in a separate object file, so the linker can take one but skip the other.

edit: Thatโ€™s not the commit I was thinking about. I actually meant llvm/llvm-project@248aeac.

What if a __chkstk function was brought back, that is an alias to (or jmps to) _alloca, to match libgcc? Seems like that would prevent the libgcc object file from being pulled in.

This also seems like a reasonable solution to me. Feel free to open a PR for that. ๐Ÿ‘