rust-lang/wg-cargo-std-aware

`compiler_builtins` seems to be missing symbols

tomaka opened this issue ยท 19 comments

This is the continuation of rust-lang/compiler-builtins#334

When compiling a pure Rust crate with -Zbuild-std=core,alloc, I get missing symbols errors for everything related to memcpying (memcpy, memset).
This is covered by the mem feature of compiler_builtins, which for a reason I don't understand isn't enabled by default.

Until mid-December can be solved by explicitly depending on compilter_builtins with the mem feature. However, one of the nightlies broke this and when doing so one now gets:

multiple rlib candidates for compiler_builtins found

I believe that the original cause is the features not enabled on compiler_builtins.

One workaround is to add a dependency on rlibc, which provides these symbols. You might need to add an extern crate rlibc; statement to ensure that it is linked.

ehuss commented

@alexcrichton can you verify โ€” the reason the mem feature is not needed for std is because those symbols are exported from libc?

That's a serious complication for building std on platforms without libc. ๐Ÿ˜ž

Ah yes, this is something we'll need to handle. Most platforms don't enable the mem feature of compiler-builtins because it's not necessary and the platform libc typically has better optimized routines than what's implemented in compiler-builtins. The build system however sometimes enables mem, notably for no_std targets, where typically there is no other platform libc.

How Cargo would know what to do here is... unknown. Although arguably if we "fix" target-specific features we could have target-specific dependencies on compiler-builtins which enable the mem feature for only a whitelist of targets, so this could theoretically be fixed with your work @ehuss.

This is relevant for the msp430 target, FWIW.

Two solutions are to bring in libc as a dependency (-Clink-arg=-lgcc) or link against rlibc as @phil-opp mentions (the libc solution being about 400 bytes savings in debug mode b/c well, libc is optimized :P).

However, when I do debug builds with -Zbuild-std=core, with using the "link to lib C" solution, I get some worrying (to me) warnings when certain conditions are met (#55).

ehuss commented

Note: there are two semi-related things that need some investigation:

  • Supporting JSON spec targets. compiler-builtins (and several other crates) do sniffing based on the target name, which obviously won't work with JSON spec targets.
  • rust-lang/cargo#8239 gives an example where compiler-builtins symbols are not found when using ld. I did some experimenting, and with the mem feature enabled, I get undefined symbols from memcpy (and friends) to various things in core. I'm not really sure why that is.

I am building for a no_std platform, where I have to supply the mem functions (all flash access must be word aligned), unfortunately with the merge of rust-lang/compiler-builtins#357 I can't find a way to opt out of the mem option, other than renaming my target not to include -none.

Any suggestions?

Are there any longer term plans to allow the build-std feature to customize some of the crates it builds?

Are there any longer term plans to allow the build-std feature to customize some of the crates it builds?

The -Zbuild-std-features unstable flag exists (see rust-lang/cargo#8490), which allows users to set features when using -Zbuild-std. However, I don't think the mem feature is wired up correctly. I'll take a look and see what I find out.

@josephlr I think it'd Just Work if std had a compiler-builtins-mem = ["alloc/compiler-builtins-mem"] feature in Cargo.toml, since alloc already exposes the feature, but it's unusable from build-std-features

@DianaNites Will that work when using -Zbuild-std=core?

@toothbrush7777777 I don't see why it would? If you're asking for just core to be built, why would it build compiler_builtins?

@DianaNites I read somewhere that core built compiler_builtins. Maybe that is wrong. Anyway, it would be nice to be able to have the mem feature of compiler_builtins enabled without building alloc.

It's also my understanding that core relies on compiler_builtins as part of the build.

Ah yeah requiring alloc, good point. Though idk if it actually does? build-std is weird, and build-std-features even weirder, they're the features of std right? but whose using build-std to build.. std?

core doesn't list any dependencies, so idk. Might it just rely on the symbols compiler_builtins provides, during linking? alloc depends on both core and compiler_builtins, bringing it in, though std depends on all 3 so..

@josephlr I think it'd Just Work if std had a compiler-builtins-mem = ["alloc/compiler-builtins-mem"] feature in Cargo.toml, since alloc already exposes the feature, but it's unusable from build-std-features

This is essentially correct. Right now, feature resolution is done though the test crate regardless of what crates you actually end up building. This means that the mem feature needs to be forwarded from alloc, to std, to test. This is similar to what other features already do.

@DianaNites I read somewhere that core built compiler_builtins. Maybe that is wrong. Anyway, it would be nice to be able to have the mem feature of compiler_builtins enabled without building alloc.

You're correct, building core will automatically build compiler_builtins, similar to how std automatically builds all of the crates std depends on.

Ah yeah requiring alloc, good point. Though idk if it actually does? build-std is weird, and build-std-features even weirder,
they're the features of std right? but whose using build-std to build.. std?

This (building core with the mem feature on) is my goal as well. Due to the resolution hack linked above, once we forward things correctly, the mem feature should work even if you're only depending on core and compiler-builtins.

core doesn't list any dependencies, so idk. Might it just rely on the symbols compiler_builtins provides, during linking?

This is correct. core (and any Rust code generated by LLVM) depends on the symbols in compiler_builtins to link correctly. compiler_builtins depends on core as a "normal" Rust dependency. This circularity always struck me as weird, but it's necessary to deal with using non-Rust compiler-rt.

@tomaka @alexcrichton this can be closed now that rust-lang/rust#77284 is merged.

ehuss commented

@josephlr I wouldn't consider that a complete fix for this issue, as it appears to require manually setting the compiler-builtins-mem feature, is that correct? I would prefer to not force the user to need to know about internal feature names.

@josephlr I wouldn't consider that a complete fix for this issue, as it appears to require manually setting the compiler-builtins-mem feature, is that correct? I would prefer to not force the user to need to know about internal feature names.

If you're building for a custom target, you will have to tell cargo whether or not your platform provides implementations of memcpy or not, so there not much more we could do here. There's going to have to be a build-time switch.

Actually, @ehuss I think you might be on to something here. In an ideal world, users would be able to override memcpy, but if they didn't we would fallback to compiler_builtins::memcpy automatically. Users shouldn't have to deal with this stuff directly. The solution to this is weak linkage (see the Rust tracking issue).

Here's how I think this all should work long-term:

  • compiler-builtins no longer has a mem feature
  • Instead the memcpy functions are declared in compiler-builtins with #[linkage = "weak"]
  • If a user manually provides a "strong" version of memcpy (or are on a platform like Linux that does it for them). That version is automatically used
  • Otherwise, the "weak" fallback implementation is used.

This would be the idea state of affairs in my opinion, but it probably requires some subset of rust-lang/rust#29603 to be on a path to stabilization before we can use it. Note that compiler-builtins currently uses weak linkage for some arm targets (see rust-lang/compiler-builtins#378), but memcpy and friends are used on essentially every target, so there would need to be some way for weak linkage to work on all targets/linkers.

This might be possible in the future, but probably can't be done right now.

FYI, I submitted rust-lang/compiler-builtins#411 a few years ago and it's been possible to override the memcpy and family of functions downstream ever since.