What will the UX be for enabling "build libstd" logic?
alexcrichton opened this issue · 14 comments
This is intended to be a tracking issue for the eventual state of how one actually enables Cargo building libstd. Currently this is done via the -Z build-std
flag. The -Z build-std
flag takes an optional list of parameters, but this optional list will ideally no longer be necessary once #5 is implemented.
What should the eventual syntax for -Z build-std
end up being in that case? (assuming the value of the flag is removed). A strawman proposal might be to add a configuration value build.std
(defaults to false
) to the configuration. That way CARGO_BUILD_STD=true
would enable it for a build, you could also use .cargo/config
, and with rust-lang/cargo#6699 could even be --config build.std=true
as a CLI flag.
One question would be if we can infer this configuration value in some situations. For example if libstd is missing for the target you're building for, should we default to building libstd? In any case, things to think about!
The feature gate to opt into this feature while it is unstable should be a -Z
flag and/or an entry in cargo-features
, per https://doc.rust-lang.org/cargo/reference/unstable.html
As to the eventual behavior after the feature is stabilized, I feel that should not be a flag at all for this. Rather, Cargo should try to compile the standard library automatically if a binary is not already available for the requested (target, profile configuration)
.
Oh sorry right good point, should probalby discuss that possibility!
I'm personally worried about automatically inferring the profile of libstd and switching to -Z build-std
by default for many builds. For example libstd never matches a cargo build
profile by default because libstd is optimized and without debug assertions. Similarly libstd also doesn't match the default cargo build --release
profile because it has debuginfo. In that sense I think that the profile configuration mismatches too regularly to have "false positives" for changing a profile to indicate that the user wants to recompile the standard library.
Even for missing targets though, that one's also pretty tricky. It's difficult to distinguish between the case of "I forgot to rustup target add
" vs "this is a brand new target and I really want to recompile std". For example not all targets are likely to work in the long run with -Z build-std
like MinGW ones, or similarly we're able to build more "featureful versions" on CI sometimes which include things like fuzzers, C code in compiler-builtins, Linux binary compatibility, etc.
I'd love to eventually get to a world where we could automatically infer this, but I fear that it's pretty far away that we could robustly automatically infer whether -Z build-std
was desired or not, so that's sort of why I'm thinking that an opt-in will be required.
For example libstd never matches a cargo build profile by default
Right, I also think that we should not change the semantics of existing manifests that run on the stable channel. This implies that [profile]
should not affect the standard library by default (like before), unless some opt-in key is present.
"I forgot to rustup target add"
That’s a good point. When a std binary can be obtained either by spending bandwidth (rustup target add) or CPU time (recompile), there’s not an obvious answer as to which is prefereable.
not all targets are likely to work in the long run with -Z build-std like MinGW ones
In theory shouldn’t they work, if for example MinGW (or whatever other requirement) is available?
Ah yeah it's also worth pointing out that my proposal at the top only included .cargo/config
, but having one in Cargo.toml
also seems reasonable! (via profiles and such)
MinGW/wasi/etc are unfortunately pretty weird in how they're integrated with the compiler, but it's probably best to keep discussion for that on #30.
There are cases where it might make sense to re-build libstd implicitly.
For example, if the target-feature
or target-cpu
of the target are overridden, libstd
should be always re-built using those to avoid soundness bugs, or cargo might want to preemptively error if this cannot be done.
Ideally, at some point, rustc would be able to error when these issues happen, but that might disrupt the workflows of people using -C target-cpu=native
, so it would be IMO better if these workflows would be transparently migrated to just rebuild libstd by default.
I wasn’t aware of such soundness issue, could you say more? It sounds worrying, since -C target-cpu=native
and such are stable.
In a nutshell, target-features are part of the call ABI, but Rust does not take into account, yet. For example, for an x86 target without SSE linking these two crates shows the issue:
// crate A
#[target_feature(enable = "sse")] pub fn foo(x: f32) -> f32 { x }
// crate B
extern "Rust" {
fn foo(x: f32) -> f32;
}
unsafe { assert_eq!(foo(42.0), 42.0) }; // UB
The ABI of A::foo
is "sse" "Rust"
but the ABI declared in B::foo
is just "Rust"
, no SSE, since the SSE feature is not enabled globally. That's an ABI mismatch and results in UB of the form "calling a function with an incompatible call ABI". For this particular case, A::foo
expects the f32
in an xmm register, while B::foo
expects it in an x87 register, so the roundtrip of 42.0
via foo
will return garbage.
Instead of using #[target_feature]
on a per function basis, one can just use -C target-feature
to cause this issue for a whole crate. And with -C target-cpu
, you can cause these issues for multiple features simultaneously.
Now, most features are compatible, e.g., if you have a target like x86_64-unknown-linux-gnu, which enables SSE2, and link against it a crate that enables SSE3 on top, then there is no change in the calling convention, and everything works. But if you try to, e.g., use something like RUSTFLAGS="-C target-feature=+soft-float"
to disable the use of hardware floating point, then if the standard library was compiled with hardware floating-point, every time you pass a float to libstd you'll get UB.
The real fix for this would be for rustc to consider target-features part of the function ABI (e.g. "sse2" "Rust" != "sse3" "Rust"
), and then once it can detect that these two ABIs are different, generate shims to adapt them as necessary, e.g., just like we do for all other ABIs (and reject function pointer types). This is a lot of work. There are a couple of temporary fixes that we could introduce (e.g., we could... pass all floats by memory... just like we do for simd vectors...), but if cargo could automatically recompile libstd, then that would definitely fix the issue at least for -C target-feature/cpu
which can be used without unsafe code. One could still introduce the issue with #[target_feature]
, but at least that's explicit in code, is unsafe to use, and not all features are available on stable (and disabling features is not allowed here either).
The issue with -C target-feature
and target-cpu
is that it may be a soundness issue, not a guaranteed soundness issue, which means that there's likely a lot of users already using it who don't need to recompile std. In that sense I think this the same that I mentioned above in that it runs a high risk of returning a false positive of "this is a situation where libstd must be built".
I think the fact of the matter is that we just do not have a great mechanism of automatically inferring "libstd must be built from source". I think that factors into the UX of how we want to enable this feature eventually in terms of it should likely be opt-in but should also likely be very lightweight to opt-in (like edition = "2018"
). I think a CLI flag is likely too onerous for situations that want to rebuild std to get stabilized eventually.
I think the fact of the matter is that we just do not have a great mechanism of automatically inferring "libstd must be built from source".
Agreed.
I think that the appropriate fix for the soundness bugs is for rustc
to error when crates with incompatible target-features are linked, hinting users that they should compile their code with compatible features.
The role of cargo
would then be to provide nice workflows for doing that. Inferring incompatible target-features is tricky, so I don't think cargo should try to do that. I also don't think it should try to use any simple heuristics (e.g. "if RUSTFLAGS then recompile std"), because most users don't need that.
Maybe the .cargo/config.toml
would be a good place for such a flag. Things like RUSTFLAGS
can already be specified there. It would also then get an environment variable, so those using RUSTFLAGS="..." cargo build
that want to recompile libstd could do that without having to edit the .cargo/config.toml
by using the env var instead, e.g., CARGO_REBUILD_STD=1 RUSTFLAGS="..." cargo build
or similar.
What's the status of this? The current requirement to pass the -Z build-std
flag to every cargo invocation is very cumbersome, with the result that many projects keep using xargo
/cargo-xbuild
for now.
I know that there are a lot of details that need to be carefully worked out, but would it be possible to add some sort of unstable configuration key to the .cargo/config
or Cargo.toml
to just enable the -Z build-std
flag permanently for a project?
.cargo/config
seems alright as it's never published (to e.g. crates.io) so there is no compatability concerns.
I actually have the opposite thought. If a lib crate depends on getting a fresh libstd or libcore relying only on .cargo/config
wouldn't seem to allow that.
What's the status of this?
I implemented a prototype with explicit std dependencies in Cargo.toml
, but it turns out to be really fussy and complex. We are exploring a different approach where the user does not need to list any crate names when enabling build-std. This requires some significant changes to the standard library so that it builds on all platforms, so we are still exploring the feasibility of the approach.
I have posted a proposal for feedback at https://internals.rust-lang.org/t/build-std-and-the-standard-library/11459.