rust-lang/rust

Feature gates for renamed features hard error, breaking "nightly-detecting" crates

Opened this issue ยท 26 comments

A common pattern in the wild is to do this kind of thing in build.rs:

        if channel == "nightly" {
            println!("cargo:rustc-cfg=feature=\"something\"");
        }

Where something is a Cargo feature that is used in code as #![cfg_attr(feature = something, feature(cool_new_feature))].

This is generally considered an anti-pattern. However, despite being proscribed, it is found often in the ecosystem.

A more advanced version that actually build-tests some code with the feature is found in some popular crates; that version is less problematic as it will silently fall back to the stable behavior when the feature changes.

This leads to a really bad situation when the feature is has its name changed or split up into smaller features. It is currently occuring with the ahash crate, as can be seen in tkaitchuck/aHash#200.

People depending on the crate and running nightly get a hard error with the crate because the feature flag "no longer exists". If the crate removes the feature flag, that's an immediate MSRV bump for everyone else since the crate uses still-unstable APIs.

It's particularly annoying in CI where people typically have a single Cargo.lock.


Most of the time a nightly feature has its name changed; there's a decent gap between the point in time in the name changes and when actual changes to the APIs behind it happen. Which means in most of these cases, the only thing broken about this code on nightly is the feature(cool_new_feature) line, the rest will still work.

Given that this is a technique found in the wild, would it make sense to have this error behave according to cap-lints instead, at least in the short run after a feature is renamed, with some tracking of the rename? Every time I've seen this cause breakage there's a whole lot of should-be-unnecessary scrambling to fix transitive dependencies.

It is a warning already. In the ecosystem breaking cases, the feature wasn't stabilized, but renamed and changed. We could try to have more compat around feature renames/changes but idk, in the end the its very obvious where the bug with the ecosystem breakage lies..
But I agree that something should be done, but what should be done is find ways to stop people from doing this.
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=f76be81e8644c2e5f9f6bda83fe16a4c

We should probably have stronger guidance somewhere that people should not do this.

@dtolnay As a member of libs-api, I must ask you to please reconsider your abuse of the existing APIs. Please also write the guidance explaining why people should not do this. You, surely, know better than anyone.

As far as I know, my crates do not have this issue.

@dtolnay Unfortunately, I see someone commenting that your crates like thiserror or anyhow have broken their build fairly routinely ("every day" would be an exaggeration, but not by much), and the use of version detection or not is increasingly becoming a criterion for selecting for or against your crates. I realize it is very annoying that they do not, but not everyone files bug reports (often more out of not wanting to waste your time with a low-quality report, and a lack of interest in root-causing the issue more deeply).

Regardless, if you feel that you have perfectly threaded the needle... which must have been only in the past month or so, considering how recently you fixed the RUSTC_BOOTSTRAP detection that caused your own tool, cargo expand, to bug thiserror...... I would kindly ask that you at least write down the long list of considerations on Doing It Properly, especially the assumptions and criteria for doing this that did not actually make it into your code.

@dtolnay I'm not sure about the current state of affairs but I definitely know that in the past few years there have been multiple occasions where I've had to cargo update -p proc-macro2 because proc-macro2 was broken on nightly CI, despite being on a dependency tree that did not opt in to nightly features. proc-macro2 is not the only crate I've had troubles with over here, but it's the one that crops up the most.

I don't think I've ever filed bugs about this, and if that is why you haven't been aware of the problems here I apologize for not bringing it to your attention earlier, I assumed it was an intentional weighing of tradeoffs.

FWIW, if the build script actually does feature detection (actually trying to compile something to check the feature name and interface) rather than only nightly detection (like in the snippet in the issue description), then nothing breaks if the feature is renamed or changed.

Whether such feature detection in build scripts is desirable/acceptable is a discussion to be had, but separate from the issue reported here.

This issue is about nightly-detecting crates breaking when a feature gets stabilized. But, as pointed out above, that isn't actually an issue: today, #![feature(stabilized_feature)] already is a (cap-lints'able) lint rather than a hard error.

The actual issue shows up not when a feature is stabilized, but when a feature is renamed or changed. To address that issue, we could consider making #![feature(nonexistent_feature)] a lint rather than an error, but that wouldn't solve the issue: it wouldn't enable the unstable feature that the crate would then rely on. It is unavoidable (and expected and fine) to break crates that make assumptions about unstable features.

I don't think this specific issue is actionable.

I'm not sure about the current state of affairs but I definitely know that in the past few years there have been multiple occasions where I've had to cargo update -p proc-macro2 because proc-macro2 was broken on nightly CI, despite being on a dependency tree that did not opt in to nightly features.

For context: proc-macro2 did in the past break on nightly and had to be updated when we changed the interface of the proc_macro_span feature, but as of a month or two ago its build script actually checks the expected interface (https://github.com/dtolnay/proc-macro2/blob/master/build/probe.rs). In other words, proc-macro2 no longer breaks if the feature is renamed or changed (or stabilized).

(Again, whether such feature detection in build scripts is desirable/acceptable is an interesting discussion, but not the issue reported here.)

The actual issue shows up not when a feature is stabilized, but when a feature is renamed or changed. To address that issue, we could consider making #![feature(nonexistent_feature)] a lint rather than an error, but that wouldn't solve the issue: it wouldn't enable the unstable feature that the crate would then rely on. It is unavoidable (and expected and fine) to break crates that make assumptions about unstable features.

I don't think this specific issue is actionable.

I kind of addressed this already in the issue:

Most of the time a nightly feature is stabilized; there's a decent gap between the point in time in which all the functionality is done and when the feature flag is "removed". Which means in most of these cases, the only thing broken about this code on nightly is the feature(cool_new_feature) line, the rest will still work.

That's true for feature renames as well. Yes, there is no guarantee about things still working. However, typically these renames and things are done independently of further changes. It would be ideal to limit breakage due to them; I disagree with "this issue is not actionable".

It's bad but expected when a crate breaks because a nightly feature it used changes API. It's completely gratuitous when it happens because of shuffling around of feature names.

Right, so you're saying that we would still hard-error on #![feature(nonexistent_feature)] but be more forgiving for #![feature(old_name_of_renamed_feature)] (and have that imply the new feature)?

That can make sense, though it means we have to track renames of course.

But I think that most renames happened as part of a split or merging of features. E.g. part of a feature is stabilized and another part is renamed. That makes it very tricky. I don't want us to put too much effort in ensuring some kind of stability of unstable features, of course.

This leads to a really bad situation when the feature is stabilized. It is currently occuring with the ahash crate, as can be seen in tkaitchuck/aHash#200.

Nothing was stabilized, so this summary is incorrect. The problem is that a feature got renamed. Would be good to get the issue description updated. :)

I agree with the general sentiment above: auto-enabling of nightly features is generally just wrong. They are, by their very nature, unstable. aHash enabled features based solely on the fact that this was a nightly compiler and the feature is not disabled by allow-features, which is extremely fragile as any change in the way the feature works will break building that crate. I don't think it is worth spending any time on making that case work better; we should spend that time on more clearly advising against doing this (see e.g. SergioBenitez/version_check#23).

@dtolnay's crate are less naive and do pretty clever auto-detection of whether the feature still works the intended way. That works a lot better. I am not actually aware of a case where the "build probe" worked but then the code still failed to build; the problem here is rather that implementing such a "build probe" correctly is extremely hard and I don't think it is even possible to do it in a way that works for rustc bootstrap.

Nothing was stabilized, so this summary is incorrect. The problem is that a feature got renamed. Would be good to get the issue description updated. :)

Yes, that's been covered already in this thread

Right, so you're saying that we would still hard-error on #![feature(nonexistent_feature)] but be more forgiving for #![feature(old_name_of_renamed_feature)] (and have that imply the new feature)?

Yes. We track renames for lints and things already.

I'm proposing a best-effort solution; we don't try and fix the problem entirely, we just try. I think that's an okay thing to do; will not be a huge deal and will be nicer on the ecosystem. This isn't the first time this has happened, and this is a crate that is deep down some deptrees (including in hashbrown), such that the ecosystem isn't on the latest version and cannot be cargo update -p'd out of the problem. I've currently got open backport PRs on hashbrown and ahash in an attempt to solve the problem in multiple places. At wasmer's level the problem is quite complicated to tease apart.

Yes, that's been covered already in this thread

The issue summary at the top is still wrong though, which will mislead new people finding this issue.

One solution to this problem I have been using locally is to have my nightly toolchain masquerade as non-nightly by removing the channel from the version output. So far that has seemed to be effective, I can currently build ahash 0.7.7 just fine. I have been very tempted to propose on MCP to add this behavior to rustc itself, and even wrote up a draft a while ago.

The issue summary at the top is still wrong though, which will mislead new people finding this issue.

Edited, but it is typical to expect people commenting on an issue to read some of the discussion.

issue discussed during T-compiler triage meeting (on Zulip).

So, unsure if we want to do this:

  • It could make sense trying to comb crates.io and identify those crates that probe for a nightly in a incorrect way (even if it doesn't solve the past)
  • We want to discourage this kind of nightly detection (but how?)

@rustbot label -I-compiler-nominated

Unfortunately some crate authors are very non-cooperative so we may have to look into technical means to prevent nightly detection, and thus prevent widespread ecosystem breakage when nightly features change.

Today, I once again had someone express to me a concern that their code might accidentally become dependent on nightly features. Today, I once again lied by saying it would not. Obviously, I remain aware of this issue, which makes my statement de facto false. It felt necessary, however, because:

  • some diagnostic features are limited to nightly
  • we were trying to fix a problem
  • in the majority of cases, it "won't matter"... until it does, because the feature e.g. compromises their program's soundness, or just their expectations

I should note that this feature was a compiler feature, not even a language feature, so no library author can usefully enable it. Compiler flags really require the person wielding cargo to do it. I don't know, exactly, what their project was depending on, so it is almost certain that some crate author in their dependency tree was in fact about to start opting them into nightly features, very "helpfully". And even though I said it doesn't matter, crates that implement this kind of feature-detection clearly believe otherwise. It must matter to them, or they would not do it.

Unfortunately, neither nightly nor the feature in question helped meaningfully discover an answer, despite it being an error sourced in a proc macro.

I do not have the time or energy to teach people what nightly actually entails while helping them solve relatively minor bugs, because I coach a lot of people through relatively minor bugs on a frequent basis. But it is fairly common for them to express considerable anxiety about even downloading a nightly compiler, much less running it, as they do not have the awareness that a compiler or stdlib developer does about where exactly the stability bounds of nightly begin and end. In a word, they mistrust it.

I mention this because a lot of people are focusing on the purely technical aspects of this issue, and I do not mean to diminish those, but it is also a breach of trust.

Today, I once again had someone express to me a concern that their code might accidentally become dependent on nightly features. Today, I once again lied by saying it would not.

If you build on stable you do not depend on nightly features, even with crates that do nightly detection.

But reading on in your post, I assume the context here is someone switching from stable to nightly to get access to some particular nightly functionality, without wanting any other nightly functionality?

@RalfJung Correct. These programmers want control. Control that we at least nominally offer! It is basically true that if you use a nightly rustc but you enable no nightly features, nothing should fundamentally change about how your code is compiled. If we only consider the same rustc version (so, the nightly a stable compiler was cut from), no stdlib features should become reachable, and no language or compiler features should affect the compilation. Any variance is due to inevitable but regrettable bugs... and honestly, the biggest source of that variance is the number of fixes we will backport in the three months between the stable rustc and the nightly one.

In the case of crates that respect the ecosystem norms of explicit "nightly" opt-in, this control is real, even with regard to using dependencies.

Honestly I feel that T-compiler didn't have enough time on that meeting to strategically think about this issue. I'm inclined to nominate it again because I don't see a clear agreement on a path forward. I have not fully digested everything so my understanding is a bit spotty but one thing I get is that we have people out there in the ecosystem making assumptions out of unclear facts that tends to become truths because there since long.

I feel it is at least important to decide whether we want to fix this "the nuclear way" and force people to stop using this way of detecting nightly (hurting the ecosystem but at the same time fixing it once and for all1) or else we want to accept what some crates "dictate" and taking into account @workingjubilee reasoning.

Either way or the other. But at least avoid staying in this grey zone.

Footnotes

  1. until next time โ†ฉ

Another reason why it's bad for crates to automatically enable nightly features: this means crater will not be able to check whether they would actually keep working on stable. This just happened in #123281.

#124339 will allow crater to overwrite most nightly detection schemes by reporting version information the same way as stable does.

Combined with setting the RUSTC_STAGE env var, this disables all of the nightly detection schemes I know of.

Unfortunately SergioBenitez/version_check#23 did not get accepted, the change that landed instead uses much weaker wording. One of the points brought up in the discussion is that

Unfortunately it doesn't seem that any documentation anywhere defines what the user's expectations of each channel should be.

Maybe this needs an official statement by the lang and/or compiler team that automatic enabling of nightly features without any sort of build probe is not okay since it impedes our ability to experiment with nightly features while also keeping nightly a viable option for people to test their crates with.

#124339 establishes the following T-compiler policy:

We do not consider feature detection on nightly (on stable via compiler version numbering is fine) a valid use case that we need to support, and if it causes problems, we are at liberty to do what we deem best - either actively working to prevent it or to actively ignore it. We may try to work with responsive and cooperative authors, but are not obligated to.

Should they subvert the workarounds that nightly users or cargo-bisect-rustc can use, we should be able to land rustc PRs that target the specific crates that cause issues for us and outright replace their build script's logic to disable nightly detection.

The FCP for this has finished.

I'll patch the PR real quick so we can finally land it. Just forgot about it