Solving the problem with promotion

Question

Solving the problem with promotion

RalfJung opened this issue 5 years ago · 17 comments

Over the years, promotion has caused many problems. The point of this issue is to collect those cases, which hopefully serves as a useful argument when we start to deprecate some of those mechanisms or when we reject proposals to promote more things.

What is promotion?

The short version is "it makes &expr have static lifetime for some expr". For more details, see here.

What are the problems?

The earliest critical problem I am aware of is rust-lang/rust#50814, where I made Miri detect arithmetic overflows even in release mode and promotion made that unsound because it lead to unexpected const-eval failures. We later plugged that kind of soundness hole for good but it can still lead to unexpected program aborts, which is not great.

The high-level summary is that we have a problem when the promoted constant fails to evaluate: it might be in dead code, so we cannot just make compilation fail (that would mean that our decision to promote caused otherwise valid code to fail to compile). But we have to do something. All other constants stop compilation when they fail to evaluate, so this requires special cases all over the place (we have checks for "is this a promoted" in const error reporting, in codegen, in Miri, and probably more places; also see rust-lang/rust#75461, rust-lang/rust#71800).

Most recently, we realized that our required_consts scheme, which is supposed to ensure that we do not compile functions that use ill-formed consts, also needs a special case as we can indeed have ill-formed promoteds.

@oli-obk collected some more problems specifically around promoting const fn calls in #19.

Other "fun" cases of promotion requiring a special case:

rust-lang/rust#67534: We promote something to constant memory even though it contains interior mutability, as long as that interior mutability isn't used by the final value of the promoted.
rust-lang/rust#67465: We promote unsafe computations that return invalid data.
rust-lang/rust#61821: Diagnostics in promoteds are somewhat borked, and this is pretty hard to fix.

What could be the solution?

First of all we tried hard to limit promotion and even de-promoted some functions (rust-lang/rust#67531, rust-lang/rust#71796). We distinguish between "implicit" and "explicit" promotion context (see here, and are less careful in "explicit" contexts such as const initializers. However, even those can have dead code, so all the problems caused by promoteds that fail to evaluate still apply.

But what is the endgame here, what is a subset of promotable expressions that actually avoids all these problems? The largest possible subset is the set of expressions that cannot fail to evaluate. This covers almost all of the above problems; if we consider validation failures (such as invalid results of unsafe code or unexpected interior mutability) to also be evaluation failures, then I think this indeed covers all the problems.

See this hackmd for the details of that possibility.

So the least thing would be to forbid all of fallible operations. Another choice, perhaps easier to explain, would be to say that we only promote things that would also be legal as a pattern. That would mean de-promoting all arithmetic and logical operations as well -- but it is not clear that we can get away with that, so we might have to settle for infallible operations.

The intended replacement for code that wants things to have static lifetime is to explicitly request that lifetime via rust-lang/rfcs#2920. Compilation fails when that const block fails to evaluate, which means that promotion itself cannot fail, thus satisfying the infallibility condition.

Answer 1 · 2020-08-13T07:17:05.000Z

One way that satisfies just the "only promote things that we can successfully evaluate" would be to

do promotion, but don't modify the original MIR yet
try to evaluate the promoted
- if successful, just finish the promotion
- if not successful, look for a const wf bound (cc @lcnr) that matches the promoted
  - if there is one, just finish the promotion
  - if there is none, put the promoted into a debuginfo datastructure and let borrowck check that datastructure when encountering lifetime errors. In case of lifetime errors, suggest a wf bound matching the promoted

This will still be a breaking change, but it will be much less of one (and also not cause post-monomorphization errors)

Answer 2 · 2020-08-13T07:28:00.000Z

Yes that would be like a fallback solution. The reason I like it less is that it is "too smart" -- that makes it really hard to predict what will happen, and now the lifetime of something can depend on subtle details of const evaluation. I would hope that at least in implicit promotion contexts, we can do better, since we promote less.

(EDIT: the following is outdated, see my next post)

But in explicit contexts, we have all the same problems! They can have dead code, too:

const FOO: i32 = {
  if false { &promotable }
}

So all the concerns about creating promoteds that fail const evaluation apply here, too, I think? All the time we were concerned about changing behavior due to suddenly evaluating things at compile-time that used to be run-time-evaluated, but I now feel like the worse part is the one where we have consts (with separate const-eval queries) that fail to evaluate.

Note that another cause of const-eval queries are nullary functions. But those don't cause any problems I think. I wonder why?

Answer 3 · 2020-08-13T08:08:25.000Z

I think I overreacted re: explicit promotion contexts.

The property that I was going for all along here is: When a CTFE query fails, that's a hard error. Promoteds in implicit contexts are always evaluated during codegen, even in dead code, so this implies that evaluation of an implicit promoted must never fail.

But for explicit promoteds, we either already hard-error when they fail as codegen needs their result (e.g. rustc_args_required_const), or we just evaluate them "on-demand" (const items). So we can do anything we want here in terms of promotion I think, as long as we do not add promoteds to required_consts -- which makes sense, they are not required, as the user did not ask for them to be evaluated.

Answer 4 · 2020-08-13T08:15:53.000Z

We already check that explicit promotion succeeds or halts compilation:

https://github.com/rust-lang/rust/blob/814bc4fe9364865bfaa94d7825b8eabc11245c7c/src/librustc_mir/transform/promote_consts.rs#L801-L810

What we do not check yet, I think, is that evaluating the explicit promoted succeeds or compilation halts. That is a property we would need for this plan to work out.

Answer 5 · 2020-08-13T09:09:25.000Z

What we do not check yet, I think, is that evaluating the explicit promoted succeeds or compilation halts. That is a property we would need for this plan to work out.

But not proactively, i.e., only when the promoted is actually used. IOW, this should only lint on implicit promoteds:

https://github.com/rust-lang/rust/blob/d69b0997d7dcc99a188d5cb19137dedc4fc05d25/src/librustc_mir/const_eval/eval_queries.rs#L362

Can we store the information whether a promoted is implicit or explicit? That code currently checks the def_kind, but I think that will not work for rustc_args_required_const.

Answer 6 · 2020-08-13T11:06:08.000Z

I think the only place we can store it reasonably is in the mir::Body

Answer 7 · 2020-08-13T15:28:17.000Z

Turns out we have yet another huge problem with promotion caused by their "design" (if you want to call it that) never having been written down and discussed at a high level.

Answer 8 · 2020-08-13T18:36:19.000Z

When a CTFE query fails, that's a hard error.

A bit off-topic, but this can't hold in general due to the const-propagation transform. Maybe you just mean for CompileTimeInterpreter and not ConstPropMachine?

Answer 9 · 2020-08-13T18:43:47.000Z

Ah sure, ConstProp can just bail out and be like "I'm not touching this" on errors, that's fair.

(Though if all consts reaching codegen must successfully evaluate, I think the only bad consts ConstProp might encounter are promoteds in explicit promotion context in dead code.)

Answer 10 · 2020-08-14T06:21:47.000Z

However, I wonder how to best practically do this with ConstProp. I was imagining that the queries would raise the hard errors, which helps a lot because it frees query consumers from the burden of having to think about such things. I see the following options:

ConstProp avoids calling the query on potentially problematic consts -- as in, on lifetime extension promoteds in const/static initializers.
We change lifetime extension in const/static initializers back to not use the promotion mechansm, thus avoiding the failing queries in the first place.
We somehow have a query that doesn't error, and another one on top of that that shows the hard error. But I think that requires putting all the stacktrace information into the query result...

Answer 11 · 2020-08-14T12:58:54.000Z

Actually ConstProp should be easy -- we just make TooGeneric not emit hard error, and instead propagate that to the caller of the query. That should be the only error ConstProp ever encounters that can be ignored. If it doesn't get TooGeneric, then codegen would have gotten the exact same result, so it is okay for this query to emit a hard error.

The only exception of course are const/static bodies, as they are not codegen'd. We can just not ConstProp them either... though we have some lints in ConstProp that we would miss then. Well, this is another argument for not doing promotion for those, but instead doing lifetime extension the old way (StorageDead removal).

Answer 12 · 2020-08-14T13:47:06.000Z

I have collected my current thoughts on this in a hackmd. Please feel free to expand that with missing information!

Answer 13 · 2020-08-14T18:38:16.000Z

Removing StorageDead as a substitute for promotion does not work once you can loop in a const initializer. It results in multiple StorageLives for the same MIR local with no corresponding StorageDead. You could hoist temporaries that are eligible for lifetime extension out of the loop, but at some point you're just reimplementing promotion.

const _: () = {
    loop {
        let _x: &'static i32 = &5;
    }
}

Answer 14 · 2020-08-15T07:43:30.000Z

@ecstatic-morse that's a good point, dang.

So, uh, what can we do? I don't suppose there is any hope of restricting lifetime extension in const/static bodies to just infallible code? The only other way then to still achieve "all CTFE failures are hard errors" is to make sure we only evaluate lifetime extended promoteds that we actually need -- doing it on-demand instead of eagerly. This affects MIR optimizations, they must be careful not to trigger evaluation of these promoteds.

Also, doesn't this entirely break promotion of mutable data? What if we have &mut [1,2,3] in a loop? I'll try to craft something. EDIT: rust-lang/rust#75556

@lcnr, thanks for your comment! However, I don't think hacmd makes any sense for discussion, it only makes sense for documenting the result of a discussion. So let me move your comment here:

From what I can tell we already have generic promoteds in associated constants: playground
So we can't just always eagerly evaluate promoteds...
I think requiring constants to be eagerly evaluated seems slightly inconsistent considering how we deal with generic constants.
(note: I really would have to spend a few more hours looking into this to form a more complete opinion here)

However, the example you give has nothing to do with promotion. See the Vec::new example. Also, the lazy/eager evaluation I am talking about has nothing to do with lazy/eager normalization of const generics. I am talking about the case where you have a fully monomorphic const body that you want to evaluate, and in some dead code it contains a rustc_required_const argument. If this was runtime code, we'd still evaluate that argument because it has to happen during codegen. For CTFE we could in principle decide to only evaluate such arguments on-demand when they are in live code.

In the light of what @ecstatic-morse noted, since we have to evaluate lifetime extension promoteds lazily in const/static bodies, we might as well do the same with other promoteds...

@oli-obk same for your comment

These are codegen’d, so we have to evaluate all promoteds, even in dead code.

do we have to even in dead code? I thought we want to, but theoretically we can just remove dead code during mir optimizations.

Yes we have to. It is impossible to determine all dead code -- see "halting problem". We could remove some dead code, but then we get an inconsistent mess, not a better system.

(Btw hackmd also has a dedicated comment mechanism, we could also try that. But then we'd have two threads of discussion and I do not know if hackmd sends out any notifications. Either way, using a free-form text editor as a discussion forum is a rather poor substitute IMO.^^)

Answer 15 · 2020-08-15T08:57:13.000Z

However, the example you give has nothing to do with promotion. See the Vec::new example.

Would the following use promotion in ASSOC then?

struct Foo<T>(T);

impl<T> Foo<T> {
    const ASSOC: &'static usize = {
        let x = &(4 / std::mem::size_of::<T>());
        x
    };

    fn dummy1(&self) {}

    fn dummy2(&self) {
        if false {
            Self::ASSOC;
        }
    }

    fn dummy3(&self) {
        Self::ASSOC;
    }
}

I was thinking about "When a CTFE query fails (except with TooGeneric), it should emit a hard error. That would make a lot of code much simpler, some of which having caused soundness problems in the past." here. Put my comment in the wrong position 😅

If const_err is a hard error, we would have to stabilize when we evaluate associated consts.

fn main() {
    // Which of these lines should try to evaluate `Foo::<()>::ASSOC` and cause a
    // hard error, which is what you proposed here if I understand correctly.
    let x: Foo<()> = Foo(());
    x.dummy1();
    x.dummy2();
    x.dummy3();
}

Answer 16 · 2020-08-15T08:59:13.000Z

Would the following use promotion in ASSOC then?

Yes.

If const_err is a hard error, we would have to stabilize when we evaluate associated consts.

That seems unrelated to promotion, the question remains the same if ASSOC is just size_of/0 (no &), right? So I think this is a subject for rust-lang/rust#71800, not this issue here. That's why I linked both of them at the top of this hackmd. :) They are related but clearly distinct.

Answer 17 · 2021-05-09T13:16:53.000Z

RFC 3072 got accepted, so most of the rest of the work here is tracked at rust-lang/rust#80619.