rust-lang/rust

Tracking issue for RFC 2344, "Allow `loop` in constant evaluation"

Centril opened this issue ยท 44 comments

This is a tracking issue for the RFC "Allow loop in constant evaluation" (rust-lang/rfcs#2344).

Steps:

Unresolved questions:

  • Should we add a true recursion check that hashes the interpreter state and detects if it has reached the same state again?
    • This will slow down const evaluation enormously and for complex iterations is essentially useless because it'll take forever (e.g. counting from 0 to u64::max_value())

This feature seems simple enough for me to tackle as my first attempt to contribute to rust, at least getting it to work appears to be as simple as removing the check against loops in constant functions.

I believe the real work actually goes into implementing the lint for potential infinite loops. I'm going to tackle this next and I'd really appreciate some mentorship along the way.

Implementing this feature is blocked on a bunch of definitely not beginner level things. But there's ongoing preliminary work happening, so this isn't standing still even if nothing happens on the issue.

If we'd just remove the check against loops, we'd end up with amusingly broken code, because most of the const checking code only knows it needs to prevent writing multiple times to a variable... which loops can do without naming the variable twice.

Anyway, guarding against infinite loops has been partially implemented, and if you're interested in doing some work there, we're tracking one of the issues in #52475

Thanks for pointing me to the right direction, I'll definitely have a look for something I can help with.

Would an iteration limit be an acceptable solution to the the infinite loop problem? If the iteration limit is actually reached the code can just be compiled instead!

Though this may be deceptive since the const fns wouldn't really be const if the limit is reached.

Though this may be deceptive since the const fns wouldn't really be const if the limit is reached.

It could be a compilation error, which would let the const fn continue to be const (code that fails to compile has no bearing in execution time ^_^).

True! Perhaps it would be worth adding a compilation option of some sort that would make it just a warning for edge cases.
Edit: reading this half a year later and i have no idea what i meant

@oli-obk

const checking code only knows it needs to prevent writing multiple times to a variable...

....sorry, I thought that compile-time evaluated code is run by MIRI; why would it matter whether a variable is written to multiple times?

Of course, any (useful) compile-time evaluation is ultimately going to result in static data in the resulting executable, so it makes sense for const values not to be written to multiple times. But shouldn't that already be guaranteed by mutability rules?

Additionally, if the Iter trait were to be stabilized for use in const fn contexts, I think an entirely reasonable resolution to this issue would be to enable only loops that rely on Iter. This would permit many simple obviously-terminating loops (the most obvious example being those iterating over range-literals), while limiting the potential for unbounded compiler execution to problems in the implementation of the iterator used.

An even simpler and forward-compatible solution would be to only permit iteration over range-literals for now.

^ Sorry, I thought this was an RFC PR, not a tracking-issue PR. Since this RFC explicitly states that for loops wouldn't be usable anyway, would it be worth writing a separate RFC about permitting for <var> in <range> in const fn?

@BatmanAoD My assumption here is that everyone wants to be able to use traits in const fn eventually, and that for will just naturally happen along with that.

sorry, I thought that compile-time evaluated code is run by MIRI; why would it matter whether a variable is written to multiple times?

The const evaluator based on the miri-engine is what actually interprets the MIR and produces the final constant. We have an additional analysis that checks whether your constant only does things we can statically prove to not cause a certain range of errors. E.g. we currently forbid unions and transmuting via this static analysis, even though miri can evaluate them just fine. The same goes for calling trait methods. We want the const evaluator to only evaluate things that we have RFCed. The same thing is true for this issue. It's trivial for miri to evaluate loops, but we don't want to accidentally cause Cell types to end up in static variables, and the current static analyis just doesn't know how to prevent that in the presence of loops.

@oli-obk Okay, I think I understand prohibiting transmuting and unions, since those enable operations that have effects that are invisible to the type system. I don't understand how loops would permit a Cell to be part of a static value, though; wouldn't that be part of the static value's type?

(...or, instead of answering that specific question, feel free to point me to a more general resource I can read in order to ask more intelligent questions...)

feel free to point me to a more general resource I can read in order to ask more intelligent questions

I think we're severely lacking in that place. The closest I can come up with is this issue

Maybe we can kill two birds with one stone: I explain it to you, and you write docs for https://github.com/rust-lang-nursery/reference/blob/master/src/items/static-items.md, https://github.com/rust-lang-nursery/reference/blob/master/src/items/constant-items.md and https://github.com/rust-lang-nursery/reference/blob/master/src/items/associated-items.md#associated-constants

I'll review the docs, and this way we'll notice if I explained it correctly.

There's three components and we'll start with the Cell one:

static FOO: Option<Cell<u32>> = Some(Cell::new(42));

Is not legal, because we'd be able to modify the static without thread synchronization. But on the other hand

static FOO: Option<Cell<u32>> = None;

is perfectly ok, because no Cell actually exists. The same goes for

static FOO: Option<Cell<u32>> = (None, Cell::new(42)).0;

because no Cell ends up in the final constant.

Bringing us to the second point: destructors may not be called at compile-time, because Drop::drop is not a const function. This means that

const FOO: u32 = (SomeDropType, 42).1;

is not legal, because the destructor gets called during the evaluation of the constant. On the other hand

const FOO: SomeDropType = SomeDropType;

is legal, because no destructor gets called during the evaluation of the constant. Any use of it might cause a destructor to be called, but that's not a problem here.

Now we get to the third problem:

trait Bar {
    const BAR: Self;
}
trait Foo<T: Bar> {
    const FOO: u32 = (T::BAR, 42).1;
}

Is problematic exactly if T implements Drop (either manually or due to a field impl-ing Drop), because then the evaluation of FOO would call T::drop. While we could emit an error only when using trait Foo<String> or similar, that is a post-monomorphization error, which we are trying to avoid as much as possible. Post-monomorphization errors are bad, because they mean you can write broken code, and only the users of your crate will notice, not you yourself.

There's also some material at https://github.com/rust-rfcs/const-eval/

@oli-obk I would definitely be up for writing some docs!

I'm still not clear on whether there's actually a problem with Cell, though. In current, stable Rust (and, as it happens, on Nightly), the static _: Option<Cell<_>> = None declaration you've shown is not legal: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=6e3f727044df4a298ab5681ba9a1a425

...so, is there a compelling reason why we would want that to be legal, even as const fn support grows?

For the second/third problem, I don't understand how const values that don't implement Copy can be used by-value at all, but clearly that is already legal in stable Rust, and in fact such a value can even be re-used multiple times: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7a580a901449868e7b0ab9434a492a8d

In any case, when compiling a trait definition, pre-monomorphization, can the compiler determine where drop calls would be inserted? If so, perhaps const fn simply shouldn't be able to use any types that aren't copy except by reference? (Not knowing hardly anything about the compiler internals: does this sound like an overly complex check?)

For the second/third problem, I don't understand how const values that don't implement Copy can be used by-value at all, but clearly that is already legal in stable Rust, and in fact such a value can even be re-used multiple times

const values are inlined at compile-time. There is no copying going on at runtime. The compiled binary contains an instance of constant for every usage.

I'm still not clear on whether there's actually a problem with Cell, though. In current, stable Rust (and, as it happens, on Nightly), the static : Option<Cell<>> = None declaration you've shown is not legal:

Oh oops, well, replace static with const and add some indirection: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=79d9f50e2d50294ffc10f6f9bf5867f6

This also relates to your question about Copy. Things are copied bitwise at compile-time. So if we bitwise copy a reference, we don't get a new cell, we just get a new reference. So suddenly we could change one use of a constant and other uses would get changed.

In any case, when compiling a trait definition, pre-monomorphization, can the compiler determine where drop calls would be inserted? If so, perhaps const fn simply shouldn't be able to use any types that aren't copy except by reference? (Not knowing hardly anything about the compiler internals: does this sound like an overly complex check?)

Well... const fn are a separate beast. You can't have any Drop calls inside a const function anyway right now. At least not until we get impl const Drop for SomeType, but a type with impl const Drop is totally fine to drop in a constant, so we're fine. The check will stay for types that just have impl Drop

Ah, okay, that definitely helped me understand both the Cell issue and how const values actually work. (I'm sure I understood the distinction between const and immutable static values at some point, but I entirely forgot that const data is inlined.

Re: Cell, it looks like the compiler currently won't attempt compile-time analysis of a const fn to determine whether or not it contains a problematic value: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=6243cab09e8dadd35c435d65a299adf1

....so can we just keep that same rule (the code is rejected if a const borrow may contain a Cell) in all const contexts? I.e.:

const SOME_REF: &Option<Cell<i32>> = &{
    if /* some complicated logic */ {
        None
    } else {
        Some(Cell::new(7))
    }
};

...would be rejected before attempting to execute /* some complicated logic */ (be it a function call or something involving a loop), simply because the resulting value is of a type that may be invalid. The simple version you showed above, = &None, would be the only valid initialization expression for a const Option<Cell<_>>.

As an aside: I cannot fathom how a const type containing a Cell would ever be useful, except in pattern-matching. Do we intend to provide const-evaluation support inside patterns? That seems...overly complicated.

Re: Drop and monomorphization,

trait Bar {
    const BAR: Self;
}
trait Foo<T: Bar> {
    const FOO: u32 = (T::BAR, 42).1;      // Doesn't this imply either `T: const Drop` or `T: !Drop`?
}

It seems that Foo requires that T either impl const Drop or not impl Drop at all, but the author of Foo doesn't necessarily care which is true. Could the compiler simply reject any such traits (with potential drop calls in const contexts) unless the author specifies one of T: Copy or T: const Drop? Or, more generically, we could automatically derive const Drop (as we automatically derive Sized) for types that don't implement Drop? If that's not permissible, perhaps a new marker trait could be introduce to indicate "this type can be dropped in a const context", and that could be automatically derived?

so can we just keep that same rule (the code is rejected if a const borrow may contain a Cell) in all const contexts?

Seems a sensible precaution, but I also think that this will make ppl uncomfortable (similar to how if 3 < 2 { [42, 43][3]; } triggers an array index out of bounds error). We can lift the limitation further in the future backwards compatibly though, so it's not an issue

As an aside: I cannot fathom how a const type containing a Cell would ever be useful, except in pattern-matching. Do we intend to provide const-evaluation support inside patterns? That seems...overly complicated.

Well, you can use constants to initialize statics, and you might want to do some computation at compile-time, and that computation might want to memoize values to speed up an algorithm. The final result should still not contain a Cell.

Could the compiler simply reject any such traits (with potential drop calls in const contexts) unless the author specifies one of T: Copy or T: const Drop?

That's essentially what we're doing right now I think?

Or, more generically, we could automatically derive const Drop (as we automatically derive Sized) for types that don't implement Drop?

This is something we definitely should do. I mean fn foo<T: Drop>(_: T) {} works with foo(42i32) just fine, so this already exists to some extend, we'll just need to figure out how to handle the const part.

@oli-obk Sorry for the long absence. I'm still up for writing some docs, although at the moment I'm still not sure exactly what needs to be written.

Could the compiler simply reject any such traits (with potential drop calls in const contexts) unless the author specifies one of T: Copy or T: const Drop?

That's essentially what we're doing right now I think?

If that's the case, then doesn't that solve the "third problem" you described above, regarding dropping T during const evaluation?

trait Bar {
    const BAR: Self;
}
trait Foo<T: Bar> {
    const FOO: u32 = (T::BAR, 42).1;
}

This "problem" is solved by rejecting the above code. It's not so much a problem of the source, but of the analysis. The analysis needs to figure out which things are dropped and have a Drop::drop impl. Introducing loops will make this problem impossible to solve in the general case (because of the halting problem). We can introduce a pessimistic analysis that knows a bunch of cases and keeps erroring where it isn't sure. The system for such an analysis exists, but we haven't implemented it for checking constants yet.

@oli-obk Hello again! I just attended your talk at RustConf, and I believe you mentioned that there were PRs this week for const eval loop and if. I'm having a difficult time finding those; could you please post a link?

The related PRs are #63812 and #63860

Cross-posting from #49146:

Now that #64470 and #63812 have been merged, all the tools required for this exist in the compiler. I still need to make some changes to the query system around const qualification to make sure that it isn't needlessly inefficient with this feature enabled. We are making progress here, and I believe that an experimental implementation of this will be available on nightly in weeks, not months (famous last words ๐Ÿ˜„).

This issue is next up after the initial implementation of #49146 is complete.

Implemented in #67216.

Hi @rust-lang/lang

I want to propose to stabilize this at the same time as const_if_match (which has already been marked for stabilization). The reason I want to do both at the same time is to prevent users from resorting to recursion when a simple loop would have been sufficient. A few demos of iterative vs recursive in const fn with just if and loop and no other features:

Only the fibonacci is more readable imo, and that's due to its inherent recursive nature.

Stabilization of const loop also stabilizes while,while let, break and continue, but not for loops, as the latter require trait method calls for IntoIterator::into_iter and Iterator::next.

There are no additional control flow situatios to consider in addition to what was already discussed for if/match in const, as both looping and branching use the same dataflow logic and any problem in looping can be simplified to a problem in branching.

Seems good to me! Thanks for nominating; we should discuss this in the next lang team triage meeting. @ecstatic-morse, do you concur that this is ready?

Yep! As @oli-obk said, loops present no additional problems for dataflow-based const qualification, and I can't think of any other concerns that are unique to loops.

Would it make sense to prepare a stabilization report, even if it's rather minimal?

Let me re-phrase that: I think it would make sense to prepare a stabilization report, though it may not be very long:

  • outline the syntax that's being stabilized
  • reproduce the reasoning that @oli-obk referenced
  • copy over some of the tests that @oli-obk referenced

In some sense it's all here in the thread, but it'd be nice to pull it into one comment / document.

Conclusion from @rust-lang/lang triage meeting: we should move forward with stabilization report.

Summary

#![feature(const_loop)] will be stabilized using the approach to const qualification described here.

Specifically, loop and while (including while let) will become legal in all const contexts, as well as break, continue and labels. A const context is any of the following:

  • The initializer of a const, static, static mut or enum discriminant.
  • The body of a const fn.
  • The value of a const generic (nightly only).
  • The length of an array type ([u8; 3]) or an array repeat expression ([0u8; 3]).

Notably, this does not stabilize for loops, whose desugaring contains a call to IntoIterator::into_iter. Calls to trait methods continue to be disallowed inside const contexts (see rust-lang/rfcs#2632).

Examples

const fn fib(n: u64) -> u64 {
    if n == 0 {
        return 0;
    }

    let mut fib = (0, 1);
    let mut i = 1;
    while i < n {
        fib = (fib.1, fib.0 + fib.1);
        i += 1;
    }

    fib.1
}

const THOUSANDTH_PRIME: u32 = {
    let mut count = 1;
    let mut n = 1;

'prime:
    loop {
        n += 2;

        let mut div = 3; 
        while div*div <= n {
            if n % div == 0 {
                continue 'prime;
            }
            
            div += 2;
        } 
        
        count += 1;
        if count == 1000 {
            break n;
        }
    }
};

Const Qualification

See the stabilization report for #![feature(const_if_match)] for an in-depth discussion of const qualification as it is currently implemented. Dataflow-based const qualification works in the presence of loops as well as branching, so committing to it here shouldn't cause any additional concerns. As a result, the following code will compile despite the fact that never_returned is initialized with and repeatedly reassigned a value that needs to be dropped:

const _: Option<Vec<i32>> = {
    let mut never_returned = Some(Vec::new());
    let mut always_returned = None;

    let mut i = 0;
    loop {
        always_returned = never_returned;
        never_returned = None;

        i += 1;
        if i == 10 {
            break always_returned;
        }
    
        never_returned = always_returned;
        always_returned = None;
    }
}

Resource Limits

Miri will only execute a fixed number of statements (technically MIR basic blocks), before giving up and reporting that const-eval failed with a const_err lint. Nightly users can increase (or disable) the limit with the crate-global #![const_eval_limit] attribute.

Although it was technically possible to hit this limit before using recursion, it will be much more common once looping constructs are allowed in a const context. There should be some way for stable users to get around the limit as well. However, it's not yet clear what form this will take. See #67217 for discussion of #[const_eval_limit].

Implementation History

#64470 implemented a value-based static analysis that supported conditional control-flow and was based on dataflow. This, along with #63812, allowed us to replace the old const-checking code with one that worked on complex control-flow graphs. The old const-checker was run in parallel with the dataflow-based one for a time to make sure that they agreed on programs with simple control flow. #66385 removed the old const-checker in favor of the dataflow-based one.

#67216 implemented the #![feature(const_loop)] feature gate with the semantics that are now being proposed for stabilization.

Hmm, my major concern here is with the const_eval_limit attribute. It is my belief that our current approach on limits is rather flawed. I think it would be better to put the const_eval_limit on the const fn function that contains the loop, and thus enable client crates to invoke the function without having to increase their own limits. (I think the same holds for macro recursion limits and numerous other sorts of limits.)

I'm not sure if it makes sense to do this only for const eval limits, and maybe I can be persuaded that we should try to adopt this approach for all limits simultaneously.

I'm also not 100% sure what it would mean to have limits on the function level, though I think some sensible interpretation ought to be possible for miri.

An aside: while writing this, I realized that we should probably be stabilizing #![const_eval_limit] alongside looping. This part hasn't been mentioned in the lang-team meetings yet, so please read that section of the report.

put the const_eval_limit on the const fn function that contains the loop,

@nikomatsakis What would this mean for mutually recursive const fn, each with a different limit?

One alternative I saw to the per-crate attribute was an attribute on each "root" of constant evaluation, so a const, static, enum discriminant, etc. If you were to set #[const_eval_limit = 0] on one const, its initializer could call a given const fn as many times as it wanted, whereas another const initializer would still have the limit in place.

This probably indicates that we should consider stabilization of (some form of) const-eval limit attribute separately and block const_loop on that discussion. My intent wasn't to railroad this part through, I just didn't think of it.

@ecstatic-morse

I don't know what it would mean, but I can think of a few interpretations. =) You could add them up, I suppose, or just take the max of the limits as you descend. I think attaching to the root is better than crate but not as good as being able to attach it to a const fn so that clients don't have to know about it. Imagine like compute_logarithm_table or something, which might do a lot of work, but be invoked numerous times with different parameters.

Anyway, I agree with separating the stabilizations.

@ecstatic-morse can you update the stabilization report to not say that it is stabilizing the "instruction limit"? I think that was our consensus, and then I can kick off the FCP request.

@rfcbot fcp merge

I would like to propose stabilization of permitting loops in constant evaluation, as described by the stabilization report prepared here.

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

๐Ÿ”” This is now entering its final comment period, as per the review above. ๐Ÿ””

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

Stabilized in #72437 ๐ŸŽ‰