rust-lang/rust

Can mutate in match-arm using a closure

arielb1 opened this issue · 23 comments

Update from pnkfelix: Note that this is now fixed by NLL (see #50783). This bug just remains open because @pnkfelix thinks our policy is to keep NLL-fixed-by-NLL bugs open until we make NLL the default for rustc.


STR

fn main() {
    match Some(&4) {
        None => {},
        ref mut foo
            if {
                (|| { let bar = foo; bar.take() })();
                false
            } => {},
        Some(s) => println!("{}", *s)
    }
}

Actual Results

playpen: application terminated abnormally with signal 4 (Illegal instruction)

Ideally, the type of "foo" in the pattern guard should be &Option<&i32> rather than &mut Option<&i32>... then we could loosen the overly restrictive mutation rules in pattern guards.

We might be able to specifically check for this case at

fn check_for_mutation_in_guard<'a, 'tcx>(cx: &'a MatchCheckCtxt<'a, 'tcx>,
; the problem there is that it could end up spuriously detecting a mutable borrow.

Another possibility would be to make it impossible to move out of variables of type &mut _; this is probably something we want to do anyway to eliminate the distinction between let bar = foo; (which moves) and let foo : &mut _ = foo (which reborrows).

This compiles with no error today. Nominating.

Fix by MIR borrowck. Forgot to tag as I-unsound myself.

Oddly this seems to be on the road to be fixed (as in flagged as error) by -Z borrowck=mir, but then when you add -Z nll to the mix, the error is again missed.

The NLL analysis not wrong exactly, but there is a kind of implicit borrow that's not manifest in the MIR. In particular, the NLL analysis determines that the ref mut foo is only used in one particular basic block, and hence there are no overlapping uses of the value being matched.

That is, the MIR is sort of doing piecemeal tests that read the value of the option, but assuming that the option will not change in the meantime. We can guarantee that it won't by taking a borrow, but we don't do that. So effectively we get this:

let mut v = Some(4);

if v.is_none() { ... }

let mut p = &mut v;
(|| { p.take(); () })(); // clears `v`

let s = v.unwrap(); // panics here, but UB in MIR

except that the final unwrap() would be UB.

At least that's how I read it now.

for now I'm labelled this with the NLL labels. We may not be able to include a fix for this in the initial release, but it would be nice to try...

Seems like we have to fix this. I'm not sure just what the best way is -- but I think we have to represent the "implicit borrow" in the MIR.

Assigning to self and up'ing priority to P-high to reflect severity for deploying NLL.

I am currently exploring extending MIR with a way to express "borrow all the discriminants reachable from place" so that we can freeze the variant alone over the course of all patterns+guards for a given match. Hopefully that can address this bug.

@pnkfelix I've been having second thoughts on that approach. For example, consider this:

let mut b = true;
match b {
    true => ...
    false => ...
}

no "discriminant" is touched, but we want to know that no writes occur.

I think we should think again about idea of shared borrow of the entire value being matched and then having guards access their bindings through a &T (where T is the user-visible type of the binding). To handle ref mut bindings, we can consider creating (for the guard) a ref binding and using a cast to coerce the &'a &'a T to a &'a &'a mut T (which I believe are identical in power anyhow).

That last part is a hack, no question, but overall the approach is pretty elegant, and seems like it would cover all cases.

Thoughts?

Seems like my suggestion (from our private conversation) of an explicit EndBorrow would also handle the case you mention

But I’m willing to try your suggestion. Either direction has some hackery to it

downgrading back to P-medium. (We still plan to fix this for NLL deployment, but making it P-high s interfering with other procedures that the compiler team uses.)

I've been working on some prototype solutions to this problem. For the record (and maybe of use if we turn this into a mentoring task), here are the paths I've explored (and then migrated away from, in most cases):

  1. Add a new Place variant for the discriminant, so that we can model borrowing it. This doesn't work, because in general a place really wants to be a distinct field/spot in memory, but in general the representation for a discriminant can be mixed in with some other field (e.g. a NonZero when the discriminant is a boolean). So just adding a Place variant in the MIR is not a good idea here.
    • (Its possible that we could do some other refactoring later that would allow us to talk about higher level abstraction of places where a given place would not necessarily correspond to a concrete memory location. But that's too aggressive a tack to take for this bug on its own right now, IMO).
  2. Add new MIR statements for borrowing the discriminant of a place. One instructions starts the borrow, and another ends it. This evolved from first "borrow just the one discriminant of the place, assuming its an enum", then into "borrow all the discriminants recursively reachable from the place." But as niko pointed out, just borrowing the discriminants alone does not suffice.
  3. Add new MIR statement for a ReadForMatch(place), which has the meaning "I might read from anything (safely) reachable from place". Then start the code generation for eachmatch off with a temp = &match_input, and then each arm's pattern starts with a ReadForMatch(temp). This is what I'm still working on now.
  4. I believe that what @nikomatsakis was suggesting above was a variation on the previous idea, except instead of adding a new MIR instruction that models accessing the whole input, instead change the code generation for patterns so that they do their access through the temp that is created at the match start. I actually started playing with this approach as well, since they both involve creating the temp. The main issue, as @nikomatsakis pointed out, is dealing with ref mut's that are generated by the match itself; those will interfere with the other accesses through temp.

Just in case because I've talked with @nikomatsakis about this but never claimed the issue. So yeah, I'd like to tackle this one :).

OK, so what I personally had in mind for fixing this was something perhaps simpler than what @pnkfelix described. Basically, when we desugar a match, we would begin by creating a "dummy" reference that refers to the place being borrowed. So a match like this

match foo { 
   Some(ref mut x) => ... 
}

would desugar to create a dummy _D

_D = &foo

just before entering each arm, we would then have a fake use of the dummy. I guess we'd have to add a MIR for this, or maybe there is something we could use:

BBMatch {
  _D = &foo;
  _1 = GET_DISCRIMINANT(foo);
  SWITCH(_1, 0: BBarm)
}
...
BBArm {
  FakeUse(_D)
  // create bindings for arm here
  _x = &mut foo.downcast<Some>.0;
}

Because _D is a shared borrow, this should work fine with concurrent access through the original path foo (e.g., GETDISCRIMINANT(foo) and so forth). (Ideally, the optimizer can just strip _D away as dead-code, in fact, although maybe it wants to keep it for the aliasing guarantees it gives.)

One complication has to do with ref mut bindings and guards: these would be created (under #49870) as something like:

TMP0 = &mut foo.downcast<Some>.0
TMP1 = &TMP0

given that TMP0 is dead after this, there is no actual problem -- the &mut is always accessed via the shared borrow of TMP1. But the borrow checker won't see it that way, it will return an error because we have a shared borrow (to make _D) and then a mut borrow (to make TMP0) that overlap. We'll have to add some hack to cause it to look the other way.

But the borrow checker won't see it that way, it will return an error because we have a shared borrow (to make _D) and then a mut borrow (to make TMP0) that overlap. We'll have to add some hack to cause it to look the other way.

Isn't said hack basically a version of two-phase borrows? The immutable borrow of tmp0 is not an activation. We just need to make sure it fires in the correct place, and we might want to force an activation of all let mut bindings when the arm begins (that is not required for soundness, but migth be desirable to prevent accidental stabilization).

There are a few more corner cases there - I was meaning to write a document about the questions here, but never quite found the time for it. Hopefully soon(tm).

@arielb1

Isn't said hack basically a version of two-phase borrows? The immutable borrow of tmp0 is not an activation.

Yes, it is, I just didn't know that I wanted to model it that way. But you are correct that in a way it makes a lot of sense to do so.

There are a few more corner cases there

Can you give a few hints as to what you are thinking of? Are you saying that holding a shared borrow of the value being matched is not sufficient to guarantee the exhaustiveness checks suffice?

Thinking more about the two-phase borrow: basically, we would say "if the temp T that are stores the borrow is used in a &T, then it is never activated". We already restrict two-phase borrows to cases where the temp is used exactly once.

Just a quick note to make sure this information is preserved: here is an important variant of the example given by @nikomatsakis back in January:

fn main() {
    let mut b = true;
    match b {
        false => {}

        _ if { (|| { let bar = &mut b; *bar = false; } )(); false } => {}

        true => {}
        _ => panic!("oops"),
    }
}

Namely, it is what I have used, in tandem with some code to enable the "feature"/bugfix rust-lang/rfcs#1006 , to illustrate that we do indeed need to borrow the input on each arm.

Hmm actually for consistency with other bugs I suppose I should reopen this and tag it as NLL-fixed-by-NLL

(but I am going to remove it from the "NLL invalid code does not compile" milestone because it is fixed by NLL... Update: ah niko is too quick for me.)

NLL (migrate mode) is enabled in all editions as of PR #59114.

The only policy question is that, under migrate mode, we only emit a warning on this (unsound) test case. Therefore, I am not 100% sure whether we should close this until that warning has been turned into a hard-error under our (still in development) future-compatibility lint policy.