Why is this code allowed? Move out of and re-assign to mutable box

Hello!

Sorry in advance if my question has been answered elsewhere - I tried to look around for it, but only dug up questions regarding mutable references. Apologies as well if it turns out I was just over-thinking something very basic...

I was recently working on a side-project that involved creating a tree-like data structure, and somewhere along the way, I ended up writing something I was sure the borrow-checker would have a problem with. But to my surprise, it seemed to run and compile just fine! Here's a simplified version of what I was doing:

#[derive(Debug)]
struct SomeStuff {
    value: i32,
    next: Option<Box<SomeStuff>>,
}

impl SomeStuff {
    fn new(value: i32) -> Box<Self> {
        Box::new(SomeStuff { value, next: None })
    }

    fn with(value: i32, next: Box<SomeStuff>) -> Box<Self> {
        Box::new(SomeStuff {
            value,
            next: Some(next),
        })
    }
}

fn handle(values: &[i32]) -> Box<SomeStuff> {
    let mut head = SomeStuff::new(*values.last().unwrap());
    let len = values.len();
    for &x in values[0..(len-1)].iter().rev() {
        let next = SomeStuff::with(x, head);
        head = next;
    }
    head
}

I was a bit surprised that this worked, since I thought moving out of head would invalidate the variable. I tried to add a little print debugging:

let mut head = SomeStuff::new(*values.last().unwrap());
let len = values.len();
for &x in values[0..(len-1)].iter().rev() {
    let next = SomeStuff::with(x, head);
    println!("next={next:?}, head={head:?}");
    head = next;
}

Unsurprisingly, this didn't work at all, since I was now trying to observe head in an invalid state.

(In the end, I removed the print and simplified the loop body down to head = SomeStuff::with(x, head).)
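For reference, a minimal sketch of a debug print that does compile: it only observes head while it is initialized, that is, before the move and again after the re-assignment. The name handle_logged and the use of split_last are assumptions for illustration, not part of the original code.

```rust
#[derive(Debug)]
struct SomeStuff {
    value: i32,
    next: Option<Box<SomeStuff>>,
}

// Hypothetical variant of `handle` that logs each step.
fn handle_logged(values: &[i32]) -> Box<SomeStuff> {
    let (&last, rest) = values.split_last().expect("values must be non-empty");
    let mut head = Box::new(SomeStuff { value: last, next: None });
    for &x in rest.iter().rev() {
        // `head` is still initialized here, so it can be observed.
        println!("head before move = {head:?}");
        // The move and the re-assignment happen in one statement, so
        // `head` is initialized again once the statement has finished.
        head = Box::new(SomeStuff { value: x, next: Some(head) });
        println!("head after re-assignment = {head:?}");
    }
    head
}

fn main() {
    let list = handle_logged(&[1, 2, 3]);
    println!("first value = {}", list.value);
    println!("{list:?}");
}
```

Placing a print between the move and the re-assignment is the one spot that cannot work, because that is exactly where head holds no value.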

Intuitively, I understand why this should be allowed. It's as if I'd written this without a loop:

let head = SomeStuff::new(tail());
let head = SomeStuff::with(prev(), head);
let head = SomeStuff::with(prev(), head);
...

and trying to do this inside the loop would just result in me shadowing the outer variable.

So my questions are

  1. Is there a documented reason this is allowed?
  2. Has this always been allowed since 1.0, or was there something like an improvement to the borrow checker that allowed this?

Looking back at the Rust Book and Rust Reference though, I can't find anything that gives me a clear picture of how this would be modeled. My best guess is that while the original binding for head was invalidated by moving it, mut might allow me to create new bindings - as if I were able to shadow it with a new name.

So it might be that simple, but it does feel like a slightly... "dynamic"? thing for older versions of the borrow checker to have picked up on - seeing that head is invalidated when I move out of it in the loop, but noticing that I re-bind it before the end of the loop, making it valid again by the start of the next loop iteration. I guess it can't be that bad if it works currently though :person_shrugging:

Rust tracks whether variables are initialized:

let x;
if false { 
  x = 1;
} else {
  x = 2;
}
println!("{x}");

and it's also possible for a variable to go from initialized back to uninitialized:

    let mut var = String::new();
    loop {
        drop(var);
        // var is uninitialized here
        var = String::new();
    }

as long as the control flow can be statically analyzed (there are no code paths where it may be observed uninitialized).
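A small runnable sketch of the same idea, with a loop that terminates so the result can be inspected. The function names consume and rotate are made up for this example.

```rust
// Hypothetical helper: takes ownership of a String and reports its length.
fn consume(s: String) -> usize {
    s.len()
}

// Moves out of `current` and re-initializes it on every iteration.
fn rotate(inputs: &[&str]) -> (String, usize) {
    let mut current = String::from("start");
    let mut total = 0;
    for &input in inputs {
        // `current` is moved into `consume`, so it is uninitialized afterwards.
        total += consume(current);
        // Reading `current` at this point would be rejected at compile time.
        // Re-initialize it before the next iteration (or the return) uses it.
        current = String::from(input);
    }
    (current, total)
}

fn main() {
    let (last, total) = rotate(&["ab", "cde"]);
    println!("last = {last}, total = {total}"); // prints "last = cde, total = 7"
}
```

Every path through the loop body re-assigns current before it is used again, which is what lets the compiler accept the move.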

Thanks for your reply! I'll put that in the "overthinking something basic" pile :slight_smile:

Put in those terms, that is pretty clear.

Well, it is not trivial that Rust supports this. For example, the compiler sometimes has to insert hidden “drop flag” variables to remember whether a variable is initialized at run time, so that it can be correctly dropped, or not, when the function returns. An example of when this is necessary is:

let s1;
let s2: &str = if cond() {
    "hello"
} else {
    // s1 is initialized only on this path.
    s1 = String::from("goodbye");
    &s1
};
println!("{s2}");

(This kind of code is useful to minimize allocations when you sometimes have an existing value to borrow and sometimes don't.) The String in the variable s1 must be dropped at the end of this block if it was initialized and not if it wasn't. So, there is a drop flag associated with s1 to decide whether to do that.
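The conditional drop can be made visible with a type that counts its own drops. This is only a sketch of the observable behavior: the names Noisy and run are invented for the example, and whether the compiler uses a drop flag here is an implementation detail that the code cannot show directly.

```rust
use std::cell::Cell;

// Hypothetical type that counts how many times a value of it is dropped.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns how many `Noisy` values were dropped when `_slot` went out of scope.
fn run(init: bool) -> u32 {
    let drops = Cell::new(0);
    {
        let _slot;
        if init {
            // Only this path initializes `_slot`.
            _slot = Noisy { drops: &drops };
        }
        // At the end of this block `_slot` is dropped only if it was
        // initialized, which is known only at run time.
    }
    drops.get()
}

fn main() {
    println!("initialized: {} drop(s)", run(true)); // prints "initialized: 1 drop(s)"
    println!("uninitialized: {} drop(s)", run(false)); // prints "uninitialized: 0 drop(s)"
}
```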

Thank you for the example!

In terms of having a mental model of what Rust can handle as a typical user, "Rust tracks when variables are/aren't initialized, when possible" is pretty straightforward. This is a good example of when implementing that can be tricky.

Even in Kornel's example with the loop { } - it's easy for me to see why that should work, though I don't know if I'd call static analysis over a loop trivial in any sort of circumstance. I can remember doing those sorts of proofs by hand in university, and those weren't the worst to wrap your head around, but when you're designing an algorithm that can handle arbitrary code, I'm sure that's no "trivial" job :slightly_smiling_face:

I appreciate you addressing some of the subtler points involved though, since, in my original question, I was curious about the "trickiness" of being able to handle those sorts of situations.

It can't handle arbitrary code.[1] For example,

let cond = ...;
let x;
if cond {
    x = String::from("hello");
}
if cond {
    println!("{x}");
}

This program "obviously" will never access uninitialized data, but the compiler doesn't accept it. The general principle here is: the compiler looks at all possible control flow, where the definition of “possible” ignores all data values — it doesn't care when an if or match will choose one branch or the other, and thinks about the program as if all branches were if flip_a_coin() {...
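One possible way to restructure that snippet so the compiler accepts it is to move the "maybe initialized" state into the data with Option. This is a sketch under assumptions: greet is a made-up function name, and cond is taken as a plain bool parameter since the original leaves it elided.

```rust
// Hypothetical function standing in for the rejected snippet.
fn greet(cond: bool) -> Option<String> {
    // `Option` records at run time whether a value exists, so the compiler
    // no longer has to prove that the two `if cond` checks agree.
    let x = if cond { Some(String::from("hello")) } else { None };
    if let Some(text) = &x {
        println!("{text}");
    }
    x
}

fn main() {
    println!("{:?}", greet(true)); // prints "hello", then Some("hello")
    println!("{:?}", greet(false)); // prints None
}
```

The trade-off is a run-time check in place of a compile-time proof, which is often acceptable when the two conditions cannot be merged into one branch.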


  1. This is, in general, how compilers dodge the halting problem (which has as an implication that almost all questions about programs cannot be answered for all programs): there is always a limit where the compiler will reject some valid-in-principle programs. We programmers then have the job of writing programs that don't just compute what we want, but also do it in a way that is comprehensible to the compiler. Luckily, this works well together with writing programs in a way that is comprehensible to other humans, even if neither is a subset of the other.
