Why doesn't the lifetime prevent the assignment in this case?

fn main() {
    struct A<T: 'static + ?Sized>(&'static T);

    impl<T: 'static + ?Sized> A<T> {
        fn get<'a>(&'a self) -> &'a &'a T {
            &self.0
        }

        fn set<'a>(&'a mut self) -> &'a mut &'a T {
            unsafe { &mut *(&mut self.0 as *mut &T as *mut () as *mut &T) }
        }
    }

    let s = Box::leak(Box::new(vec![1, 2, 3, 4]));
    let mut a = A(s);
    println!("{:?}", a.set());
    {
        let c = Box::new(vec![3, 4, 5, 6, 7, 8]);
        let ptr = &*c as *const Vec<i32>;
        *a.set() = &c;  //Obviously the lifetime of `c` doesn't live as long as 'a, but why the assignment is permitted?
        drop(c);  //Why can we drop `c` here? Doesn't `&c` has been captured by `a`?
        println!("{:?}", unsafe { &*ptr });  //Why this dangling reference still be able to read the released value in runtime?
    }
    println!("{:?}", a.get());
}

Here is the output, which surprisingly seems to work normally?!

[1, 2, 3, 4]
[3, 4, 5, 6, 7, 8]
[3, 4, 5, 6, 7, 8]

Could anyone explain this behavior? Thanks.

Your set is unsound, so you're seeing undefined behavior. But I'm not sure how to explain what you're doing.

I'm just trying to convert &'static T to &'a T XD

But in the process you're allowing a reference with a shorter lifetime to be assigned to a reference that could live forever.

Unsafe lets you violate the lifetime rules, which is why it's so dangerous.

Yes, I know, but in this example I could manually guarantee that a lives through the whole program; it's just for testing. The point is: why isn't this assignment prevented by rustc?

The lifetime 'a is just the duration of the borrow for the call to set, which isn't very long.

I'm sorry, but I don't think that is the key to why rustc allowed this assignment; obviously, in this scope c lives shorter than a.

But the lifetime 'a of the borrow involved is less than the lifetime of the variable a.

The last use of a is before the drop of c, so there's no problem.

Nope, I used a here again:

    //...
        drop(c);  //Why can we drop `c` here? Doesn't `&c` has been captured by `a`?
        println!("{:?}", unsafe { &*ptr });  //Why this dangling reference still be able to read the released value in runtime?
    }
    println!("{:?}", a.get()); //I used `a` here again!
}

But I bound 'a to a by using &'a mut self. The following shows it (this is not in the code I posted initially).
[image]

That just means that the variable a must outlive the reference created by set. But that reference is gone the moment you assign to it here:

    *a.set() = &c;

The lifetime of c remains unrelated to the lifetime of a, because the borrow is finished, and there is nothing left with the lifetime 'a.

I had a similar confusion when I started, thinking that lifetimes were a property of a variable. Variables can have lifetimes, if it's part of their type, but none of these do. The lifetime 'a only refers to the time that the borrow lasts, which must not outlive the variables a or c. That is satisfied in your code, because both outlive the one line of code in which the borrow occurs. (Or rather, in which the reference exists.)
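
A minimal sketch of this point in safe Rust, assuming a hypothetical helper first_mut that is not part of the original code: the lifetime 'a only needs to cover the statement in which the returned reference is used, so the vector is free to be used again right afterwards.

```rust
// Hypothetical helper (not from the original post): returns a mutable
// reference whose lifetime is tied to the borrow of `v`.
// Panics if `v` is empty.
fn first_mut<'a>(v: &'a mut Vec<i32>) -> &'a mut i32 {
    &mut v[0]
}

fn demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3];
    // The borrow `&mut v` (lifetime 'a) starts and ends within this statement.
    *first_mut(&mut v) = 10;
    // `v` is no longer borrowed here, so it can be mutated again.
    v.push(4);
    v
}

fn main() {
    println!("{:?}", demo());
}
```

The variable v lives until the end of demo, but the borrow passed to first_mut does not; the two durations are independent.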

What’s happening here

*a.set() = &c;

is: We have a function

fn set(self: &'a mut A<Vec<i32>>) -> &'a mut &'a Vec<i32>

and a value a: A<Vec<i32>>, as well as c: Box<Vec<i32>>.

The expression a.set() desugars to A::set(&mut a), and &mut a has type &'short1 mut A<Vec<i32>> for some short lifetime 'short1. Also, &c has type &'short2 Box<Vec<i32>> for some short lifetime 'short2. It gets converted implicitly to &'short2 Vec<i32> (by dereferencing) due to the assignment to *A::set(&mut a) expecting a &Vec<i32>.

Now, since &mut a is a &'short1 mut A<Vec<i32>>, the function A::set returns a &'short1 mut &'short1 Vec<i32>, and you can – in principle – only assign a &'short1 Vec<i32> to this reference.

What you are assigning is a &'short2 Vec<i32>, so the compiler infers a constraint 'short2: 'short1 that must be met. Why would this be enough? Well, if 'short2: 'short1 (read “'short2 outlives 'short1”), then &'short2 Vec<i32> can be coerced into &'short1 Vec<i32>; the result of the coercion could be assigned to *A::set(&mut a)!

But is this constraint actually problematic? No! 'short1 and 'short2 are both the lifetime of some short-lived borrows being constructed here; they might as well be the same lifetime to begin with. And even when you start out with two unrelated lifetimes, the following still compiles:

struct A<T: 'static + ?Sized>(&'static T);

impl<T: 'static + ?Sized> A<T> {
    fn get<'a>(&'a self) -> &'a &'a T {
        &self.0
    }

    fn set<'a>(&'a mut self) -> &'a mut &'a T {
        unimplemented!()
    }
}

fn demonstration<'short1, 'short2>(a_ref: &'short1 mut A<Vec<i32>>, c_ref: &'short2 Vec<i32>) {
    *A::set(a_ref) = c_ref
}

Why? Because a_ref can be coerced into &'even_shorter mut A<Vec<i32>> by re-borrowing … implicitly … before it’s passed to A::set, for any even shorter lifetime 'even_shorter such that 'short1: 'even_shorter. And similarly, c_ref can be coerced into &'even_shorter Vec<i32> as long as 'short2: 'even_shorter is ensured. Now the lifetimes match, and everything compiles fine.
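
The two implicit coercions described above can be written out explicitly. This is only a sketch; the helper names shorten and reborrow are made up for illustration and do not appear in the original code.

```rust
// Hypothetical helper: a shared reference can be shortened from 'long to
// any 'short it outlives. The compiler normally does this implicitly.
fn shorten<'long: 'short, 'short, T: ?Sized>(r: &'long T) -> &'short T {
    r
}

// Hypothetical helper: a mutable reference can be re-borrowed for a
// shorter lifetime. `&mut *r` is the explicit form of the implicit re-borrow.
fn reborrow<'long: 'short, 'short, T: ?Sized>(r: &'long mut T) -> &'short mut T {
    &mut *r
}

fn main() {
    let mut x = 1;
    // The re-borrow only lasts for this statement.
    *reborrow(&mut x) += 1;
    // Afterwards `x` can be borrowed again, here as a shared reference.
    let y = *shorten(&x);
    println!("{}", y);
}
```

In the demonstration function, the compiler picks one 'even_shorter lifetime and applies both of these steps for you.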


Really, ordinarily a reference foo: &'b mut &'c T allows assigning any &'d T to its target, as long as the lifetime of 'd is long enough, more precisely as long as 'd: 'c. So there’s a lower bound – crucial for soundness – of how long-lived any reference assigned to the target of foo must be at least.
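
As a sketch of that lower bound in safe code (the function store and the static LONG_LIVED are illustrative names, not from the original post), the bound 'd: 'c is exactly what makes the assignment type-check:

```rust
// Hypothetical helper: assigns `value` through `slot`. The bound `'d: 'c`
// says the new reference must live at least as long as 'c.
fn store<'b, 'c, 'd: 'c, T: ?Sized>(slot: &'b mut &'c T, value: &'d T) {
    *slot = value;
}

static LONG_LIVED: i32 = 1;

fn demo() -> i32 {
    let local = 2;
    // `current` starts out pointing at a local variable.
    let mut current: &i32 = &local;
    // A &'static i32 outlives any 'c, so it may be stored in the slot.
    store(&mut current, &LONG_LIVED);
    *current
}

fn main() {
    println!("{}", demo());
}
```

Going the other way, storing a reference that lives shorter than 'c, is what the borrow checker rejects.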

The devastating thing that your unsound set method now offers is: It returns a &'a mut &'a T for any given &'a mut self. The borrow of self now can be arbitrarily short, and the second 'a in the return type plays the role of a lower bound, as explained above. In effect, the returned reference places an arbitrarily short lower bound on the lifetime of any reference that’s allowed to be assigned to its target, so effectively it doesn’t place any restriction at all anymore. With the function signature of the function <A<T>>::set in your code, you allow any &T reference of any lifetime to be assigned.
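
For comparison, here is a sketch of a setter that keeps the lower bound intact. It is one possible sound signature, not necessarily what the original poster wanted: the return type is &'a mut &'static T, so it needs no unsafe and only 'static references can be assigned.

```rust
struct A<T: 'static + ?Sized>(&'static T);

impl<T: 'static + ?Sized> A<T> {
    fn get<'a>(&'a self) -> &'a &'a T {
        &self.0
    }

    // The inner lifetime stays 'static, so the lower bound is not weakened.
    fn set<'a>(&'a mut self) -> &'a mut &'static T {
        &mut self.0
    }
}

fn demo() -> Vec<i32> {
    let first: &'static Vec<i32> = Box::leak(Box::new(vec![1, 2, 3, 4]));
    let mut a = A(first);
    // A leaked allocation is never freed, so this reference really is 'static.
    let second: &'static Vec<i32> = Box::leak(Box::new(vec![3, 4, 5, 6, 7, 8]));
    *a.set() = second;
    (**a.get()).clone()
}

fn main() {
    println!("{:?}", demo());
}
```

With this signature, writing *a.set() = &c; for a local Box c is rejected by the compiler, because &c cannot be a 'static reference.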


My personal suggestion: The subtleties of lifetimes in Rust are hard. Writing a sound API with lifetimes while using unsafe code is very nontrivial. (And by the way, writing sound Rust code when using raw pointers is also quite nontrivial.) If you want to feel empowered by Rust and not confused, just don’t use unsafe code :wink:


Regarding runtime behavior, your question “Why this dangling reference still be able to read the released value in runtime” is easy to answer. For one, things aren’t even working out properly: if you run your code in the playground, you don’t get [3, 4, 5, 6, 7, 8] but other, random numbers. Practically speaking, you’re reading heap memory after it has been freed, which means the allocator may already have written its own bookkeeping data there. Maybe it works differently on your platform and the allocator there stores its information elsewhere, who knows.

In principle, however, what you’re observing here is undefined behavior. You’re violating the basic rules of unsafe Rust and doing something that’s forbidden (reading memory after it’s freed), so the behavior of the program could really be anything. It could crash, give random results, make demons fly out of your nose, or, as you observed on your computer, behave surprisingly unremarkably and hide the fact that with any change of platform or Rust version things could start to go horribly wrong.

Using unsafe amounts to promising “I’m an advanced programmer in need of a really mighty language feature; in turn I don’t expect the compiler to help me catch subtle errors when I make them, and I live with the consequences”. You’re getting no less than you promised to expect: no help from the compiler, and a subtle error with subtle consequences. Fortunately there’s help in Rust. If you do a test run of your program in miri (also available under “Tools” in the playground), you will get an error message saying that your program does “dereference a pointer after its allocation got freed”.

    Finished dev [unoptimized + debuginfo] target(s) in 2.53s
     Running `/playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri target/miri/x86_64-unknown-linux-gnu/debug/playground`
error: Undefined Behavior: pointer to alloc3972 was dereferenced after this allocation got freed
  --> src/main.rs:22:35
   |
22 |         println!("{:?}", unsafe { &*ptr });  //Why this dangling reference still be able to read the released value in runtime?
   |                                   ^^^^^ pointer to alloc3972 was dereferenced after this allocation got freed
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           
   = note: inside `main` at src/main.rs:22:35
   = note: inside [……]

error: aborting due to previous error

Thank you very much for the detailed explanation. :grinning:
But I have another question: how can I make a self-referential structure safely?

It's not possible to make a self-referential struct in safe Rust, at least not if you are using references. (It's possible if you use e.g. indexes into a vector or Rc instead of references.)
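
A minimal sketch of the index-based approach, assuming a made-up Document type that wants to remember one of its own elements: instead of holding a reference into its own vector, the struct stores the element's position.

```rust
// Hypothetical type for illustration: it "points" at one of its own
// words by index rather than by reference.
struct Document {
    words: Vec<String>,
    // Index into `words`; a `&String` field here would be self-referential.
    longest: usize,
}

impl Document {
    // Returns None when `words` is empty, since there is nothing to index.
    fn new(words: Vec<String>) -> Option<Self> {
        let longest = words
            .iter()
            .enumerate()
            .max_by_key(|(_, w)| w.len())
            .map(|(i, _)| i)?;
        Some(Document { words, longest })
    }

    // The reference is created on demand and is tied to the borrow of `self`.
    fn longest(&self) -> &str {
        &self.words[self.longest]
    }
}

fn main() {
    let words = vec!["a".to_string(), "abc".to_string(), "ab".to_string()];
    let doc = Document::new(words).expect("non-empty input");
    println!("{}", doc.longest());
}
```

The trade-off is that the index has to be kept valid by hand if the vector is ever modified, which is why this sketch keeps both fields private to the type.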
