Why is an immutable reference to a vector not allowed in this case?

Suppose you have a mutable vector v of some type T with a few elements.

v is a stack variable containing 3 machine words (pointer to the element buffer on the heap, capacity, and length). The elements live in a slice in heap memory.
I create an immutable reference to vector v; the reference contains the address of variable v (as the debugger says).
Via deref coercion I can have direct read-only access to the elements.

So far, so good. When I push more elements to the vector, the vector allocates a new buffer on the heap.
Now I want to access the elements via my immutable reference, but Rust refuses to compile the code.

Even if a new buffer is allocated at a different address on the heap, the reference should be able to read the new slice via automatic deref coercion. Note: the reference contains the address of v on the stack, not the address of the former, now invalid slice.

If my reference pointed to one or more elements, e.g. via let myref = &v[0], it would point directly to heap memory, so the compiler would reject my code, because the memory pointed to may be invalid.
But in my case, using &v would be deref coerced again when used, and I should be taken to the new address of the slice.

Can someone explain this restrictive behavior of the compiler?


The rule is that whenever something is borrowed, it doesn't change (except through the reference if mutable). Once v is borrowed, no distinction is made between the implications of reference to v itself, and to its elements.

This is so that any code which receives any kind of &T reference can freely access its (transitive) contents without thinking about any interactions with mutations; and this applies both to the programmer working with Rust code, and the optimizing compiler turning it into efficient machine code. (The whole Rust borrowing paradigm is about giving both of those parties the benefit of no mutation surprises.)

Another way of looking at it, not involving any transitive properties, is to notice that when you push to the Vec, the Vec's len field is mutated; that itself is directly prohibited by the Rust memory model — but if the length were stored in the heap instead, that would still not make it okay.

The only time you can have a collection which is mutated while an & to it is held is when the collection is interior mutable — and the tradeoff there is that you can't have an & reference to an element (unless the collection is append-only and designed to support this).
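To make the interior-mutability tradeoff concrete, here is a minimal sketch that uses RefCell<Vec<i32>> as the interior-mutable collection. The helper names push_shared and first are made up for illustration. The push goes through a plain & reference, but elements can only be reached through a temporary guard, and a conflicting borrow is caught at run time instead of compile time.

```rust
use std::cell::RefCell;

/// Pushes through a shared reference; `RefCell` checks the borrow at run time.
fn push_shared(v: &RefCell<Vec<i32>>, x: i32) {
    v.borrow_mut().push(x);
}

/// Returns a copy of the first element. A plain `&i32` could not be returned,
/// because it would have to outlive the `Ref` guard created by `borrow()`.
fn first(v: &RefCell<Vec<i32>>) -> Option<i32> {
    v.borrow().first().copied()
}

fn main() {
    let v = RefCell::new(vec![1, 4, 9]);
    let shared = &v; // shared reference held across the mutation

    push_shared(shared, 16);
    println!("{:?}, first = {:?}", shared.borrow(), first(shared));

    // While a read guard is alive, asking for write access fails at run time.
    let guard = shared.borrow();
    assert!(shared.try_borrow_mut().is_err());
    drop(guard);
}
```

The aliasing rule is still there; it has moved from the type checker to a run-time flag inside the RefCell.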


So the code I described wouldn't, in theory, cause any safety issues, but Rust does not want me to be surprised when my reference reads the new vector elements after the slice has changed?

Picky, picky...

Thanks for your answer!

References from a Vec are basically just pointers pointing directly at the Vec's heap-allocated elements, and pushing a new element into the Vec might require a realloc which would invalidate all those pointers. So the requirement isn't just Rust being needlessly picky about things, because keeping those pointers valid at all times is very important.

Whether Vec implements deref doesn't really factor into it.


The immutability rules for & references enable some important compiler optimizations that would be unsound if your code were allowed. In particular, the compiler may choose to keep the heap pointer in a register instead of reading it off of the stack for every access; if your code were allowed, this would be much more complicated to accomplish, and outright impossible in some situations.


Not particularly picky. Rust just doesn't distinguish between "stack" and "heap" references, and lacking that, it's not possible to even express your wish using the type system.

Having that property encoded in the type would cause nothing but pain, by the way. It would, for example, prevent you from storing your data in arbitrary places except for "the" heap as defined by the built-in allocator. It would also by and large make it impossible to encode the borrowing of elements in a heap-allocating collection, because such references point to the heap but are still derived from a stack reference.

Your code is very unconventional, judging by the description. This simply never comes up in practice if you are writing idiomatic code. If you aren't touching the elements in the meantime, why do you insist on keeping the reference?


Yes, I agree. And I stated this in my question. A &Vec is coerced to &[T], so I access heap memory that may be reallocated. And Rust prevents me from accessing freed memory after my vector slice has been extended and reallocated after some pushes.

But my debugger shows the stack address of my vector variable as the content of the reference variable during the whole lifetime of the reference. And I assumed that every time I use this reference, it will be deref coerced to a pointer to the actual elements, reallocated or not.

I believe you're asking about a program like

// types and conversions made explicit for clarity
fn main() {
    let mut v: Vec<i32> = vec![1, 4, 9];
    
    let v_ref: &Vec<_> = &v;
    println!("&v_ref: {:?}", v_ref as &[i32]); // A
    
    v.push(16); // error[E0502]: cannot borrow `v` as mutable because it is also borrowed as immutable

    println!("&v_ref: {:?}", v_ref as &[i32]); // B
}

and that your question boils down to why A is allowed but B is not. If that's the question, then the types and locations don't really matter as much as it might feel like they do: this is actually about the semantics of borrowing in Rust, and not about what types or values are actually borrowed.

Rust's ownership system is meant, in part, to address two categories of programming mistake:

  • Use-after-free, where a reference to a value is mistakenly written to or read after the value it references has been destroyed, and
  • Aliasing bugs, where two references are mistakenly assumed to point to distinct values when they may actually be pointing to the same value.

The way that Rust does this is with a pair of rules:

  • No value can be moved or dropped when a live reference exists, and
  • Mutable references to a value cannot be live at the same time as any other reference to the same value.

In the example program above, v_ref - an immutable reference to v - must necessarily exist until the end of B, because it's used in B. The call to v.push, however, requires a mutable reference to v (so that the vector can be mutated). Rust's rules prohibit creating that mutable reference while v_ref exists, as above.

The program I've written is "reasonable" in so far as there's no way for the aliasing between v_ref and the short-lived mutable reference to v created by v.push to cause any actual incorrect reads, but Rust doesn't know that. The language prohibits this categorically, and not based on any analysis of the specific references and values involved.


If you have cached a reference to the Vec, then that won't deref-coerce to a different buffer when you access it. I.e., if you say:

let v = vec![...];
let s: &[_] = &v;

then the only sensible thing that should happen is that the deref coercion happens where and when you create the coerced variable. Right there and then. Not every time you access the already-created variable s. Why should merely accessing a slice magically re-assign it to a different address? That doesn't make any sense, and even if it did make sense, it would be way too magical to be acceptable.

Furthermore, it's impossible to implement. If this were how slices worked, then every slice would have to know what collection it came from. It would have to know when it came from an array, a Vec, a VecDeque, a boxed slice, or MyOwnSliceBackedCollection. That's just infeasible – the compiler wouldn't – and couldn't possibly – know of the reallocation semantics of every custom type ever.
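As a rough illustration of the difference between the two kinds of reference, the sketch below compares the address stored in a &Vec with the address stored in a &[T] made from it. The function name addresses is hypothetical, and the casts to usize are only there so the addresses can be printed and compared.

```rust
/// Returns (address held by the `&Vec`, address held by a `&[i32]` coerced from it).
fn addresses(v: &Vec<i32>) -> (usize, usize) {
    let s: &[i32] = v; // the deref coercion runs here, once
    (v as *const Vec<i32> as usize, s.as_ptr() as usize)
}

fn main() {
    let v = vec![1, 4, 9];
    let (header_addr, buffer_addr) = addresses(&v);

    // The slice reference holds whatever buffer pointer the Vec held
    // at the moment the coercion ran.
    assert_eq!(buffer_addr, v.as_ptr() as usize);

    // The `&Vec` holds the address of the three-word header instead,
    // which is what the debugger shows.
    assert_eq!(header_addr, &v as *const Vec<i32> as usize);

    println!("header at {:#x}, buffer at {:#x}", header_addr, buffer_addr);
}
```

Once s exists it is just a pointer and a length; nothing in it refers back to the Vec it came from.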

The allocator might be able to keep all your Vec's elements in the same place after running them through a realloc to make room for new elements (which keeps all those pointer addresses unchanged, as shown in your handy debugger), or it might not. Because it's not guaranteed, the compiler has to assume they might get shuffled around from one element push to the next.

Aaaaahhh...this could be the explanation. Under the hood the compiler accesses the former heap pointer possibly stored in a register.
Tricky, Tricky...

In order to be safe, the compiler must be restrictive by default and only selectively allow the behaviour that can be proven safe. In your case the behaviour could be proven safe, but arguably we gain almost nothing by allowing it: it would complicate the rules and the compiler, while only allowing weird code that already has simple workarounds (just recreate the reference).
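For reference, "recreate the reference" can look like the sketch below. The function name lens_around_push is invented for this example. The first shared borrow ends at its last use, so the push is accepted, and a fresh borrow is taken afterwards.

```rust
/// Takes a shared borrow, lets it end, pushes, then borrows again.
/// Returns the lengths seen through the first and the second borrow.
fn lens_around_push(v: &mut Vec<i32>, x: i32) -> (usize, usize) {
    let first: &Vec<i32> = &*v;
    let before = first.len(); // last use of `first`, so its borrow ends here
    v.push(x); // the mutable borrow is accepted now
    let second: &Vec<i32> = &*v; // recreate the reference
    (before, second.len())
}

fn main() {
    let mut v = vec![1, 4, 9];

    let v_ref = &v;
    println!("{:?}", v_ref); // A: last use of the first borrow

    v.push(16);

    let v_ref = &v; // a new borrow, taken after the push
    println!("{:?}", v_ref); // B

    let (before, after) = lens_around_push(&mut v, 25);
    println!("len before = {}, len after = {}", before, after);
}
```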


1000 thanks to you all!! Now I see clearly. Of course the code I presented here is silly. But I come from C/C++ and I always try to understand every single peculiarity of programming. So it's me who is picky, picky.


You are completely right when you declare your reference as &[_]. This is a reference to the slice on the heap, and the address stored will not change. But I talked about a reference of type &Vec, and every time I use this reference, I theoretically get the invariant address of the 3-field vector variable, which will be coerced to the one pointing to the actual underlying slice.


It's a fundamental rule about &T that it means the T won't be mutated, and you are free to read it, and anything else you can obtain via it, any time and (logically if not literally) cache the results. Therefore, having a &Vec<T> should be understood to already incorporate all the implications of having the &[T] you can get from it, because you might have already gotten it.[1]


  1. Well, sort of. Deref implementations aren't guaranteed to be stable — they can return a different reference each time they're called. But there aren't very many ways to take advantage of that constructively, and this is more of a semantic “bug” than a “feature”. ↩︎
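A contrived sketch of what the footnote describes: the type Flip below is hypothetical and exists only to show that nothing stops a Deref implementation from handing out a different reference on each call. It uses Cell so that deref, which only gets &self, can still flip a flag.

```rust
use std::cell::Cell;
use std::ops::Deref;

/// Hypothetical type whose `deref` alternates between two fields.
struct Flip {
    a: i32,
    b: i32,
    use_a: Cell<bool>,
}

impl Deref for Flip {
    type Target = i32;

    fn deref(&self) -> &i32 {
        let use_a = self.use_a.get();
        self.use_a.set(!use_a); // interior mutability: flip the flag on every call
        if use_a { &self.a } else { &self.b }
    }
}

fn main() {
    let f = Flip { a: 1, b: 2, use_a: Cell::new(true) };

    // Two consecutive dereferences of the same value disagree.
    let x = *f;
    let y = *f;
    assert_ne!(x, y);
    println!("first read: {}, second read: {}", x, y);
}
```

Vec does not behave like this: its deref returns the current buffer every time, which is why the borrow rules and not Deref are what matter in this thread.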


In that case I suspect we need to go back to the core issue: it's not possible to write a compiler that would accept all “safe” code and reject all “unsafe” code (for any definition of “safe” and “unsafe”).

And it's not “we don't know how to do that yet” or “we couldn't do that because it would be too slow” or anything like that. Nope. That's just simply fundamentally not possible. Got it?

And then Rust's solution: if we can't create a compiler that accepts all “safe” code and rejects all “unsafe” code, then we can, at least, split the language in two parts:

  1. Normal, “safe” Rust accepts only “safe” code, but also rejects some safe code.
  2. On the flip side unsafe Rust accepts all imaginable “safe”, valid, code, but also accepts some code which simply doesn't work.

And then the answer to your question is obvious: your construct is not allowed because it's not beneficial. It's not clear what idioms such a change to the rules would enable, or why you couldn't write your program in some other fashion with similar efficiency properties.


Yeah, this is another point to consider. Instead of pushing new values to the vector, I could move it to another Vec instance, so the moved-from vector becomes invalid and the reference is dangling. I forgot to think of this possibility.
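A small sketch of the move case, with a made-up function name read_then_move. The move compiles only because the reference is no longer used afterwards; the comment marks the point where the compiler would object.

```rust
/// Borrows `v`, finishes using the borrow, then moves `v` to a new binding.
fn read_then_move(v: Vec<i32>) -> (usize, Vec<i32>) {
    let v_ref = &v;
    let len = v_ref.len(); // last use of `v_ref`
    let moved = v; // allowed: no live borrow of `v` remains

    // Reading `v_ref` after this point would make the move above fail with
    // error[E0505]: cannot move out of `v` because it is borrowed
    (len, moved)
}

fn main() {
    let (len, moved) = read_then_move(vec![1, 4, 9]);
    println!("len = {}, moved = {:?}", len, moved);
}
```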

The replies from @derspiny and @2e71828 are the ones making things clearer for me.

Thanks for all the effort you all put into helping me. :+1:

Thanks for this

This is eye opening, Thanks
