Basically, the idea is that there is no behavioral difference between raw pointers and Rust references; the former need to follow the rules of the latter, even though these rules are no longer checked at compile time. In other words, Stacked Borrows is "just" a model where Rust's static borrowck rules also apply to raw pointers, even though these cannot be statically checked.
- Hence the usefulness of the Miri interpreter, which checks these properties at runtime.
So, basically, when a raw pointer (i.e., *const T, *mut T or ptr::NonNull<T>, which I will write as *T since there is no difference between them, except for variance (and NonNull not being, well, NULL)) is created, it has a "provenance":
fn shared_read_only<T> (it: &'_ T) -> *T { it as *const T as _ }
creates a pointer that, up until the last time it is used, asserts the immutability of the pointee, just as if the raw pointer were the shared reference it originates from (hence the term provenance).
-
fn unique_read_write<T> (it: &'_ mut T) -> *T { it as *mut T as _ }
creates a pointer that up until the last time it is used, asserts the absence of other "valid"/usable pointers to the pointee. Hence the pointee is read-only between writes of this pointer, and the writes are data-race free.
-
fn aliased_read_write<T> (it: &'_ UnsafeCell<T>) -> *T { it.get() as _ }
creates a pointer that asserts neither of the above; it only asserts that, whenever it is used, no other pointer is being used simultaneously / in parallel, so both the reads and writes are data-race free.
-
pointer "copies" are mainly (unchecked) reborrows, so the copied pointer inherits its parent's properties / assumptions and, in the case of the exclusive pointer, temporarily invalidates its parent pointer.
There must be a way for an exclusive_read_write pointer (the unique_read_write case above) to "decay" to an aliased_read_write one (in other words, an exclusive_read_write pointer could have copies aliasing each other), but that part is still a little fuzzy in my head. That's why in general I prefer to use explicitly UnsafeCell-ed pointees.
The whole idea of reborrowing is what leads to a stacked model.
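To make the three kinds of provenance above a bit more concrete, here is a minimal sketch that uses real pointer types in place of the *T shorthand. The function `provenance_demo` and its local names are made up for illustration, and the comments describe my understanding of the model rather than an official statement of it.

```rust
use std::cell::UnsafeCell;

// Concrete pointer types stand in for the `*T` shorthand used in the text.
fn shared_read_only<T>(it: &T) -> *const T {
    it as *const T
}

fn unique_read_write<T>(it: &mut T) -> *mut T {
    it as *mut T
}

fn aliased_read_write<T>(it: &UnsafeCell<T>) -> *mut T {
    it.get()
}

// Hypothetical helper: uses each pointer only in ways its provenance allows.
fn provenance_demo() -> (i32, i32, i32) {
    // Read-only provenance: only reads, and nothing writes while `p` is in use.
    let a = 1;
    let p = shared_read_only(&a);
    let read_a = unsafe { *p };

    // Unique provenance: `b` is not touched by anything else while `q` is in use.
    let mut b = 2;
    let q = unique_read_write(&mut b);
    unsafe { *q += 10 };
    let read_b = unsafe { *q };

    // Aliased provenance: two pointers to the same cell take turns writing,
    // never simultaneously.
    let c = UnsafeCell::new(3);
    let r1 = aliased_read_write(&c);
    let r2 = aliased_read_write(&c);
    unsafe {
        *r1 += 1;
        *r2 += 1;
        *r1 += 1;
    }
    let read_c = unsafe { *r2 };

    (read_a, read_b, read_c)
}
```

Interleaving writes through `r1` and `r2` is only acceptable here because both were derived from a `&UnsafeCell`; doing the same with two copies derived from `&mut` is the "fuzzy" part mentioned above.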
So, for instance, the following program is UB, according to Stacked Borrows:
let mut x = 42;
let at_x: *const i32 = &x; // shared_read_only
let _ = &mut x; // asserts/requires unique access to `x`, hence invalidates at_x
println!("{}", unsafe { *at_x }); // UB! use of invalid pointer
which can also be seen as:
let mut x = 42;
let at_x: *const i32 = &x; // 1. shared_read_only --------+
let _ = &mut x; // unique access? Not possible ---------->|
println!("{}", unsafe { *at_x }); // 2. used up until here <---------+
Which is the typical example of incompatible borrows in classic Rust. In other words, nobody is surprised that the following program does not compile:
let mut x = 42;
let at_x: &'_ i32 = &x; // 1. shared_read_only -----------+
let _ = &mut x; // unique access? Not possible ---------->|
println!("{}", *at_x); // 2. used up until here <---------+
This whole model was designed to obtain stronger non-aliasing guarantees, enabling more aggressive compiler optimisations (in the examples above, constant propagation could replace *at_x with 42, which is not always sound if somebody can get an exclusive, and thus mutable, reference to that 42).
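For contrast, here is a sketch of an ordering that, as far as I understand the model, stays within the rules: the raw pointer is used for the last time before anything asserts unique access. The function name `read_then_mutate` is made up for the example.

```rust
// Hypothetical helper showing a use of `at_x` that ends before the `&mut` borrow.
fn read_then_mutate() -> (i32, i32) {
    let mut x = 42;
    let at_x: *const i32 = &x; // shared_read_only

    // Last use of `at_x`: nothing has asserted unique access to `x` yet.
    let before = unsafe { *at_x };

    // From here on `at_x` is never used again, so invalidating it is harmless.
    let unique = &mut x;
    *unique += 1;

    (before, x)
}
```

This is the raw-pointer analogue of a shared borrow whose lifetime ends before the exclusive borrow starts, which is exactly what borrowck accepts for references.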
For instance, here is another example of UB:
let at_x_mut: &'static mut i32 = Box::leak(Box::new(42));
let at_x_raw: *mut i32 = &mut *at_x_mut; // exclusive_read_write
let do_stuff = unsafe {
// Safety: x is never freed, so at_x_raw never dangles (it's a `&'static mut i32`)
move || *at_x_raw = 0
};
let at_x: &'_ i32 = &*at_x_mut; // reborrow (and thus usage) of the reference `at_x_raw` originates from, thus `at_x_raw` gets invalidated
let result = delta(
&*at_x_mut,
do_stuff, // if called, invalidated `at_x_raw` gets used: UB
);
Indeed, here is the function delta:
fn delta (at_x: &'_ i32, do_stuff: impl FnOnce()) -> i32
{
let prev_x = *at_x;
do_stuff();
*at_x - prev_x
}
Given that at_x is a shared reference to a type not having shared mutability (no UnsafeCell), at_x is a reference to immutable memory, thus the compiler is free to assume that *at_x never changes and thus optimize delta into:
fn delta (at_x: &'_ i32, do_stuff: impl FnOnce()) -> i32
{
do_stuff();
0
}
thus getting result = -42 (that is, 0 - 42) or result = 0 depending on whether this optimization happened: UB.
So Stacked Borrows is just saying that this program is UB for the same reason that, had at_x_raw been a &'_ mut i32 reference, borrowck would not let that program compile.
The natural reaction at this point is:
what's the point of using raw pointers if we cannot escape the rules of Rust references?
To which there are two answers (not counting the motivation of this "stricter model" enabling more aggressive optimizations):
1 - Favor shared mutability (&UnsafeCell) over exclusive mutability (&mut)
- to get shared_read_writes rather than exclusive_read_writes
If you do not like that these optimizations can kick in and make your code UB, and you wish to be able to use raw pointers C-style (and honestly, this cautious approach should be chosen by everybody to start with), just avoid using &mut, and thus deriving raw pointers from it, by using UnsafeCell instead.
Indeed, the previous example can be made sound with the following pattern:
use ::core::cell::Cell;
fn main ()
{ unsafe {
let at_x_mut: &'static mut i32 = Box::leak(Box::new(42));
let at_x_cell: &'static Cell<i32> = Cell::from_mut(at_x_mut);
let at_x_raw: *mut i32 = &*at_x_cell as *const _ as _; // shared_read_write
let do_stuff = unsafe {
// Safety: x is never freed, so at_x_raw never dangles (it's a `&'static Cell<i32>`)
move || *at_x_raw = 0 // at_x_cell.set(0);
};
let at_x: &'_ Cell<i32> = &*at_x_cell; // `at_x_raw` does not get invalidated since it does not require uniqueness
let result = delta(
at_x,
do_stuff, // if called, `at_x_raw` is still valid when it gets used: no UB
);
}}
fn delta (at_x: &'_ Cell<i32>, do_stuff: impl FnOnce()) -> i32
{
let prev_x = at_x.get();
do_stuff();
at_x.get() - prev_x // cannot be optimized because the pointee is not immutable!
}
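As the `at_x_cell.set(0)` comment above hints, the raw pointer is not even needed in this particular case. Here is a sketch of the same pattern using only the safe Cell API; the wrapper function `run_delta` is a name I made up, and a local variable stands in for the leaked Box.

```rust
use std::cell::Cell;

fn delta(at_x: &Cell<i32>, do_stuff: impl FnOnce()) -> i32 {
    let prev_x = at_x.get();
    do_stuff();
    // The second read cannot be folded into `prev_x`: the pointee is not immutable.
    at_x.get() - prev_x
}

// Hypothetical driver: same shape as the example above, without any `unsafe`.
fn run_delta() -> i32 {
    let mut x = 42;
    // `Cell::from_mut` turns the exclusive borrow into a shared, mutable view.
    let at_x_cell: &Cell<i32> = Cell::from_mut(&mut x);
    let do_stuff = || at_x_cell.set(0);
    delta(at_x_cell, do_stuff)
}
```

Since the write goes through a `&Cell<i32>`, the result is the same whether or not the optimizer is involved.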
2 - Raw pointers can be useful to circumvent borrowck's overly conservative choices
- And when dealing with uninitialized memory, they avoid asserting the validity of the pointee; see uninit - Rust
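On the uninitialized-memory point, here is a small sketch of what that looks like in practice: fields are written through raw pointers, so no reference to not-yet-valid data is ever created. The `Pair` struct and `build_pair` function are invented for the example; `MaybeUninit` and `ptr::addr_of_mut!` are the standard library tools I am assuming are available (recent stable Rust).

```rust
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical struct, only here to have two fields to initialize.
struct Pair {
    id: u32,
    name: String,
}

fn build_pair() -> Pair {
    let mut slot = MaybeUninit::<Pair>::uninit();
    let p: *mut Pair = slot.as_mut_ptr();
    unsafe {
        // `addr_of_mut!` yields field pointers without creating a `&mut`
        // to uninitialized data, so no validity is asserted.
        ptr::addr_of_mut!((*p).id).write(7);
        ptr::addr_of_mut!((*p).name).write(String::from("seven"));
        // Both fields have been written, so the value is now fully initialized.
        slot.assume_init()
    }
}
```

Writing `(*p).name = ...` instead of `.write(...)` would try to drop the old (uninitialized) String first, which is why the raw `write` is used.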
Indeed, there are other coding patterns that Rust refuses to compile despite them being valid (mainly a move of a pointer being conservatively treated as a move / drop of the pointee; this is thus related to unsafe code relying on Pin):
For instance, the following program fails to compile:
let boxed_x = Box::new(42);
let at_x: &'_ i32 = &*boxed_x;
let new_boxed_x = boxed_x; // "move" pointer from one place of the stack to another, may even be a no-op.
assert_eq!(*at_x, 42); // Error
This is where raw pointers are useful:
let boxed_x = Box::new(42);
let at_x: *const i32 = &*boxed_x;
let new_boxed_x = boxed_x;
assert_eq!(unsafe { *at_x }, 42); // Should be fine?
Now, this is not yet officially sound, because Box itself asserts that it is not aliased, much like a &mut, and the move of a Box does count as a usage point that invalidates the borrow, so we are back to UB, for the same reasons that the previous program did not compile. Note that this is a WIP, and it may change. It is, for instance, the reason behind https://docs.rs/owning_ref being theoretically unsound, and also behind Miri going crazy with self-referential structs, such as those the compiler generates for an async fn's state / locals-that-survive-an-await-point (Miri and .await aren't best pals yet).
One way to make the previous code sound (and owning_ref too, for that matter) is to define one's own AliasedBox that does not assert uniqueness:
extern crate alloc;
use ::alloc::{
boxed::Box,
};
use ::core::{
mem,
ptr,
};
pub
struct AliasedBox<T : ?Sized> (
/// Does not assert non-aliasing for the **inner** life of this AliasedBox.
ptr::NonNull<T>, // covariance is fine because ownership
);
impl<T : ?Sized> From<Box<T>> for AliasedBox<T> {
fn from (boxed: Box<T>) -> Self
{
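// `Box::into_raw` never returns null (`Box::into_raw_nonnull` is not available on stable)
Self( unsafe { ptr::NonNull::new_unchecked(Box::into_raw(boxed)) } )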
}
}
impl<T : ?Sized> Drop for AliasedBox<T> {
// pointer must no longer be aliased at this point
fn drop (self: &'_ mut AliasedBox<T>)
{
unsafe {
drop::<Box<T>>(Box::from_raw(self.0.as_ptr()))
}
}
}
impl<T : ?Sized> AliasedBox<T> {
// Not a From impl because of `Box` being fundamental.
// pointer must no longer be aliased at this point
pub
fn into (self: AliasedBox<T>) -> Box<T>
{
unsafe {
Box::from_raw(mem::ManuallyDrop::new(self).0.as_ptr())
}
}
}
// Aliasing pointers can only be used for reads for the lifetime of the deref
impl<...> Deref ... { type Target = T; ... }
// Aliasing pointers cannot be used for the lifetime of the deref_mut
impl<...> DerefMut ... { ... }
The mental model justifying this (which, by the way, is what both Rc and Arc do, but with runtime checks) is that the exclusive_read_write pointer of the Box can be "mentally" downgraded to a shared_read_write for AliasedBox, from which multiple aliasing pointers can exist; at the end of life of the AliasedBox (be it to upgrade it back to Box or to drop it, which goes through the same upgrade), the pointer does assert non-aliasing, to upgrade back to the exclusive_read_write that Box needs.
And then:
let boxed_x: AliasedBox<i32> = Box::new(42).into();
let at_x: *const i32 = &*boxed_x;
let new_boxed_x = boxed_x; // does not drop, so no need for uniqueness
assert_eq!(unsafe { *at_x }, 42); // Is fine!!
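Since the Deref impl was elided above, here is a trimmed-down, self-contained sketch of AliasedBox that compiles on stable and runs the last snippet. It is one possible way to fill in the blanks, not necessarily the only one: DerefMut and the conversion back to Box are left out, and `aliased_box_demo` is a name made up for the example.

```rust
use std::ops::Deref;
use std::ptr::NonNull;

/// Owns a heap allocation without asserting non-aliasing while alive.
pub struct AliasedBox<T: ?Sized>(NonNull<T>);

impl<T: ?Sized> From<Box<T>> for AliasedBox<T> {
    fn from(boxed: Box<T>) -> Self {
        // SAFETY: `Box::into_raw` never returns a null pointer.
        Self(unsafe { NonNull::new_unchecked(Box::into_raw(boxed)) })
    }
}

impl<T: ?Sized> Deref for AliasedBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY: the allocation stays live until `drop`; aliasing pointers
        // may only be used for reads while this reference is alive.
        unsafe { self.0.as_ref() }
    }
}

impl<T: ?Sized> Drop for AliasedBox<T> {
    fn drop(&mut self) {
        // SAFETY: the pointer came from `Box::into_raw`, and by contract
        // no aliasing pointer is used from this point on.
        unsafe { drop(Box::from_raw(self.0.as_ptr())) }
    }
}

// Hypothetical driver mirroring the snippet above.
fn aliased_box_demo() -> i32 {
    let boxed_x: AliasedBox<i32> = Box::new(42).into();
    let at_x: *const i32 = &*boxed_x;
    let new_boxed_x = boxed_x; // moves a raw pointer: no uniqueness asserted
    let value = unsafe { *at_x }; // last use of `at_x`, before the drop
    drop(new_boxed_x); // uniqueness is asserted only here
    value
}
```

The important ordering is that `at_x` is read before `new_boxed_x` is dropped; using it afterwards would be a plain use-after-free, whatever the aliasing model says.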