If you run the above with miri, you will see that it encounters UB. As I understand it, this is because when we call get_value, we create a unique reference to SelfRef, and hence we assert exclusive access to the struct, including all of its fields. This invalidates the *const u64 we stored in the struct, as it was created before we asserted unique access to value.
Note that in the above I rely on the fact that an &mut UnsafeCell&lt;T&gt; still asserts exclusive access to the T. As far as I can see, this is indeed the case. UnsafeCell turns off the opposite assumption: an &UnsafeCell&lt;T&gt; does not assert that the T is immutable.
What does the compiler even do when it constructs a future with async/await? As far as I can see there is simply no way to fix the above code.
(Note that if we allow boxing of the value field, it is ok as the &mut SelfRef does not assert exclusive access through raw pointers recursively.)
As far as making miri happy is concerned, something like
fn get_value(self: Pin<&mut Self>) -> u64 {
    let me = unsafe { self.get_unchecked_mut() };
    let ptr = me.ptr;
    assert!(!ptr.is_null());
    // Casting `me` to a raw pointer after reading `ptr` is what appeases Miri here.
    let _me = me as *mut _;
    unsafe { *ptr }
}
is enough.
Furthermore, to find an approach that would also satisfy my personal intuition of “sound unsafe code”, one could go this far:
fn get_value(self: Pin<&mut Self>) -> u64 {
    let me = unsafe { self.get_unchecked_mut() };
    let ptr = me.ptr;
    assert!(!ptr.is_null());
    // Round-trip both pointers through usize so no pointer provenance survives.
    let _me = me as *mut _ as usize;
    let ptr_usize = ptr as usize;
    unsafe { *(ptr_usize as *const u64) }
}
Also, I really have to put reading up on the details of Stacked Borrows higher on my TODO list.
Does replacing ptr_usize with _me + ptr_usize - _me (all of these are integers!) really make a semantic difference? Also, this is somewhat of an LLVM question now, right? (Plus the question of what kind of LLVM IR Rust emits...)
Ah, okay... well, I was thinking ahead here to a situation where we don’t know anymore whether ptr points directly to a field of me or to some (other) heap location instead. I can imagine self-referential structs where both possibilities can happen.
Just casting to integers and back to pointers seems incredibly sketchy to me. Anyway, even if we considered adding offsets to me for this case, there are other cases that also ought to work when pinned, but have the same issue without this workaround.
use std::cell::UnsafeCell;
use std::pin::Pin;
use std::marker::PhantomPinned;

struct PairSelfRef {
    value: UnsafeCell<u64>,
    ptr: *const u64,
    _pin: PhantomPinned,
}

fn link_up(left: Pin<&mut PairSelfRef>, right: Pin<&mut PairSelfRef>) {
    let left = unsafe { left.get_unchecked_mut() };
    let right = unsafe { right.get_unchecked_mut() };
    left.ptr = right.value.get();
    right.ptr = left.value.get();
}

impl PairSelfRef {
    fn new(value: u64) -> Self {
        Self {
            value: UnsafeCell::new(value),
            ptr: std::ptr::null(),
            _pin: PhantomPinned,
        }
    }

    fn get_value(self: Pin<&mut Self>) -> u64 {
        let me = unsafe { Pin::into_inner_unchecked(self) };
        assert!(!me.ptr.is_null());
        unsafe { *me.ptr }
    }
}

fn main() {
    let pair_1 = PairSelfRef::new(10);
    let pair_2 = PairSelfRef::new(20);
    futures::pin_mut!(pair_1);
    futures::pin_mut!(pair_2);
    link_up(pair_1.as_mut(), pair_2.as_mut());
    println!("{}", pair_1.get_value());
    println!("{}", pair_2.get_value());
}
You may also want to use the aliasable crate, which provides an AliasableBox, so you can reduce your usage of unsafe and not have to implement Send + Sync manually.
This issue is also being tracked in the UCG repository. I opened a similar issue on Miri, where I managed to satisfy it using a workaround similar to @steffahn's, but RalfJung did mention that it wasn't a valid way to implement self-referential types. So I don't think this is possible soundly (even if Miri does not complain).
FWIW, there's now a way to test this: the -Zmiri-track-raw-pointers flag. It will not work on code that uses int-to-ptr casts, but for raw-pointer-only code it tracks provenance properly.
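For anyone wanting to try it: assuming a Cargo project with Miri installed, passing the flag might look like this (flags are handed to Miri via the MIRIFLAGS environment variable):

```shell
# Run the example under Miri with raw-pointer provenance tracking enabled.
MIRIFLAGS="-Zmiri-track-raw-pointers" cargo miri run
```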
Of course that example code uses int-to-ptr casts precisely to get around the provenance tracking. This will have to be allowed somehow and it will be interesting to see what LLVM will do here...