OwnRef: a smart pointer that owns the data it is referencing

kajacx · November 2, 2022, 10:36am

Can you give an example of how to use your OwnRef::new? Do you also use ManuallyDrop? Also, I have never used MaybeUninit before, can you please elaborate on what purpose is it fulfilling here?

alice · November 2, 2022, 10:41am

You use it as follows:

let mut storage = MaybeUninit::new(value);
// SAFETY: The storage is initialized.
let own_ref = unsafe { OwnRef::new(&mut storage) };

The MaybeUninit type is very similar to ManuallyDrop, but the key difference is that MaybeUninit makes it safe to create an invalid one and unsafe to access it, whereas ManuallyDrop makes it unsafe to create an invalid one, but safe to access it. I didn't want there being any chance that I might accidentally access the value inside the storage after the OwnRef goes out of scope, so I chose to use the one that makes it unsafe to access the value inside it.
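To make that difference concrete, here is a minimal sketch (the function name `demo` is made up for illustration). Reading through a ManuallyDrop needs no unsafe, while every access to the value inside a MaybeUninit has to go through an unsafe call:

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

// Hypothetical helper, only for illustration.
fn demo() -> (usize, usize) {
    // ManuallyDrop: access is safe (it implements Deref); only dropping is unsafe.
    let mut md = ManuallyDrop::new(String::from("hello"));
    let md_len = md.len();
    // SAFETY: `md` is not used again after this point.
    unsafe { ManuallyDrop::drop(&mut md) };

    // MaybeUninit: creating an empty one is safe...
    let _empty: MaybeUninit<String> = MaybeUninit::uninit();
    // ...but every access to the contents is unsafe.
    let mut mu = MaybeUninit::new(String::from("world!"));
    // SAFETY: `mu` was initialized by `MaybeUninit::new` just above.
    let mu_len = unsafe { mu.assume_init_ref() }.len();
    // SAFETY: `mu` is still initialized and is not used again afterwards.
    unsafe { mu.assume_init_drop() };

    (md_len, mu_len)
}

fn main() {
    println!("{:?}", demo());
}
```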

jbe · November 2, 2022, 10:48am

Regarding AsRef, you might want to read this (it's a longer thread):

Or consider this advice.

kajacx · November 2, 2022, 10:49am

I think it does. Not in the original playground link (I didn't know that std::ptr::drop_in_place existed) where I copied the value into the drop function:

fn drop(&mut self) {
    let _drop = unsafe { std::ptr::read(self.pointer.as_ptr()) };
}

But with the drop_in_place optimization, this absolutely happens:

fn drop(&mut self) {
    unsafe {
        drop_in_place(self.pointer.as_ptr());
    }
}

Here is an example of how the &T shared reference gets turned into a &mut T mutable reference when the OwnRef is dropped.

And this doesn't really have anything to do with unsafe pointer, similar behaviour can be reproduced without any unsafe: Rust Playground

Since you can obtain a &mut reference from immutable data in safe code using the drop implementation, I assume it is OK to do the same in unsafe code, but I wanted to check to be sure.

kajacx · November 2, 2022, 11:35am

Yea, I thought about using StackBox for the name. I should have googled that. It seems to be what I was looking for.

That does look interesting, but it probably does a lot of other stuff that I am not necessarily interested in.

Is that relevant to the pinning or do I need the drop flag regardless? If so, why? I can imagine using a newtype for &mut Option<T> where the dropping is done by the original value holder and not the OwnRef by checking at runtime if the value has been taken or not. Simply doing nothing will achieve this, since that is the drop implementation of Option.

On the other hand, I can also imagine the OwnRef dropping the value when it gets dropped, which should also work, and it doesn't require the runtime check.

That is very true, thanks for providing a counter example. Re-assigning the original variable while an OwnRef still exists is a disaster that I will have to think about getting around.

H2CO3 · November 2, 2022, 11:38am

No, that's totally wrong, on at least 2 counts.

First, you seem to be using an implementation detail (the address a reference happens to point to) for justifying what should be a matter of the language specification. That doesn't work (it's backwards).

Second, in the playground you linked to, nothing is turning an immutable reference into a mutable reference. That's exactly what I'm trying to point out: it's absolutely, positively, not allowed to turn a &T into a &mut T, it's UB, and you should never do it. And it's not what is happening in your code, either – there is no immutable reference in sight.

What you have instead is an immutable binding, which isn't a reference. And since it isn't a reference but a value, it owns the contained value, so you might have as well rebound it like let mut new_value = old_value.

And that's exactly what happens semantically in drop(): you disposed of the value, so it was moved (by the compiler), and at that point, wherever it was moved to can do whatever it wants with it.

That's not what happens there, either. NonNull and drop_in_place() operate over raw pointers, not references. But I'm not actually sure if the original code is safe – certainly, it would be more comforting to use a &mut T and a mutable binding in the first place for creating the pointer (since raw pointers have provenance, too).

kajacx · November 2, 2022, 11:53am

I have no idea what you mean. How is showing that 2 references point to the same address, to prove that it is the same reference, an implementation detail?

Yes, it is. HasData::new gets an immutable reference. That immutable reference is then stored in a NonNull, that then gets turned into a raw pointer (*mut), and that then gets turned into a mutable reference. It's still the same reference the whole time. It still points to the same data.

Yes, there is. HasData::new takes an immutable reference in the latest playground example, and OwnRef::new takes an immutable reference in the very code you provided in your first post.

Yes, but what if that raw pointer was constructed from a reference? Again, the page on NonNull says:

Notice that NonNull<T> has a From instance for &T . However, this does not change the fact that mutating through a (pointer derived from a) shared reference is undefined behavior unless the mutation happens inside an UnsafeCell<T>. The same goes for creating a mutable reference from a shared reference. When using this From instance without an UnsafeCell<T> , it is your responsibility to ensure that as_mut is never called, and as_ptr is never used for mutation.

2e71828 · November 2, 2022, 12:29pm

Let me see if I understand the situation here:

1. You start with an &DropMe passed to HasData::new().
2. You then convert that into a NonNull<DropMe>, which has a documented condition that "it is your responsibility to ensure that ... as_ptr() is never used for mutation."
3. Inside HasData::drop() you call as_ptr(), which you've previously agreed to not use for mutation.
4. You pass the resulting pointer to drop_in_place, which converts it to a mutable reference on your behalf in order to call DropMe::drop().
5. DropMe::drop() then uses that mutable reference to make an actual mutation.

The result from as_ptr() has pretty clearly been used here to mutate the referent in violation of the restriction invoked at step 2.

With regard to your no-unsafe playground example: The mutable reference in drop() is only created after all of the existing uses have completed, so it really does have exclusive access to the data.

H2CO3 · November 2, 2022, 12:36pm

Because it's dealing with addresses resulting from happenstance, and not with abstract entities (references, etc.) with semantics as specified by the constraints of the language.

There is no mention of any type called HasData or NonNull in the linked playground. It defines a type called DropMe that never stores any references.

kajacx · November 2, 2022, 12:36pm

Yes, exactly. That's what I have been trying to explain this whole time, but @H2CO3 keeps saying that it's not what is happening in your code.

Hang on. I was under the impression that you can NEVER convert an & into an &mut, even when you can guarantee that there are no (other) uses for the duration for the &mut.

kajacx · November 2, 2022, 12:38pm

Yes there is, I posted 2 playground links. Please read a post in its entirety next time to avoid confusion.

2e71828 · November 2, 2022, 12:41pm

That's correct, but that code isn't converting an & into an &mut; it's constructing a new &mut from the backing value after the previous & doesn't exist anymore. They refer to the same place in memory, but for disjoint spans of time and neither is generated from the other.
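A small safe sketch of that situation (the names `DropMe`, `log` and `read_then_drop` are made up for this illustration): the shared reference is used and then gone, the value is moved into drop(), and only then does the destructor receive a fresh &mut to it.

```rust
use std::cell::Cell;

// Hypothetical type: records its final value in `log` when dropped.
struct DropMe<'a> {
    value: i32,
    log: &'a Cell<i32>,
}

impl Drop for DropMe<'_> {
    fn drop(&mut self) {
        // This `&mut self` is created from the owned value, not from any `&DropMe`.
        self.value += 1;
        self.log.set(self.value);
    }
}

fn read_then_drop() -> (i32, i32) {
    let log = Cell::new(0);
    let data = DropMe { value: 1, log: &log }; // immutable binding that owns the value
    let shared = &data;
    let seen = shared.value; // last use of the shared reference
    drop(data); // the value is moved; its destructor then mutates it
    (seen, log.get())
}

fn main() {
    println!("{:?}", read_then_drop());
}
```

The & and the &mut refer to the same place, but at disjoint times, and neither is derived from the other.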

H2CO3 · November 2, 2022, 12:42pm

Okay, but I'm not talking about that one. I'm talking about this one. Before this link, you wrote:

This is what I reacted to, and my point here is that this is not the same situation as in your OwnRef.

As for the other playground, in which you mutate the referent through a pointer derived from an immutable reference: that's clearly undefined behavior, just run the code under Miri:

  Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.62s
     Running `/playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri runner target/miri/x86_64-unknown-linux-gnu/debug/playground`
error: Undefined Behavior: trying to retag from <8385> for SharedReadWrite permission at alloc1519[0x0], but that tag only grants SharedReadOnly permission for this location
   --> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:490:1
    |
490 | pub unsafe fn drop_in_place<T: ?Sized>(to_drop: *mut T) {
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | |
    | trying to retag from <8385> for SharedReadWrite permission at alloc1519[0x0], but that tag only grants SharedReadOnly permission for this location
    | this error occurs as part of retag at alloc1519[0x0..0x4]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <8385> was created by a SharedReadOnly retag at offsets [0x0..0x4]
   --> src/main.rs:43:19
    |
43  |             data: NonNull::from(data),
    |                   ^^^^^^^^^^^^^^^^^^^
    = note: BACKTRACE:
    = note: inside `std::ptr::drop_in_place::<DropMe> - shim(Some(DropMe))` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:490:1
note: inside `<HasData as std::ops::Drop>::drop` at src/main.rs:57:13
   --> src/main.rs:57:13
    |
57  |             std::ptr::drop_in_place(self.data.as_ptr());
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = note: inside `std::ptr::drop_in_place::<HasData> - shim(Some(HasData))` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:490:1
    = note: inside `std::mem::drop::<HasData>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:987:24
note: inside `main` at src/main.rs:73:5
   --> src/main.rs:73:5
    |
73  |     std::mem::drop(data); // explicit drop just to not get a "not used" warning
    |     ^^^^^^^^^^^^^^^^^^^^

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error

kajacx · November 2, 2022, 12:48pm

Wait, so you can convert a & into an &mut, as long as the original & (and all other references) die before the &mut starts existing? (For example, by storing the value in a raw pointer). What if the original data is not declared as mutable? Would that still be OK?

2e71828 · November 2, 2022, 12:53pm

No, you can't. Because of the way pointer provenance is tracked, the & you used to create the raw pointer continues to exist until after the last dereference of that pointer. The only legal way to create an &mut in that situation is to create it directly from the owned value¹, in this case the local variable called drop.

¹ Technically, if the shared reference was originally created by reborrowing a mutable reference, you could reborrow that mutable reference again instead of going all the way back to the owner. But you still can't do that legally until the & is well and truly gone.

kajacx · November 2, 2022, 2:40pm

Ok, that sounds interesting. By continues to exist, you mean after it goes out of scope / is dropped? For example, here: Rust Playground

use std::ptr::NonNull;

fn main() {
    let a = vec![1, 2, 3];
    
    let mut as_ptr = NonNull::from(&a);
    
    unsafe { as_ptr.as_mut().push(4) };
    
    println!("{a:?}");
}

The &a reference is never used after the NonNull is constructed. Does it then still continue to exist somehow? Or is it just about the fact that the NonNull that was constructed from that & reference continues to exist?

I think this might have something to do with a problem that I'm having now:

I tried to implement that in my original code, but some of my tests are failing when run with miri. Here is the same OwnRef from that Rust Playground link, but I changed only the main method.

This usage fails with miri. And since main isn't using anything unsafe, the problem has to be in OwnRef. This test was working when I was not using ManuallyDrop, although I do not know why.

alice · November 2, 2022, 3:11pm

It's not that the reference continues to exist, it's just that the raw pointer has the same permissions as the reference you created it from, and the reference does not have permission to write.

riking · November 2, 2022, 3:55pm

I think "continues to exist" is a usable mental model, because that raw pointer's permission can be downgraded (to Invalid) by conflicting uses.

H2CO3 · November 2, 2022, 5:19pm

Just as predicted, the error goes away if you start using mutable references. I'm not sure why this happened to work with the previous, mem::forget()-based implementation. (Note that it might only be a false negative – Miri has false negatives, it can't detect all UB unconditionally.)
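For the small Vec example from earlier in the thread, the mutable-reference variant could look like the sketch below (the function name `push_through_pointer` is made up). The pointer is derived from `&mut a`, so it carries write permission:

```rust
use std::ptr::NonNull;

// Hypothetical helper, only for illustration.
fn push_through_pointer() -> Vec<i32> {
    let mut a = vec![1, 2, 3];

    // Derived from a mutable reference to a mutable binding.
    let mut as_ptr = NonNull::from(&mut a);

    // SAFETY: the pointer came from `&mut a`, and `a` is not accessed in any
    // other way while the reference returned by `as_mut` is in use.
    unsafe { as_ptr.as_mut().push(4) };

    a
}

fn main() {
    println!("{:?}", push_through_pointer());
}
```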

CAD97 · November 2, 2022, 8:07pm

That implementation did a ptr::read to make a copy of the value on the stack to drop, which is not itself UB; it's valid to read through a reference, after all.

@kajacx, rather than use raw pointers, just use &mut ManuallyDrop! This is the simple implementation with ManuallyDrop that you could use: [playground]

pub struct RefOwn<'a, T> {
    value: &'a mut ManuallyDrop<T>,
}

impl<T> Drop for RefOwn<'_, T> {
    fn drop(&mut self) {
        // SAFETY: self is in charge of dropping the ManuallyDrop.
        unsafe { ManuallyDrop::drop(&mut self.value) };
    }
}

impl<'a, T> RefOwn<'a, T> {
    /// Create a new owning reference which will drop the referee.
    ///
    /// # Safety
    ///
    /// Like [`ManuallyDrop::take`] and [`ManuallyDrop::drop`], this function
    /// semantically moves the value out of the `ManuallyDrop` container. As
    /// such, the borrowed `ManuallyDrop` must not be used again after calling
    /// this function.
    pub unsafe fn new(value: &'a mut ManuallyDrop<T>) -> Self {
        Self { value }
    }
    
    pub fn get(this: &Self) -> &T { &this.value }
    pub fn get_mut(this: &mut Self) -> &mut T { &mut this.value }
    
    pub fn into_inner(this: Self) -> T {
        let mut this = ManuallyDrop::new(this);
        // SAFETY: this.value is not used (even to drop) after taking its value.
        unsafe { ManuallyDrop::take(&mut this.value) }
    }
}

macro_rules! ref_own {
    (mut $x:ident) => { $crate::ref_own! { let mut $x = &move $x; } };
    ($x:ident) => { $crate::ref_own! { let $x = &move $x; } };

    (let mut $r:ident = &move $val:expr $(;)?) => {
        let mut $r = ::core::mem::ManuallyDrop::new($val);
        // SAFETY: Because the ManuallyDrop binding is shadowed,
        // it is impossible to use after constructing the RefOwn.
        #[allow(unsafe_code, unused_mut)]
        let mut $r = unsafe { $crate::RefOwn::new(&mut $r) };
    };
    (let $r:ident = &move $val:expr $(;)?) => {
        $crate::ref_own! { let mut $r = &move $val; }
        // Rebind to remove the mut if it's not requested.
        // (It's needed for DerefMut, though.)
        let $r = $r;
    };
}
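A usage sketch for this version, without the macro. It repeats a trimmed-down copy of the type so that it compiles on its own; the `Noisy` drop counter and the two demo functions are made up for illustration.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

// Trimmed-down copy of the type above.
pub struct RefOwn<'a, T> {
    value: &'a mut ManuallyDrop<T>,
}

impl<T> Drop for RefOwn<'_, T> {
    fn drop(&mut self) {
        // SAFETY: self is in charge of dropping the ManuallyDrop.
        unsafe { ManuallyDrop::drop(&mut *self.value) };
    }
}

impl<'a, T> RefOwn<'a, T> {
    /// # Safety
    /// The borrowed `ManuallyDrop` must not be used again after this call.
    pub unsafe fn new(value: &'a mut ManuallyDrop<T>) -> Self {
        Self { value }
    }

    pub fn into_inner(this: Self) -> T {
        let mut this = ManuallyDrop::new(this);
        // SAFETY: this.value is not used (even to drop) after taking its value.
        unsafe { ManuallyDrop::take(&mut *this.value) }
    }
}

// Hypothetical helper type: counts how often it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn dropped_by_ref_own() -> u32 {
    let drops = Cell::new(0);
    let mut slot = ManuallyDrop::new(Noisy(&drops));
    // SAFETY: `slot` is not touched again after this line.
    let owner = unsafe { RefOwn::new(&mut slot) };
    drop(owner); // runs the destructor of the Noisy value
    drops.get()
}

fn moved_out_with_into_inner() -> u32 {
    let drops = Cell::new(0);
    {
        let mut slot = ManuallyDrop::new(Noisy(&drops));
        // SAFETY: `slot` is not touched again after this line.
        let owner = unsafe { RefOwn::new(&mut slot) };
        let _value = RefOwn::into_inner(owner);
        // `_value` is dropped at the end of this block, exactly once.
    }
    drops.get()
}

fn main() {
    println!("{} {}", dropped_by_ref_own(), moved_out_with_into_inner());
}
```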

other implementations

The issue with this implementation is the variance. To illustrate:

// compiles:
fn shorten_ref<'short, 'long: 'short>(r: Box<&'long i32>) -> Box<&'short i32> {
    r
}

// compile error:
fn shorten_ref_own<'this, 'short, 'long: 'short>(
    r: RefOwn<'this, &'long i32>,
) -> RefOwn<'this, &'short i32> {
    r
}

error: lifetime may not live long enough
  --> src/main.rs:24:5
   |
21 | fn shorten_ref_own<'this, 'short, 'long: 'short>(
   |                           ------  ----- lifetime `'long` defined here
   |                           |
   |                           lifetime `'short` defined here
...
24 |     r
   |     ^ function was supposed to return data with lifetime `'long` but it is returning data with lifetime `'short`
   |
   = help: consider adding the following bound: `'short: 'long`
   = note: requirement occurs because of the type `RefOwn<'_, &i32>`, which makes the generic argument `&i32` invariant
   = note: the struct `RefOwn<'a, T>` is invariant over the parameter `T`
   = help: see <https://doc.rust-lang.org/nomicon/subtyping.html> for more information about variance

As the error message says, RefOwn<T> is here invariant over T. In short, this means it's not possible to coerce lifetimes in T to shorter lifetimes, despite this being sound. To make RefOwn<T> covariant over T like Box, we need to use NonNull<T> (which is covariant over T) instead of &mut T (which is invariant over T). I still use NonNull<ManuallyDrop<T>> rather than NonNull<T> because I think that makes for clearer code, but you could use just NonNull<T> if you really wanted to, so long as the host place is still ManuallyDrop. [playground]

 pub struct RefOwn<'a, T> {
-    value: &'a mut ManuallyDrop<T>,
+    value: NonNull<ManuallyDrop<T>>,
+    marker: PhantomData<&'a T>,
 }

 impl<T> Drop for RefOwn<'_, T> {
     fn drop(&mut self) {
         // SAFETY: self is in charge of dropping the ManuallyDrop.
-        unsafe { ManuallyDrop::drop(&mut self.value) };
+        unsafe { ManuallyDrop::drop(self.value.as_mut()) };
     }
 }
 
 impl<'a, T> RefOwn<'a, T> {
     /// Create a new owning reference which will drop the referee.
     ///
     /// # Safety
     ///
     /// Like [`ManuallyDrop::take`] and [`ManuallyDrop::drop`], this function
     /// semantically moves the value out of the `ManuallyDrop` container. As
     /// such, the borrowed `ManuallyDrop` must not be used again after calling
     /// this function.
     pub unsafe fn new(value: &'a mut ManuallyDrop<T>) -> Self {
-        Self { value }
+        Self {
+            value: NonNull::from(value),
+            marker: PhantomData,
+        }
     }
     
-    pub fn get(this: &Self) -> &T { &this.value }
-    pub fn get_mut(this: &mut Self) -> &mut T { &mut this.value }
+    pub fn get(this: &Self) -> &T {
+        // SAFETY: this.value is a valid pointer
+        &*unsafe { this.value.as_ref() }
+    }
+
+    pub fn get_mut(this: &mut Self) -> &mut T {
+        // SAFETY: this.value is a valid pointer
+        &mut *unsafe { this.value.as_mut() }
+    }
     
     pub fn into_inner(this: Self) -> T {
         let mut this = ManuallyDrop::new(this);
         // SAFETY: this.value is not used (even to drop) after taking its value.
-        unsafe { ManuallyDrop::take(&mut this.value) }
+        unsafe { ManuallyDrop::take(this.value.as_mut()) }
     }
 }
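To check the covariance claim for the NonNull version, here is a stripped-down sketch with only `new`, `get` and `Drop` (the `demo` function and the `LONG` static are made up). With this layout, the `shorten_ref_own` function that was rejected above is expected to compile:

```rust
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::ptr::NonNull;

pub struct RefOwn<'a, T> {
    value: NonNull<ManuallyDrop<T>>,
    marker: PhantomData<&'a T>,
}

impl<T> Drop for RefOwn<'_, T> {
    fn drop(&mut self) {
        // SAFETY: self is in charge of dropping the ManuallyDrop.
        unsafe { ManuallyDrop::drop(self.value.as_mut()) };
    }
}

impl<'a, T> RefOwn<'a, T> {
    /// # Safety
    /// The borrowed `ManuallyDrop` must not be used again after this call.
    pub unsafe fn new(value: &'a mut ManuallyDrop<T>) -> Self {
        Self { value: NonNull::from(value), marker: PhantomData }
    }

    pub fn get(this: &Self) -> &T {
        // SAFETY: this.value is a valid pointer for as long as `this` lives.
        let slot: &ManuallyDrop<T> = unsafe { this.value.as_ref() };
        &**slot
    }
}

// Accepted now, because RefOwn<'a, T> is covariant over T.
fn shorten_ref_own<'this, 'short, 'long: 'short>(
    r: RefOwn<'this, &'long i32>,
) -> RefOwn<'this, &'short i32> {
    r
}

// Hypothetical driver, only for illustration.
fn demo() -> i32 {
    static LONG: i32 = 7;
    let mut slot = ManuallyDrop::new(&LONG);
    // SAFETY: `slot` is not touched again after this line.
    let owner: RefOwn<'_, &'static i32> = unsafe { RefOwn::new(&mut slot) };
    let shortened = shorten_ref_own(owner);
    let value = **RefOwn::get(&shortened);
    value
}

fn main() {
    println!("{}", demo());
}
```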

This still isn't sufficient to pin the RefOwned value, though. The RefOwn can be forgotten, and then the borrowed value's place will be freed without dropping it, breaking the core invariant of Pin. Instead we need to allow the stack owner to drop the value if we forgot to — and this brings us to what is both the most well-featured RefOwn and the one writable with the least unsafe: [playground]

pub struct RefOwn<'a, T> {
    value: &'a mut Option<T>,
}

impl<T> Drop for RefOwn<'_, T> {
    fn drop(&mut self) {
        self.value.take();
    }
}

impl<'a, T> RefOwn<'a, T> {
    /// Create a new owning reference which will drop the referee.
    ///
    /// # Safety
    ///
    /// The borrowed value must not be used after creating the `RefOwn`,
    /// except that it must be dropped in place before it is freed or reused,
    /// in order to satisfy the pinning requirements.
    ///
    /// If [`into_pin`] were not provided, this function could be safe.
    pub unsafe fn new(value: &'a mut Option<T>) -> Self {
        debug_assert!(matches!(value, Some(_)));        
        Self {
            value,
        }
    }

    pub fn get(this: &Self) -> &T {
        this.value.as_ref().unwrap()
    }

    pub fn get_mut(this: &mut Self) -> &mut T {
        this.value.as_mut().unwrap()
    }

    pub fn into_inner(this: Self) -> T {
        this.value.take().unwrap()
    }
    
    pub fn into_pin(this: Self) -> Pin<Self> {
        // SAFETY: we own the pointee and `new` guarantees that it is pinned.
        unsafe { Pin::new_unchecked(this) }
    }
}

(Of course you could apply the same &mut->NonNull transform to this to get covariance, if you wish, as well as use unreachable_unchecked to assume the Option is Some to remove some dynamic checks.)
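A sketch of why the Option version tolerates a forgotten RefOwn. It assumes a variant without `into_pin`, so `new` is written as a safe function here (as the doc comment above says it could be); `Noisy` and the two demo functions are made up for illustration.

```rust
use std::cell::Cell;
use std::mem;

pub struct RefOwn<'a, T> {
    value: &'a mut Option<T>,
}

impl<T> Drop for RefOwn<'_, T> {
    fn drop(&mut self) {
        self.value.take();
    }
}

impl<'a, T> RefOwn<'a, T> {
    /// Safe in this sketch only because no `into_pin` is provided.
    pub fn new(value: &'a mut Option<T>) -> Self {
        assert!(value.is_some());
        Self { value }
    }
}

// Hypothetical helper type: counts how often it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn forgotten_owner() -> u32 {
    let drops = Cell::new(0);
    {
        let mut slot = Some(Noisy(&drops));
        let owner = RefOwn::new(&mut slot);
        mem::forget(owner); // the RefOwn never runs its destructor
        // `slot` is still `Some`, so the stack owner drops the value here.
    }
    drops.get()
}

fn dropped_owner() -> u32 {
    let drops = Cell::new(0);
    {
        let mut slot = Some(Noisy(&drops));
        let owner = RefOwn::new(&mut slot);
        drop(owner); // takes the value out of the Option and drops it
        // `slot` is now `None`, so nothing more happens at the end of the block.
    }
    drops.get()
}

fn main() {
    println!("{} {}", forgotten_owner(), dropped_owner());
}
```

In both cases the value is dropped exactly once before its storage goes away.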

If you want anything more than this, then I would highly recommend just using the moveit crate rather than rolling your own. You can see its documentation and the associated blog post (and prequel) for more information about drop flags and how moveit handles them, as well as the different "modes" it offers.

OP questions

Should I implement `AsRef`?

std implements AsRef<T> for Box<T>. RefOwn acts like a Box, so implementing AsRef<T> for RefOwn<T> would make sense, if you want RefOwn to act like Box. Alternatively, the "more correct" implementation would probably be AsRef<U> for RefOwn<T> where T: AsRef<U>; this implementation is called out in the documentation as of 1.66 (releasing tomorrow, 2022-11-03).

Should I implement `Borrow`?

Absolutely. If we check std, we see Borrow<T> for Box<T>, so it makes sense to mirror that. Again, it'd maybe be "more correct" to provide the transitive Borrow<U> for RefOwn<T> where T: Borrow<U>, but in this case it's not possible to do so, as this conflicts with the reflexive Borrow<T> for T. (It is possible to have RefOwn<T> where T: Borrow<RefOwn<T>>.)

The same goes for BorrowMut now that we're using an implementation that can actually mutate the indirectly owned value.
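As a sketch, the Box-like impls could look as follows. The `RefOwn` here is a simplified stand-in that just wraps a `&mut T` (the dropping logic is omitted), and `total`, `push_one` and `demo` are made-up helpers:

```rust
use std::borrow::{Borrow, BorrowMut};

// Simplified stand-in: only the pointer-like part, no ownership logic.
pub struct RefOwn<'a, T> {
    value: &'a mut T,
}

impl<T> AsRef<T> for RefOwn<'_, T> {
    fn as_ref(&self) -> &T {
        &*self.value
    }
}

impl<T> Borrow<T> for RefOwn<'_, T> {
    fn borrow(&self) -> &T {
        &*self.value
    }
}

impl<T> BorrowMut<T> for RefOwn<'_, T> {
    fn borrow_mut(&mut self) -> &mut T {
        &mut *self.value
    }
}

// Generic code can now accept a Vec, a Box<Vec>, or a RefOwn<Vec>.
fn total<B: Borrow<Vec<i32>>>(b: B) -> i32 {
    let v: &Vec<i32> = b.borrow();
    v.iter().sum()
}

fn push_one<B: BorrowMut<Vec<i32>>>(b: &mut B) {
    let v: &mut Vec<i32> = b.borrow_mut();
    v.push(1);
}

fn demo() -> (usize, i32) {
    let mut storage = vec![1, 2, 3];
    let mut owner = RefOwn { value: &mut storage };
    push_one(&mut owner);
    let as_vec: &Vec<i32> = owner.as_ref();
    let len = as_vec.len();
    (len, total(owner))
}

fn main() {
    println!("{:?}", demo());
}
```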

Is mutating an immutable value in the destructor safe?

It is 100% never allowed to mutate through a shared reference &T nor any pointers derived from it. Values themselves are not "mutable" or "immutable;" this is only a property of the binding (let vs let mut) or the reference (& vs &mut).
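A minimal illustration of the binding point (the function name is made up): the same value can be moved from an immutable binding into a mutable one and then mutated, with no references involved.

```rust
// Hypothetical helper, only for illustration.
fn rebind_and_mutate() -> Vec<i32> {
    let v = vec![1, 2, 3]; // immutable binding
    // v.push(4);          // would not compile: `v` is not declared as mutable

    let mut v = v; // move the value into a mutable binding
    v.push(4); // fine: the owner decides how the value is used
    v
}

fn main() {
    println!("{:?}", rebind_and_mutate());
}
```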

Is storing `NonNull<T>` better than `*const T` or `*mut T`?

In the case when the pointer is known not null, then NonNull is typically the best type to use "at rest" in structures. The one caveat is that you need to think about variance — but if you're the owner of the value, NonNull<T> has the correct variance. And generally, you can use PhantomData to "by example" the variance — keep a PhantomData<Box<T>>, PhantomData<&T>, or PhantomData<&mut T> depending on what pointer type you're acting like. If you're borrowing without a lifetime, learn about variance.

What other traits should I implement?

Anything you see implemented for Box is usually reasonable game. Anything #[derive]able is certainly expected.

Is the code safe?

Terminology — if the code uses unsafe, it is not safe by definition. An API is sound if it cannot be used to cause UB if all of the unsafe preconditions are satisfied. The implementations of RefOwn I provide in this post should be sound.

What about `lifetime_of`?

but then the data doesn't die after OwnRef dies

Stack variable bindings are dropped and deallocated when they go out of scope. lifetime_of in the original implementation doesn't impact anything about when data goes out of scope.

compiler will optimize it into a no-op

Note that if it's neither generic nor marked #[inline], as of current rustc a function will always be called when used from a different crate. The original playground doesn't even use lifetime_of, though, as it stores &'a () rather than PhantomData<&'a T>; this definitely will have a size impact, as the &() has to be stored.
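The size difference can be checked with a small sketch. The two struct names are made up; they only mirror the "store a &'a ()" and "store a PhantomData" layouts:

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

// Carries the lifetime with a real reference: one extra pointer per value.
#[allow(dead_code)]
struct WithUnitRef<'a, T> {
    pointer: NonNull<T>,
    lifetime: &'a (),
}

// Carries the lifetime with a zero-sized marker.
#[allow(dead_code)]
struct WithMarker<'a, T> {
    pointer: NonNull<T>,
    marker: PhantomData<&'a T>,
}

fn sizes() -> (usize, usize) {
    (
        size_of::<WithUnitRef<'static, i32>>(),
        size_of::<WithMarker<'static, i32>>(),
    )
}

fn main() {
    let (with_ref, with_marker) = sizes();
    println!("with &(): {with_ref} bytes, with PhantomData: {with_marker} bytes");
}
```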

If you have specific follow-up questions, feel free to ask them of course, but I'd suggest making a new thread for them: the questions posed by the OP have been answered, so drifting the topic of the thread further from that only serves to muddy and obfuscate the points under discussion. New threads are nearly¹ a free resource, and very much help to frame a question with the context you want. Feel free to ping me if you make such a new thread.

¹ Of course the effort to properly reestablish context isn't free. But collecting your thoughts enough to provide that context is often also itself helpful towards finding a solution to your question(s); there's a reason rubber duck debugging is a well-known term.
