Borrowing a cake and eating it too

It's easy and safe to borrow a thing (an i32 here) that someone (A) owns via something with a reference (B), like the share_i function below. But what if A doesn't want the thing back: it passes on the whole thing, not exactly by moving it, but still by returning a reference? Let's say a trait requires a reference there, or it's a choice made at runtime whether A will move out or borrow out (*).

Of course I tried to simply move the thing to an Option<i32> in B, and then borrow it, but that seems impossible: it's always borrowed for B's lifetime, so not available for business. I managed to write ship_i below, but I'm informed UnsafeCell is for internal mutability only. So (how) can it be done properly?

use std::cell::UnsafeCell;

struct A {
    own_i: Option<i32>,
}

struct B<'b> {
    ref_i: Option<&'b i32>,
    own_i: Option<UnsafeCell<i32>>,
}

impl A {
    fn share_i(&mut self) -> B<'_> {
        let i = self.own_i.as_mut().unwrap();
        B { ref_i: Some(i), own_i: None, }
    }

    fn ship_i(&mut self) -> B<'_> {
        let own_i = UnsafeCell::new(self.own_i.take().unwrap());
        let mut b = B { ref_i: None, own_i: Some(own_i) };
        b.ref_i = b.own_i.as_mut().map(|own_i| unsafe { &*own_i.get() });
        b
    }
}

fn main() {
    let mut a = A { own_i: Some(42) };
    {
        let b = a.share_i();
        assert_eq!(*b.ref_i.unwrap(), 42);
    }
    {
        let b = a.ship_i();
        assert_eq!(*b.ref_i.unwrap(), 42);
    }
}

(*) This seems like a genuine example to me, but maybe it's not. In reality, I'm trying to write leak amplification: B's drop handler will clean up the thing and place it back in A, but if the caller thwarts that with a mem::forget(b), rather than leaving the dirty thing in A, we prefer it leaked.

Your ship_i is building a self-referential type in an invalid way, and only works because the function got inlined such that ref_i wasn't broken when the object was moved.
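To see why a move breaks such a reference, here is a minimal sketch. The `Holder` type and both helper functions are made up for illustration and are not part of the original code. It records the address of a field, moves the struct onto the heap, and returns both addresses; a reference taken before the move would still point at the old location.

```rust
struct Holder {
    own_i: i32,
}

/// Address of the `own_i` field as a plain integer (no reference is kept).
fn field_addr(h: &Holder) -> usize {
    &h.own_i as *const i32 as usize
}

/// Returns the field's address before and after moving the struct into a Box.
fn addr_before_and_after_move() -> (usize, usize) {
    let h = Holder { own_i: 42 };
    let before = field_addr(&h);
    let boxed = Box::new(h); // moves `h` from the stack to the heap
    let after = field_addr(&boxed);
    (before, after)
}
```

The two addresses differ, which is why a pointer into `own_i` that travels along with the struct is left dangling unless the compiler happens to elide the move.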

Perhaps you are looking for something like this?

use std::borrow::Cow;

struct A {
    own_i: Option<String>,
}

impl A {
    fn share_i(&mut self) -> Cow<'_, Option<String>> {
        Cow::Borrowed(&self.own_i)
    }

    fn ship_i(&mut self) -> Cow<'_, Option<String>> {
        Cow::Owned(self.own_i.take())
    }
}

fn main() {
    let mut a = A { own_i: Some("foo".to_string()) };
    {
        let b = a.share_i();
        assert_eq!(Option::as_ref(&*b).unwrap(), "foo");
    }
    {
        let b = a.ship_i();
        assert_eq!(Option::as_ref(&*b).unwrap(), "foo");
    }
}

I changed it to a string because integers are Copy, which can confuse the results.

I see now the example is quite broken. In the real code, the "thing" is a pointer itself, so that's why it seems to work in practice (and under Miri). I don't think Cow is what I'm looking for - it would be, for the 2nd toy reason I came up with, but the reference to an owned thing is essential. But "self-referential type" gives me plenty of text and code to look at.

Is something like this what you're looking for? Moving the thing out of A, but keeping a reference around so you can move it back when B is dropped?
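A minimal safe sketch of that shape, not the code that was linked: the guard `B` owns the thing while it exists and hands it back in its drop handler. The field and method names (`thing`, `ship`, `thing_mut`) are assumptions chosen for this example.

```rust
struct A {
    thing: Option<String>,
}

/// Guard that owns the thing while it exists and hands it back on drop.
struct B<'a> {
    a: &'a mut A,
    thing: Option<String>,
}

impl A {
    /// Moves the thing out of A into the guard; A stays mutably borrowed.
    fn ship(&mut self) -> B<'_> {
        let thing = self.thing.take().expect("nothing to ship");
        B { a: self, thing: Some(thing) }
    }
}

impl B<'_> {
    fn thing_mut(&mut self) -> &mut String {
        self.thing.as_mut().expect("thing is present until drop")
    }
}

impl Drop for B<'_> {
    fn drop(&mut self) {
        // Normal path: put the thing back into A.
        self.a.thing = self.thing.take();
    }
}
```

If the caller runs `mem::forget` on the guard, the drop handler never runs, so the thing is leaked and A is left empty, which is the leak-amplification behaviour described in the footnote. What this sketch does not do is store a borrow of the thing inside `B` itself.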

Not really, I already have that part (moving the thing out and putting it back in the drop handler). What I struggle with is borrowing from the thing while it is in temporary storage, and storing that borrow in the same storage.

If that is all too abstract, A is a BTreeMap, B is its DrainFilter, the "thing" is the root of the map's tree, and the reference borrowed from the thing is a handle to a location deep down in the tree under the root.

This requires some unsafe code to make work. In order to borrow the contents of the temporary storage, you also need to borrow the storage itself, so you’re ultimately trying to make the temporary storage hold a reference to itself. This is a problem because the original reference was borrowed from somewhere that needs it to be returned, and you now can’t do that without destroying the storage to get rid of the stored reference.

Lifetime information is stored in the type system, which deals in invariants. It can’t reason about things like conditionals that may or may not happen, but instead keeps track of things that are true regardless of which branch was taken.

If I understand your requirements correctly, this seems like a valid use-case for Rc<(Ref)Cell<T>> to me. You don't know at compile-time from where and when the value is dropped, i.e. you need a runtime alternative to the borrow checker and that's where Rc comes into play. UnsafeCell is an option over (Ref)Cell, but it requires unsafe for a good reason.
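As a rough illustration of what that suggestion means, here is a small sketch with made-up `A` and `B` types (the `share` method and the `thing` fields are assumptions for this example). Both sides hold the same `Rc<RefCell<String>>`, and the borrow rules are enforced at runtime instead of at compile time.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct A {
    thing: Rc<RefCell<String>>,
}

struct B {
    thing: Rc<RefCell<String>>,
}

impl A {
    /// Hands out a second owner of the same cell; no lifetime ties B to A.
    fn share(&self) -> B {
        B { thing: Rc::clone(&self.thing) }
    }
}

/// Mutates through B and reads the result through A.
fn demo_shared_mutation() -> String {
    let a = A { thing: Rc::new(RefCell::new(String::from("cake"))) };
    let b = a.share();
    b.thing.borrow_mut().push('!'); // runtime-checked exclusive borrow
    let seen_by_a = a.thing.borrow().clone();
    seen_by_a
}
```

The price is that a conflicting borrow is only detected at runtime: `borrow_mut` panics, and `try_borrow_mut` returns an `Err`, while another borrow is still alive.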

I can't get any RefCell/Rc code to compile here, and in the end I don't understand how any safe code ever could. I'm hoping for some method on something that takes a &self or &mut self and returns a &'b mut Thing. The borrow checker is always going to stick its nose in and complain that this something remains borrowed at the moment it's moved into B. The only way to avoid that is to sneak below its radar with a raw pointer. All I was using in the example is the fact that UnsafeCell::get does that. Together with the insight that the "thing" is behind another reference, and therefore not affected by moving B, that brings me to this playground example.

The title and original example don't make any sense to me anymore. You can't return a reference and what it references at the same time (perhaps you can store it somewhere, but not return and move it). You can't borrow cake and eat it too. I don't need to borrow the cake after moving it, I want to borrow the cake where it was originally and stays all the time, and move the holder of the cake back and forth. Perhaps I want to pin the cake.

If you click on "Tools", then on "Miri", you'll get an error, telling you your code contains undefined behavior.

Two important questions:

  1. Does the reference point to the "Thing" or something inside the "Thing" in your real application?
  2. If it's the latter, does it point to a sized or unsized (slices, trait objects) type?

If Miri didn't complain, I would still suspect there is undefined behavior. Or at least ugliness.

Does the reference point to the "Thing" or something inside the "Thing" in your real application?

Definitely not the former, maybe the latter. The "thing" is a btree::node::Root struct, the "reference" (stored as a NonNull wrapped in a handle) is at best (or worst) the same thing that the Root::node field (of type Unique) points to, and often to a node anywhere down the tree.

If it's the latter, does it point to a sized or unsized (slices, trait objects) type?

Sized, I would say. Though the contents of the NonNull alone don't tell you what size it is, but I suppose that doesn't matter.

Avoiding a cute but nonsensical story, but still talking about playground-sized code, I think my quest is to adapt this code:

  • Without changing the main function.
  • Without changing the type of the holder field (much).
  • Without changing the type of the ref_thing field (much).
  • Avoiding undefined behavior (I'm not saying that it has or hasn't UB now, and neither does Miri).
  • Without using UnsafeCell, or anything that is meant for interior mutability.

Edit: sorry; I missed the assert is_none at the bottom of main, which leads to this solution instead, but I’m still not sure I’m solving your problem rather than working around your examples.


According to its type signature, ship_thing will transfer the lifetime of its self argument to B, so there’s no need for tmp_holder; A can’t be modified as long as the returned B exists: Playground.

That is what changes the game from the simple share_i function in the original example (which was broken because sharing a moved field is nonsense). The reason for changing the game is leak amplification (as in the footnote in the original example).

Your edited example dramatically changes the type of ref_string: it's not a reference at all, it's pure ownership, right? I don't think any change to ref_string's type is possible. Ultimately it's wrapped in a NodeRef struct deep down in code.

Feel free to contribute to the actual code in the github branch, but it took me about a year to understand anything about it.

The only way to make this sort of self-referential struct sound is to copy the lifetime from &self to string_ref before letting anything access it, and at that point you’ve re-invented Box. Ultimately, you’re going to need a type that implements Deref<Target=String> (or similar), so you can freely transform any reference to it into a reference to a string.
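One way to read that suggestion, as a hedged sketch rather than the code from the playground: a hypothetical `Held` enum that either borrows the string or owns it in a Box, and implements `Deref<Target = String>` either way. The names `Held`, `own`, `share` and `ship` are assumptions for this example.

```rust
use std::ops::Deref;

/// Either a borrow of the string still in A, or the string moved out of A.
enum Held<'a> {
    Shared(&'a String),
    Shipped(Box<String>),
}

impl Deref for Held<'_> {
    type Target = String;

    fn deref(&self) -> &String {
        match self {
            Held::Shared(s) => *s,
            Held::Shipped(b) => &**b,
        }
    }
}

struct A {
    own: Option<Box<String>>,
}

impl A {
    fn share(&self) -> Held<'_> {
        Held::Shared(self.own.as_deref().expect("nothing to share"))
    }

    fn ship(&mut self) -> Held<'_> {
        Held::Shipped(self.own.take().expect("nothing to ship"))
    }
}
```

Callers get a `&String` through the same interface in both cases, and the reference is derived from the `Held` value on demand instead of being stored next to the thing it points to.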

But as far as I understand now, B is not a self-referential struct, and could not be, because it will be moved. The reference (logically) refers to a thing that is (logically) still in A. We "just" embezzle A's owning reference to the same thing, sneak it into a tmp_holder in B, but we don't move the thing itself. It's a challenge to accomplish that, but it seems logically safe to me (note I cover myself by not saying logically possible…) As long as B is around, being the result of A::ship_thing, A is uniquely borrowed, so no one can spot its missing crown jewel. And if an annoying journalist asks what happens when mem::forget stops B's drop handler from putting the owning reference back into A (or if the drop handler is left out like in the example), then we change our story and say the thing was moved all along, but it doesn't matter because the only reference to the thing is in B and B is disappearing. What could possibly go wrong?

I just realized this restriction was unnecessary. The "owning reference" in A can be in any weird form. Currently it's essentially an Option<Unique<…>>.

PS Meaning that this variation is slightly closer to the goal.

I don't really understand why you need ref_thing when you already have tmp_holder, which contains it, but OK.

UnsafeCell is completely extraneous here. References coerce to pointers, and Box has into_raw; there's no reason to go through UnsafeCell just to get a raw pointer. So we can get rid of that.

Your first major soundness issue is that you have a Box<Thing>, which cannot be aliased, and a &mut Thing, which cannot be aliased, and they alias each other, even though the &mut Thing is not derived from the Box (in the stacked borrows model). So that's a problem. Again, in this example it's completely unnecessary to have both, but I will agree to overlook that. One way to fix the problem is to use raw pointers, which may be aliased. Only convert ref_thing to a reference when you need it, and only convert tmp_holder back to a Box in order to drop it, so you can be sure not to create an aliased &mut at any point.

(I think this is actually a little extra-conservative but I'm not 100% sure of the rules here)

Miri is OK with this, except the fact that it leaks memory. So you're going to have to add a Drop implementation for B. I think Drop::drop can be called in safe code, so you can't just convert the raw pointer unconditionally back into a Box. (Sorry, I don't do much with Drop, I forgot that one is special cased so I guess this isn't a problem.) Also at this point you'll presumably want to put the 'b lifetime back in to handle a backreference to A... Here's what I came up with. Miri still carps about the memory leakage but in this case it's obviously intentional. If you let b go out of scope instead of mem::forgetting it then it puts the Box back in a.
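The linked code is not reproduced here. A minimal sketch along the lines described, assuming A keeps its thing in an `Option<Box<Thing>>` and B keeps only a raw pointer plus a backreference to A, might look like this (`thing_mut` is a hypothetical accessor added for the example):

```rust
struct Thing(String);

struct A {
    holder: Option<Box<Thing>>,
}

struct B<'b> {
    a: &'b mut A,
    // Obtained from Box::into_raw; only turned back into a Box in drop.
    tmp_holder: *mut Thing,
}

impl A {
    fn ship_thing(&mut self) -> B<'_> {
        let boxed = self.holder.take().expect("thing already shipped");
        B { tmp_holder: Box::into_raw(boxed), a: self }
    }
}

impl B<'_> {
    /// Creates the reference only when it is needed.
    fn thing_mut(&mut self) -> &mut Thing {
        // SAFETY: the pointer came from Box::into_raw, stays valid until
        // drop, and `&mut self` rules out a second live reference.
        unsafe { &mut *self.tmp_holder }
    }
}

impl Drop for B<'_> {
    fn drop(&mut self) {
        // SAFETY: the pointer came from Box::into_raw and is converted back
        // exactly once, here.
        let boxed = unsafe { Box::from_raw(self.tmp_holder) };
        self.a.holder = Some(boxed);
    }
}
```

Dropping the `B` normally puts the Box back into `a.holder`; passing it to `mem::forget` skips the drop handler, so the allocation is leaked and `a.holder` stays `None`.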

Adding any other pub methods to B runs a risk of accidentally introducing unsoundness (if I didn't make a mistake and do that already) so I'm going to stop here and let you poke holes in it if you want. What do you need to do that this doesn't let you do?

I just had the inverse insight. I don't need any holder, because I can consider the ref_thing to be the owner. The reference is unavoidable because it appears (in disguise) in NodeRef. I'm not saying it's sound, but at least it compiles.

Leaking memory is okay here, it's the intention. The actual B does have a drop handler, but if that is disabled with mem::forget, we want to leak.

I failed to get anything past Miri using raw pointers, so applause for that. But I'm not sure a raw pointer can be used here. The reference is eventually stored as a (NonNull) raw pointer in NodeRef, but it takes a reference to get there and borrows like a reference. Poking a hole in there, when there isn't even a singular type to point to, aargh…

Back to the idea of considering the ref_thing to be the owner: it makes the code look too easy (well, easy at this point at least)… To show that it acts like the owner, the example needs to implement the leak-amplification-rollback drop handler. So adding in earlier code, neither compiler nor Miri seem to put up any more resistance and I end up with playground. But is it sound? I have a hard time believing that you can just consider a unique reference to have ownership. But I guess that's what we have been doing in C/C++ since always.