Pattern for nested mutable references

Hi!

I want to create a data structure that looks kind of like this:

pub struct NestedRef<'a>(Option<&'a mut NestedRef<'a>>);

#[test]
fn test_scoped_ref() {
    let mut a = NestedRef(None);
    {
        NestedRef(Some(&mut a));
    }
    NestedRef(Some(&mut a));
}

Unfortunately, that doesn't compile because of a borrow conflict. I think the issue is that I'm using the same lifetime for everything, but I actually want the inner-nested lifetimes to be shorter.

Okay, can I just give it two lifetimes? Well technically that does solve the case above but it only allows for a single level of nesting. I'd like to have limitless nesting, but the code below doesn't compile:

pub struct NestedRef2<'a, 'b>(Option<&'a mut NestedRef2<'b, 'b>>);

#[test]
fn test_nested_ref_2() {
    let mut a = NestedRef2(None);
    {
        let mut b = NestedRef2(Some(&mut a));
        {
            NestedRef2(Some(&mut b));
        }
        NestedRef2(Some(&mut b));
    }
    NestedRef2(Some(&mut a));
}

My intuition is that the approach above would be a dead end because I think a new lifetime parameter is needed for every additional level of nesting. Is there a way around this?

I think using a trait accomplishes what I want. Is there a simpler way, though?

pub trait Nesty {}

// Kind of crazy base case
impl Nesty for () {}

pub struct NestedRef3<'a, N: Nesty>(Option<&'a mut N>);

impl<'a, N: Nesty> Nesty for NestedRef3<'a, N> {}

#[test]
fn test_nested_ref_3() {
    let mut a = NestedRef3::<'_, ()>(None);
    {
        let mut b = NestedRef3(Some(&mut a));
        {
            NestedRef3(Some(&mut b));
        }
        NestedRef3(Some(&mut b));
    }
    NestedRef3(Some(&mut a));
}

Thanks for your help!

Similar to the trait solution but a little cleaner:

pub struct NestedRef<'a, T>(Option<&'a mut T>);

impl NestedRef<'_, ()> {
    fn new() -> Self {
        Self(None)
    }
}

#[test]
fn test_scoped_ref() {
    let mut a = NestedRef::new();
    {
        NestedRef(Some(&mut a));
    }
    NestedRef(Some(&mut a));
}

The new function is just for convenience, you could create it with NestedRef::<()>(None) as well. If there's a better solution without generics I'd be interested to see it as well, but if not I think this may be something that'll be addressed by the next gen borrow checker that's in development.

1 Like

Thanks, that's a really cool solution! Is it possible to add to it a function that recursively accesses the contents of the NestedRef?

I got stuck here:

impl<T> NestedRef<'_, T> {
    fn level(&mut self) -> usize {
        if let Some(inner) = self.0 {
            return inner.level() + 1
        } else {
            return 0
        }
    }
}

...but this won't do it because inner.level() doesn't exist, and I'm not sure how to make it exist without a trait.

I don't think a recursive function is possible, but if you're ok with storing the value inside the struct you can do this

pub struct NestedRef<'a, T> {
    val: Option<&'a mut T>,
    depth: usize,
}

impl<'a, T> NestedRef<'a, T> {
    fn new(t: &'a mut T) -> Self {
        Self {
            val: Some(t),
            depth: 0,
        }
    }
}

impl<'a, 'b, T> NestedRef<'a, NestedRef<'b, T>> {
    fn nest(val: &'a mut NestedRef<'b, T>) -> Self {
        NestedRef {
            depth: val.depth + 1,
            val: Some(val),
        }
    }
}

#[test]
fn test_scoped_ref() {
    let mut x = 123;
    let mut val = NestedRef::new(&mut x);
    assert_eq!(val.depth, 0);
    {
        let mut val = NestedRef::nest(&mut val);
        assert_eq!(val.depth, 1);
        {
            let val = NestedRef::nest(&mut val);
            assert_eq!(val.depth, 2);
        }
    }
    NestedRef::nest(&mut val);
}
1 Like

@occupy_paul_st Could you clarify the purpose here? It would make it much easier to come up with the right solution. The solutions so far proposed by @Heliozoa address the nested lifetime aspect of it, but also mean that one can't declare a variable:

let a: NestedRef<'a, ???>;

that can store an arbitrarily nested NestedRef since the specific level of nesting depth is part of the type itself.

Now, unfortunately, I don't fully understand why your original version doesn't work, and I wonder whether Polonius would help, but data structures of nested mutable borrows offer a veritable challenge to once's sanity. Depending on what precisely you need, you can get around the need for mutable references altogether:

  • If you don't really need to borrow, but you can own, then you would get a linked list:
    struct Nested(Option<Box<Nested>>);
    
    and can mutate things to your heart's content.
  • However, if you require borrowing for whichever reason, then you could use interior mutability with RefCell and use shared references instead:
    struct NestedRef<'a>(RefCell<Option<&'a NestedRef<'a>>>);
    
    (Playground with sample list traversal and editing.)
    RefCell comes with a run-time check on borrow()s (and is thus not free) and will panic if borrowing from the RefCell mutably while immutable borrows are alive. But it gives far more flexibility than dealing with nested mutable (re-)borrows all over the place (note that the excellent Too Many Linked Lists book doesn't deal with a chain of nested borrows like what you requested).
2 Likes

@Heliozoa that's a really cool solution! I'm not yet sure how to apply it to my problem, but it is really opening my eyes to generic programming without traits! :blush:

@marcianx yeah, that's a good point, it's probably hard to propose a good solution without understanding the greater context of what I'm working on.

I think there's a good question to whether I actually need to borrow here, or whether I could take ownership. I don't know yet! I think I'm reluctant to use Box and RefCell because I want to see how far I can get without adding too much performance overhead, but maybe they will end up being the best choice in the end.

Basically what I'm trying to do is experiment with "scoped operations" that can modify a data structure in place, but undo the modification when they go out of scope. I've tried a few different structures for the code, but here is my most recent:

pub trait VecScoped: Sized {
    type Element;

    // TODO this should only be accessible from within the trait, we don't want to give users access
    // to modify the vec because they might break an invariant
    fn vec_mut(&mut self) -> &mut Vec<Self::Element>;

    fn pushed(&mut self, value: Self::Element) -> Push<Self> {
        self.vec_mut().push(value);
        Push(self)
    }
}

impl<T> VecScoped for Vec<T> {
    type Element = T;

    fn vec_mut(&mut self) -> &mut Vec<Self::Element> {
        self
    }
}

pub struct Push<'a, V: VecScoped>(&'a mut V);

impl<'a, V: VecScoped> Drop for Push<'a, V> {
    fn drop(&mut self) {
        debug_assert!(
            self.0.vec_mut().pop().is_some(),
            "Someone has illicitly popped an element!"
        );
    }
}

impl<'a, V: VecScoped> VecScoped for Push<'a, V> {
    type Element = V::Element;

    fn vec_mut(&mut self) -> &mut Vec<Self::Element> {
        self.0.vec_mut()
    }
}

#[test]
fn test_scoped_vec() {
    let mut a = vec![1];
    {
        let mut b = a.pushed(2);
        {
            assert_eq!(&[1, 2, 3], b.pushed(3).vec_mut().as_slice());
        }
        assert_eq!(&[1, 2, -3], b.pushed(-3).vec_mut().as_slice());
    }
    assert_eq!(&[1, -2], a.pushed(-2).vec_mut().as_slice());
    assert_eq!(&[1], a.vec_mut().as_slice());
}

I see, thanks for that context!

RE performance, while Box requires a heap allocation, RefCell is stored in-place. The main performance cost is checking the value of one (borrow()) or two (borrom_mut() or other mutations) integers followed by an increment and a decrement when the borrow ends. The integer-checks should be reliably branch-predictable.

RE your strategy above, a few things come to mind:

  1. You might want to add #[must_use] on Push. Otherwise, a line like:

    a.pushed(2);
    

    is a noop since the returned Push is immediately Dropped.

  2. Technically, Rust cannot guarantee that Drop runs⁠—i.e. the user of the API could move Push into a heap allocation and store it. So as long as the user of this does something "reasonable", you should be fine.

  3. Note that since debug_assert! is stripped out in release mode, the following would be more appropriate:

    fn drop(&mut self) {
        let _val = self.0.vec_mut().pop();
        debug_assert!(
            _val.is_some(),
            "Someone has illicitly popped an element!"
        );
    }
    

A different approach might be to make a proxy object that itself tracks the Vec to be modified and separately, the operations applied to the Vec. that allows the user of the object to precisely control whether to commit or undo operations, undoing them by default on drop().

enum ScopedOps<E> {
    Push
}

pub struct ScopedVecUpdater<'a, E> {
  v: &'a mut Vec<E>,
  ops: Vec<ScopedOps<E>>,
}

impl <'a, E> ScopedVecUpdater<'a, E> {
    pub fn pushed(&mut self, val: E) {
        self.v.push(val);
        ops.push(ScopedOps::Push);
    }

    pub fn revert(&mut self) { // Also call this on `Drop`.
        while Some(op) = self.ops.pop() {
            match (op) {
                Push => self.v.pop();
            }
        }      
    }

    pub fn commit(&mut self) {
        self.ops.clear();
    }
}

Another alternative if you want to take @Heliozoa's approach is to make all operations be self-consuming and produce a new ScopedVecUpdater with a different type every time. This avoids the single Vec heap allocation if that's a concern, and making this a parallel list avoids the whole nested mut business. While this would be considered a "zero-cost" abstraction, anything that uses type-level chaining like this is also a great way to generate code bloat. :stuck_out_tongue_winking_eye: E.g. (warning: not tested; typo-landmine):

trait Undo {
    fn undo(&self, &mut vec);
}

impl Undo for () {
    fn undo(&self, &mut vec) {}
}

struct Push<U>(U);

impl<U: Undo> Undo for Push<U> {
    fn undo(&self, &mut vec) {
        vec.pop();
        self.0.undo(vec);
    }
}

pub struct ScopedVecUpdater<'a, E, U> {
  v: &'a mut Vec<E>,
  ops: U,
}

// IMPORTANT: All of these must _consume_ `self`.
impl <'a, E, U: Undo> ScopedVecUpdater<'a, E, U> {
    pub fn new(&v: Vec) -> ScopedVecUpdater<'_, E, ()> {
        ScopedVecUpdater(v, ())
    }

    pub fn pushed(self, val: E) -> ScopedVecUpdater<'a, E, Push<U>> {
        self.v.push(val);
        ScopedVecUpdater(v, Push(self.ops))
    }

    pub fn revert(self) {
        drop(self);
    }

    pub fn commit(self) {
        std::mem::forget(self); // Don't call `Drop`.
    }
}

impl <'a, E, U: Undo> Drop for ScopedVecUpdater<'a, E, U> {
   fn drop(&mut self) {
      self.ops.undo(self.v);
   }
}

(Edit: Compilation errors above. See my follow-up post below for the complete solution tested on the playground.)

You would use it like so:

let v = vec![1, 2];
{
  let scoped = ScopedVecUpdater(v);
  let scoped = scoped.push(3);
  let scoped = scoped.push(4); // v is [1, 2, 3, 4]
} // v is [1, 2] 

Ah, that's helpful! I suppose that checking a RefCell is probably not much of an expense if we're pushing an element onto a Vec anyways.

Yeah, that seems like a good idea! I can't think of a reason (besides) testing that you would ever want to not use the struct.

I think that's fine, because the user won't be able to access the underlying Vec until Drop runs anyways.

Oh man, that's a great bug! I didn't even think of that!

That is a neat approach; I hadn't thought of letting users choose "commit vs. undo." I would guess that storing the list of operations in a Vec would make the code more versatile, since it's not as reliant on Rust scopes.

Yeah, I guess my current approach would also produce a different type each time, which might be too bloaty.

I might suggest a couple changes to the self-consuming example:

  1. Calling revert would consume self, destroying the inner value. I think this means you would have to undo all the operations at once; to be able to undo an individual operation you might need like an into_inner.
  2. Calling std::mem::forget on self would prevent drop from being called, but wouldn't it also leak memory? I'm thinking that an alternative might be to store a should_revert boolean as part of the struct or something, then check that boolean in the Drop fn.

Thanks for all the suggestions!!

Ah, crud, you're right. In fact, I just realized that push() is also incorrect since we cannot move self.v and self.ops out and we still have to prevent self's destructor from running. No runtime flag is needed. Effectively, we need this helper function to pull out the fields from ScopedVecUpdater without triggering its destructor. This requires unsafe, unfortunately:

fn forget_into_parts(self) -> (&'a mut Vec<E>, U) {
    let mut v = MaybeUninit::<&'a mut Vec<E>>::uninit();
    let mut ops = MaybeUninit::<U>::uninit();
    unsafe {
        copy_nonoverlapping(&self.v, v.as_mut_ptr(), size_of::<&mut Vec<E>>());
        copy_nonoverlapping(&self.ops, ops.as_mut_ptr(), size_of::<U>());
    }
    forget(self); // Must be done before assume_init to avoid UB.
    unsafe { (v.assume_init(), ops.assume_init()) }
}

Then commit becomes:

pub fn commit(self) {
    // `Drop`s fields without reverting the changes.
    self.forget_into_parts();
}

You may ask "Why pull out the &mut Vec? It doesn't require dropping." Well, see below. :smiley:

To address your first point, we can add an undo_last which takes self by value, does only one level of undo, and unwraps the passed-in Undo by one level. (Note that Undo::undo must still take self by reference since it's called in ScopedVecUpdater's Drop, which receives &mut self and thus cannot move self.ops out.)

impl<U: Undo> Undo for Push<U> {
    type Undone = U;
    // ...
    fn undo_last<E>(self, vec: &mut Vec<E>) -> Self::Undone {
        vec.pop();
        self.0
    }
}
// ...
impl <'a, E, U: Undo> ScopedVecUpdater<'a, E, U> {
    // ...
    pub fn revert_last(self) -> ScopedVecUpdater<'a, E, U::Undone> {
        let (v, ops) = self.forget_into_parts();
        let ops = ops.undo_last(v);
        ScopedVecUpdater { v, ops }
    }
    // ...
}

And to be sure it works, here it is running on the playground.

So yea, I'm much more of a fan of the other solution with a heap-allocating parallel Vec of operations :stuck_out_tongue:. You can avoid heap allocations for a small number of operations by using SmallVec/TinyVec. Heck, if you have an upper bound on the number of operations, you could even use ArrayVec.

Are you sure you want to use temporary exclusive references here, instead of merely mutably referencing the objects?

&mut is inseparable from the "temporary" and "exclusive" aspects. If you want to reference something and be able to mutate it, but not strictly tie it to a scope of a variable, then Arc<Mutex<Nested>> is the type to do it.

Yeah, the conjunction of Drop with methods that take self by ownership is pretty gnarly!!

As a possible alternative, if we take ownership of the underlying Vec, it's possible to implement commit/undo functionality without any unsafe code, because we don't need to implement Drop. It's also, to me at least, a bit simpler:

pub trait VecScoped: Sized {
    type Element;

    // TODO this should only be accessible from within the trait, we don't want to give users access
    // to modify the vec because they might break an invariant
    fn vec_mut(&mut self) -> &mut Vec<Self::Element>;

    // TODO add a way to turn this into a &Vec. Could Deref do the trick?

    fn pushed(mut self, value: Self::Element) -> Push<Self> {
        self.vec_mut().push(value);
        Push(self)
    }
}

impl<T> VecScoped for Vec<T> {
    type Element = T;

    fn vec_mut(&mut self) -> &mut Vec<Self::Element> {
        self
    }
}

/// TODO this should probably be must_use
pub struct Push<V: VecScoped>(V);

impl<V: VecScoped> Push<V> {
    /// In order to implement both into_inner and Drop, we'll need some sketchy trickery
    /// https://phaazon.net/blog/rust-no-drop
    pub fn undo(mut self) -> V {
        let _did_pop = self.0.vec_mut().pop().is_some();
        debug_assert!(_did_pop, "Someone has illicitly popped an element!");
        self.0
    }

    pub fn commit(self) -> V {
        self.0
    }
}

impl<V: VecScoped> VecScoped for Push<V> {
    type Element = V::Element;

    fn vec_mut(&mut self) -> &mut Vec<Self::Element> {
        self.0.vec_mut()
    }
}

#[test]
fn test_scoped_vec() {
    let a = vec![1];
    let mut b = a.pushed(2);
    assert_eq!(&[1, 2], b.vec_mut().as_slice());
    let mut a = b.undo();
    assert_eq!(&[1], a.vec_mut().as_slice());

    let mut a = a.pushed(-2).commit();
    assert_eq!(&[1, -2], a.vec_mut().as_slice());
}

That is a great question, I am not sure! I have put a demo of the &mut-based solution on GitHub here. I thought I wanted to tie each operation to a particular scope, but maybe I could enable more use cases by not tying it to scopes. Right now, the usage looks like:

use scoped_ops::VecScoped;

let mut a = vec![1];
{
    let mut b = a.pushed(2);
    assert_eq!([1, 2], *b);
}  // b drops, and undoes its change

assert_eq!([1], *a);

The purpose of Drop there was just to undo the operations on drop. If that's not required, then all the unsafe goes away in the implementation I provided above since we can simply pull out the fields of the struct when consuming it. This is completely orthogonal to whether or you own the Vec—that change can be trivially done to the existing implementations.

It sounds like you are trying to create an Intrusive Linked List. You might want to check out the intrusive-collections crate for some inspiration.

Thanks for the links! Looking at intrusive-collections, I'm thinking "scoped operations" are a bit similar to "scoped collections," but with a couple differences:

  1. scoped-ops stores the data contiguously, rather than in a linked list. So it's possible to get a slice, whereas in intrusive-collections I believe you would traverse the list with a cursor.
  2. scoped-ops supports any type T: Sized, whereas the intrusive collection requires that the inner type has the ability to store an intrusive collection link.

Am I missing any interesting analogies between "scoped operations" and intrusive collections?

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.