Questionable Smart Pointer and Aliasable Mutability

zachmatson · July 8, 2020, 12:35am

I was trying to implement Monte Carlo Tree Search, with some methods on the nodes in my tree data structure accessing a child node, continuing to select child nodes to arbitrary depth, and eventually one child node backpropagating some data up through all of its parent nodes, which are all mutated in this process. The tree structure also gets pruned in a way that some parent nodes will be deleted and all of that parent's children will also need to be deleted, which I thought made it a poor fit for some arena memory approach.

I ended up implementing this with this smart pointer, but it allows for aliasable mutability. Is this type of setup bad practice?

use std::ptr::NonNull;

/// Smart pointer to use in tree data structures  
/// The `TreePtr` type is a strong pointer which can be used safely  
/// This type owns its contents, and can be used for parent nodes to
/// own their children
///
/// Will drop/cleanup its contents when out of scope.
pub struct TreePtr<T> {
    target: Option<NonNull<T>>,
}

/// Weak version of `TreePtr`, does not own its contents
/// Can be used by nodes to refer to themselves or their parent
///
/// For safe usage, the weak pointer should be used for self-referential pointing
/// or for pointing to a structure which owns the strong version of the pointer.  
/// This ensures the required behavior of the parent pointer being dropped before the weak version.
///
/// Allows `Send` but is not thread safe. Use wisely.
pub struct TreePtrWeak<T> {
    target: Option<NonNull<T>>,
}

impl<T> Drop for TreePtr<T> {
    fn drop(&mut self) {
        if let Some(target) = self.target {
            unsafe {
                Box::from_raw(NonNull::as_ptr(target));
            }
        }
    }
}

impl<T> TreePtr<T> {
    /// Get a strong `TreePtr` with no contents  
    /// A null `TreePtr` cannot be dereferenced, and will panic  
    /// Therefore, you must ensure to immediately change this pointer to refer to something
    /// before attempting to dereference it
    pub unsafe fn null() -> Self {
        TreePtr { target: None }
    }

    /// Allocate `val` on the stack and return a strong `TreePtr` to it
    pub fn new(val: T) -> Self {
        // Will always be some, pointer isn't null
        let target = NonNull::new(Box::leak(Box::new(val)));
        TreePtr { target }
    }

    /// Get weak `TreePtrWeak` corresponding to strong `TreePtr`  
    /// Cannot be safely dereferenced after `ptr` is dropped  
    /// To use safely, `ptr` should be owned by the same structure as the returned `TreePtrWeak`,  
    /// or the returned `TreePtrWeak` should be otherwise indirectly owned by the owner of `ptr`
    pub fn downgrade(ptr: &Self) -> TreePtrWeak<T> {
        TreePtrWeak { target: ptr.target }
    }
}

impl<T> Clone for TreePtrWeak<T> {
    /// Create a new `TreePtrWeak` to the same underlying data without copying that data  
    /// Ensure the cloned pointer meets the same safety requirements with respect to the parent
    /// `TreePtr` owning the underlying data.
    fn clone(&self) -> Self {
        unsafe {
            TreePtrWeak {
                target: self
                    .target
                    .and_then(|ptr| Some(NonNull::new_unchecked(ptr.as_ptr()))),
            }
        }
    }
}

impl<T> std::ops::Deref for TreePtr<T> {
    type Target = T;

    /// Dereference the pointer
    ///
    /// ### Panics
    /// This operation will panic if used on the output of `TreePtr::null()`
    /// Instead, replace that pointer with something else before dereferencing
    fn deref(&self) -> &Self::Target {
        unsafe {
            &*self
                .target
                .expect("'Null' CleanupPtr dereferenced")
                .as_ptr()
        }
    }
}

impl<T> std::ops::DerefMut for TreePtr<T> {
    /// Mutably dereference the pointer
    ///
    /// ### Panics
    /// This operation will panic if used on the output of `TreePtr::null()`
    /// Instead, replace that pointer with something else before dereferencing
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe {
            &mut *self
                .target
                .expect("'Null' CleanupPtr dereferenced")
                .as_ptr()
        }
    }
}

impl<T> TreePtrWeak<T> {
    /// Unsafely dereference a `TreePtrWeak`  
    /// This is **undefined behavior** if the parent `TreePtr`  
    /// the `TreePtrWeak` was cloned from has been dropped.  
    /// You must ensure the `TreePtrWeak` is dropped with its parent,  
    /// and therefore cannot be dereferenced after its parent is dropped.
    ///
    /// ### Panics
    /// This operation will panic if the parent `TreePtr` was the output of `TreePtr::null()`
    pub unsafe fn deref(&self) -> &T {
        &*self
            .target
            .expect("'Null' CleanupPtrWeak dereferenced")
            .as_ptr()
    }

    /// Unsafely mutably dereference a `TreePtrWeak`  
    /// This is **undefined behavior** if the parent `TreePtr`  
    /// the `TreePtrWeak` was cloned from has been dropped.  
    /// You must ensure the `TreePtrWeak` is dropped with its parent,  
    /// and therefore cannot be dereferenced after its parent is dropped.
    ///
    /// ### Panics
    /// This operation will panic if the parent `TreePtr` was the output of `TreePtr::null()`
    pub unsafe fn deref_mut(&mut self) -> &mut T {
        &mut *self
            .target
            .expect("'Null' CleanupPtrWeak dereferenced")
            .as_ptr()
    }
}

unsafe impl<T> Send for TreePtrWeak<T> {}

For additional clarification, the pointer was used only as an element of this struct, where children are always dropped when their parents are dropped because their parents own the strong pointers:

pub struct MCTSNode {
    pub parent: Option<TreePtrWeak<MCTSNode>>,
    pub this: TreePtrWeak<MCTSNode>,
    pub children: Vec<TreePtr<MCTSNode>>,
    ...
}

Hyeonu · July 8, 2020, 12:43am

Why don't you use standard [A]Rc/Weak pair in stdlib for it? The pointers are aliased so you should care about concurrency before mutating nodes.

zachmatson · July 8, 2020, 12:46am

Mainly worried about the overhead of reference counted pointers. I liked the strong/weak aspect but it seems like if you can know there is only one strong pointer and the weak pointers don't outlive it, then it is better to avoid runtime costs to ensure the same.

Hyeonu · July 8, 2020, 1:00am

What exact cost are you worrying here? For memory, yes it takes 2 usizes per allocation. For time, it doesn't take any cost if you don't clone/drop the ptr.

Anyway, you shouldn't make it faster before make it work correctly. Writing such smart pointer correctly is hard task. Why don't you exploit exiting trusted component first and optimize later with proper benchmark?

CAD97 · July 8, 2020, 2:24am

You can't mutate behind this pointer, ~~because Box::leak gives you &_~~ [EDIT: misremembered and didn't check]. You want to use Box::into_raw here, not Box::leak.

[EDIT: Box::into_raw_non_null is deprecated in 1.44 (and still unstable), recommending to "use Box::leak(b).into() or NonNull::from(Box::leak(b)) instead". Note to self: always double check these things.]

There is nothing strictly wrong with this API, but it is definitely not easy to use. Most tree impls that aren't using an arena stick to using raw pointers directly or Option<NonNull<T>> to avoid giving false impressions that they're safer than they are.

Rather than having parent pointers in your tree, have you considered building a stack of visited nodes while walking down the tree? Then the algorithm implementation just walks back up the trace of how you walked down the tree and applies any necessary adjustments.

Hyeonu · July 8, 2020, 2:35am

Box::leak gives &'a mut T. It seems OP uses it instead of Box::into_raw to avoid null check at constructing NonNull<T>. Box::into_raw_non_null is not stable yet, so I don't think it's a bad usage of Box::leak for this purpose.

s3bk · July 8, 2020, 9:03am

One trick I have used previously is to avoid a parent reference alltogether, and track the parent-chain at runtime when traversing the graph. However it does mean that you can't reference nodes anymore.
If you do need to reference nodes, Rc is probably the only choice to avoid memory bugs.

Yandros · July 8, 2020, 4:47pm

Just remove the DerefMut, which is something you wouldn't have had with the safer alternative of using [A]Rc + Weaks. Then, to be able to mutate your pointers, you will need to use Cell (or somehting like AtomicCell for the thread-safe variant).

I agree with @Hyeonu that you should start by using Rc/Arc, and only going down the raw pointer path afterwards. This should ensure that you don't accidentally add functionality (such as &mut accesses) that weren't possible to soundly have to begin with.

zachmatson · July 9, 2020, 3:30pm

Thanks to everyone for these comments. This was just a little toy project but I can see now that I shouldn't use this approach again

system · October 7, 2020, 3:30pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Trying to remove unsafe from the code help	4	678	January 10, 2021
Smart mutable/immutable reference using PhantomData? help	2	611	January 12, 2023
Designing a Cursor for tree traversal and mutation help	3	838	April 30, 2020
Mutable reference to tree descendent help	3	495	July 13, 2022
Iterative traversal of Binary Tree & Interior Mutability help	3	274	August 5, 2023

Questionable Smart Pointer and Aliasable Mutability

Related Topics