Cursor vs Position

I'm writing a general, n-ary hierarchical (tree) structure and I'm looking for some design advice. I currently have both a CursorMut and Position struct for the implementation. The Position struct is really just a safe handle to the underlying Node<T> in the vein of a type alias like type Link<T> = Option<NonNull<Node<T>>>;, except I can keep it private(?). I'm wondering if this approach is redundant/provides too much indirection, or if this is the correct approach for the safe/generic API. Either way, I'm curious to hear your opinions about what kinds of operations each of these member structs should have, if any.

The private Node:

struct Node<T> {
    parent: Option<Position<T>>,
    children: Vec<Position<T>>, // Always exists for a Node, even if empty
    data: Option<T>,
}

A sketch of the (mostly implemented/tested) public API:

impl<T> GenTree<T>
pub fn new() -> GenTree<T>
pub fn root(&self) -> Position<T>
pub fn cursor_mut(&mut self) -> CursorMut<'_, T>
// Unimplemented
pub fn cursor_from(&mut self, position: &Position<T>) -> CursorMut<'_, T>
// Unimplemented
pub fn depth(&self, node: &Position<T>) -> usize
// Unimplemented
pub fn height(&self, node: &Position<T>) -> usize

impl<T> Position<T>
pub fn get_data(&self) -> Option<&T>

impl<'a, T> CursorMut<'a, T>
pub fn is_root(&self) -> bool
pub fn num_children(&self) -> usize
pub fn get_data(&self) -> Option<&T>
pub fn add_child(&mut self, data: T)
pub fn delete(&mut self) -> Option<T>
pub fn current(&self) -> &Position<T>
pub fn jump(&mut self, new: &Position<T>)
pub fn ascend(&mut self) -> Result<(), String>
pub fn children(&self) -> Vec<Position<T>>

This allows me to do things like:

        for position in cursor.children() {
            if position.get_data().unwrap().title == "Marine".to_string() {
                cursor.jump(&position);
                deleted = cursor.delete().unwrap();
            }
        }
        let mut tree: GenTree<Heading> = GenTree::<Heading>::new();
        let mut cursor = tree.cursor_mut(); // Sets cursor to tree.root

        // Constructs tree from Vec<T>
        for node in data {
            let data_level = node.level;

            // Case 1: Adds a child to the current parent
            if data_level == cur_level + 1 {
                cursor.add_child(node);
                cur_level += 1;
            }
            // Case 2: Adds a child with multi-generational skips
            else if data_level > cur_level {
                let diff = data_level - cur_level;
                for _ in 1..diff {
                    let empty = Heading::new("[]".to_string(), 0);
                    cursor.add_child(empty);
                    cur_level += 1;
                }
                cursor.add_child(node);
                cur_level += 1;
            }
            // Case 3: Adds sibling to current parent
            else if data_level == cur_level {
                cursor.ascend().ok();
                cursor.add_child(node);
            }
            // Case 4: Adds a child to the appropriate ancestor,
            // ensuring proper generational skips
            else {
                let diff = cur_level - data_level;
                for _ in 0..=diff {
                    cursor.ascend().ok();
                    cur_level -= 1;
                }
                cursor.add_child(node);
                cur_level += 1;
            }
        }

How do you prevent this from being a use-after-free if the GenTree is dropped while the Position is kept? It seems like this would require the nodes to be reference-counted.

Haha this is exactly the kind of glaringly obvious (and overlooked) problem that I was hoping to get perspective on by coming here!

FWIW my implementation produces errors with CursorMut::get_data() too. :thinking:

        let mut outer_tree: GenTree<Heading> = builder::construct(0, one);
        let mut _pos: Position<Heading> = outer_tree.root();
        let mut cursor = outer_tree.cursor_mut();

        {
            let inner_tree: GenTree<Heading> = builder::construct(0, two);
            _pos = inner_tree.root();
            cursor.jump(&_pos);
        }

        // UB: Attempts to access dangling pointer on CursorMut::get_data() 
        // and Position::get_data() :(
        let _oopsie = cursor.get_data();
        let _oopsie = _pos.get_data();

So is Rc/Arc my best bet here or is there a sleeker way to handle this with raw pointers? It seems like this is a perfect use case for an arena-allocated backing structure, which I plan on writing next, but as an exercise I'm still invested in making this pointer-based structure as safe and efficient as possible (within reason, given my current experience level :sweat_smile:).

Arc/Rc is the way to do that.

Note that an arena doesn't solve the fundamental problem that you can't drop the owner of the data (the arena), or deallocate pieces of the arena, while there are active references to it. In general that's what motivates the use of either ref counting (Rc/Arc) or indexes/ids of some kind rather than references (& and &mut) when sharing objects.

1 Like

Update: I was able to noodle around with Rc and RefCell today and got it working, I think, but boy is it painful. Im not familiar enough to judge what kind of impact returning Ref types has. Im also not used to sprinkling so many clone()s around, which felt unnatural, even if its just a pointer clone.

Position now has a public function.

pub struct Position<T> {
    ptr: Option<Rc<RefCell<Node<T>>>>,
}
impl<T> Position<T> {
    pub fn get_data(&self) -> Option<Ref<T>> {
        let node_ref: Ref<Node<T>> = self.ptr.as_ref()?.borrow();
        Ref::filter_map(node_ref, |node| node.data.as_ref()).ok()
   }
...
}

And Node now has weak parent pointers to avoid reference cycles.

struct Node<T> {
    parent: Option<Weak<RefCell<Node<T>>>>,
    children: Vec<Position<T>>, // Always exists for a Node, even if empty
    data: Option<T>,
}

The CursorMut still feels a little wonky.

impl<'a, T> CursorMut<'a, T>
pub fn is_root(&self) -> bool
pub fn is_some(&self) -> bool
pub fn is_none(&self) -> bool
pub fn num_children(&self) -> usize
pub fn get_data(&self) -> Option<Ref<'_, T>>
pub fn add_child(&mut self, data: T)
pub fn delete(&mut self) -> Option<T>
pub fn current(&self) -> &Position<T>
pub fn jump(&mut self, new: &Position<T>)
pub fn ascend(&mut self) -> Result<(), String>
pub fn children(&self) -> Vec<Position<T>>