Need help creating a Tree/Node struct

I'm new to Rust and facing the steep learning curve. I am trying to learn how to code certain design patterns using Rust structs that I would normally code as a class in Java or Swift. I am struggling with the correct way to code these. I need help fixing the struct as well as how to call it. I would then like to create a function to traverse the tree and log each node.

Below is one example where I am trying to code a tree/node struct. The Node has a string value, an optional reference to a parent Node, and a vector/list of child nodes. I was unsure as to whether the children should also be references or owned.

struct Node<'a> {
    parent: Option<&'a Node<'a>>,
    value: String,
    children:  Vec<&'a Node<'a>>,
}

impl Node<'_> {
    pub fn addNode(&self, child: &Node) {
        self.children.push(child);
    }
}

fn main() {
    println!("Tree nodes");

    let root = Node {
        parent: None,
        value: String::from(""),
        children: Vec<&Node>::new(),
    };

    let n1 = Node {
        parent: Some(&root),
        value: String::from("a"),
    };
    let n2 = Node {
        parent: Some(&n1),
        value: String::from("p"),
    };
    let n3 = Node {
        parent: Some(&n2),
        value: String::from("ple"),
    };

    println!("n value is {}", root.value);
    println!("n1 value is {}", n1.value);
    println!("n2 value is {}", n2.value);
    println!("n3 value is {}", n3.value);
}

You can't use references to build trees or graphs. References are intended only for giving out temporary immutable views into an existing data structure. Among other things, they enforce that the thing they point at cannot be modified for the entire duration in which the reference exists.

As for owned references (i.e. Box), you also cannot use those because of your parent pointers. Ownership cannot be cyclical, but node.children[0].parent would give you back node, which is cyclical.

This leaves you three options:

Remove the parent pointers.

If we remove the parent pointers, then it is no longer cyclical and we can use ownership. Note that Box isn't needed because Vec already puts the elements on the heap.

struct Node {
    value: String,
    children: Vec<Node>,
}

Use reference counting for shared ownership

The Rust language provides a type called Rc that allows for shared ownership.

struct Node {
    parent: Option<Weak<RefCell<Node>>>,
    value: String,
    children: Vec<Rc<RefCell<Node>>>,
}

Unfortunately, the Rc type provides only immutable access to its contents. To modify the inner value, a RefCell is necessary. This strategy can be pretty unpleasant for that reason, and I recommend avoiding it.

(If you don't need to modify the nodes after creating them, then you do not need the RefCell.)

Use integers instead of references

People tend to find this option ugly, but it actually works really well.

struct Graph {
    nodes: Vec<Node>,
}

struct Node {
    parent: Option<usize>,
    value: String,
    children: Vec<usize>,
}

Notice how none of these solutions involve any lifetimes. A struct should never be annotated with a lifetime unless it is a temporary view into an existing data structure.

11 Likes

Thanks, this is the most concise answer to this topic I have found yet. Would you be interested in co-authoring an article about trees and self-referential data structures in rust?

1 Like

I already have some other articles I would like to finish first before I start a new one. You are welcome to write your own article based on my answer if you want to.

3 Likes

Thank you but this now presents a larger challenge. I was using this example to understand how to model relationships between structs much like one would use with classes. e.g. to-one, to-many, optional to-one, optional to-many, many-to-many, etc... I would like to see a cheat sheet on how to model these data structures if one exists.

1 Like

Ok, I will if I get to it. Thanks for the permission.

Good idea

1 Like

The ownership model of Rust makes some ways of structuring your structs much easier than others. For example, in some languages you might have a bunch of objects that point at each other in lots of complicated and cyclical ways. This kind of thing simply doesn't work very well in Rust. Learning to choose a layout for your data that works well with Rust's ownership model is part of the learning curve.

In general, the different ways of storing a value are:

  1. When you have only a single owner of the value, you can store it by ownership. In this case, you simply put the type without any wrappers. (If the type stores itself, you need a Box around it.)
  2. When the value is immutable but has many owners, then you can store it in an Rc. Cloning this type does not clone the data inside it. (If the value is small, cloning the value itself may be a better option.)
  3. When the value is mutable and has many owners (but is not cyclical), then you need an Rc+RefCell combination to store it. Accessing the data can be cumbersome because the RefCell is annoying to work with.
  4. When the value is mutable, has many owners, and lives in a cyclical structure, then you need to use the weak from of Rc to break the cycles.
4 Likes

Hmm, I have found it quite straight-forward. You have borrow() or borrow_mut() calls, but I didn't find it annoying. If anything, RwLock is more annoying! I agree you should avoid RefCell if possible, but if you need Rc/RefCell or Arc/RwLock, well that is what you need, and it works just fine according to my experience. It is a bit tricky to know what is appropriate without knowing what problem is being solved.

An example: lib.rs - source

( Oh, also, there are other ways to break cycles. )

I think Alice is referring to how you need to write Rc<RefCell<T>> instead of a plain T, plus every .borrow_mut() access adds an extra 13 characters. It's not a deal breaker, but (deliberately) not as ergonomic as working with Box<T> via Deref and DerefMut.

I normally treat the need for a Rc<RefCell<_>> or Arc<RwLock<_>> as a hint that my code's ownership story is starting to get complicated and there are extra concerns to be aware of (deadlocks/panics due to re-entrant access, reference cycles, mutation from multiple places, etc.).

1 Like

this article may help you understand how to implement some data struct in rust.

Learn Rust With Entirely Too Many Linked Lists

1 Like

Thank you! That looks awesome. I will check it out.