Building a tree structure

Hi,

I am a beginner on Rust, and it is driving me crazy!

I am trying to build a tree structure for word completion (it is an exercise in rust, not a productions thing, so bare with me). The idea is that each node holds a letter, and a Vec or for the letters of the next letters for all other words. Here is the code I have so far:

#![allow(dead_code)]
#![allow(unused_variables)]

//#[derive(Clone, Copy)]
struct Tree { node: Node }

//#[derive(Clone, Copy)]
struct Node {
    val: char,
    nodes: Vec<Node>,
}

impl Tree {
    fn new() -> Self {
        Tree {node: Node{val: ' ', nodes: Vec::new()}}
    }

    fn insert(&mut self, word: &str) {
        self.node.insert(word) 
    }
}

impl Node {
    fn insert(&mut self, word: &str) {
        let mut cur_node = self;

        for c in word.chars() {
            let next_node_op = Node::get_child_node(&cur_node.nodes, c);

            match next_node_op {
                Some(mut node) => {cur_node = node;},
                None => {
                    let new_node = Node{val:c, nodes: Vec::new()};
                    cur_node.nodes.push(*new_node);
                    cur_node = new_node;
                    }
                }
        }
    }

    fn get_child_node<'a>(v: &Vec<Node>, curr_c: char) -> Option<&mut Node> {
        v.iter_mut().find(|n| n.val == curr_c)
    }
}


fn main() {
}

I get the following compilation error:

error[E0614]: type `Node` cannot be dereferenced
  --> src/main.rs:34:41
   |
34 |                     cur_node.nodes.push(*new_node);
   |                                         ^^^^^^^^^

error[E0308]: mismatched types
  --> src/main.rs:35:32
   |
35 |                     cur_node = new_node;
   |                                ^^^^^^^^ expected `&mut Node`, found struct `Node`
   |
help: consider dereferencing here to assign to the mutable borrowed piece of memory
   |
35 |                     *cur_node = new_node;
   |                     ^^^^^^^^^

I have been playing around with mutability and references for hours, and could not figure this out. Any help would be greatly appreciated.

Thanks!

You're trying to use Rust references as if they were pointers (using objects "by reference"). That doesn't work with Rust's references despite name similarity. &mut is a temporary exclusive borrow, and that's a very specific limited feature, not a general-purpose way to refer to an object.

let new_node = Node{val:c, nodes: Vec::new()};
cur_node.nodes.push(*new_node);
cur_node = new_node;

You can't do this, because new_node is a value, not a borrow. You can't turn a value into a borrow without ensuring it has storage (an owner) that can be proven to live long enough for the borrow to remain valid for the entire scope of the variable holding it.

And because your storage is in cur_node.nodes, which is mutable, it can't guarantee that references to its content remain valid. Every push can reallocate and invalidate all borrows referencing it.

The easy way out is instead of using &mut Node, use Rc<RefCell<Node>> (same as Arc<Mutex<Node>> for multi-threaded case). Rc is a shared reference that also automatically handles ownership, and RefCell allows having mutability with less strict rules about exclusive access.

In languages like Python every reference to every object is essentially like Rc<RefCell<Object>>. Rust just likes to make such overhead painfully obvious.

2 Likes

An alternative that works in this case is to work with vector indexes instead of references until the loop is ready to traverse into the next level of the tree:


impl Node {
    fn insert(&mut self, word: &str) {
        let mut cur_node = self;

        for c in word.chars() {
            let next_node_idx = Node::get_child_idx(&cur_node.nodes, c);

            cur_node = match next_node_idx {
                Some(x) => &mut cur_node.nodes[x],
                None => {
                    let new_node = Node {
                        val: c,
                        nodes: Vec::new(),
                    };
                    cur_node.nodes.push(new_node);
                    cur_node.nodes.last_mut().unwrap()
                }
            }
        }
    }

    fn get_child_idx<'a>(v: &Vec<Node>, curr_c: char) -> Option<usize> {
        v.iter()
            .enumerate()
            .find(|(_, n)| n.val == curr_c)
            .map(|(i, _)| i)
    }
}

(Playground)

2 Likes

Thanks for the detailed explanation.

So would using lifetime attributes help me here, or would the RefCell<rc> be the only option?

At minimum, new_node needs to be moved from the local variable into cur_node.nodes, so you can’t have an & or &mut reference to it when that happens. There’s two ways to get around this:

  • Rc will let you have two independent pointers to the same object. One of them gets moved into the vector and the other one can be used however you like.
  • You can move the node into the vector, and then ask the vector for an &mut reference to it. This will prevent changes to the vector until you’re done with the reference, but avoids the extra heap allocation of Rc.

Lifetime annotations are used to describe the relationship between several references in an interface with a hidden interior (function signatures, for example). That’s not really the case here.

1 Like

Thanks @2e71828, this is helpful, but...

I still cannot make it work. Any chance I can ask you for a sample. The thing I am getting stuck on is this error:

error[E0502]: cannot borrow `cur_node.nodes` as mutable because it is also borrowed as immutable
   --> src/main.rs:121:21
    |
115 |             let next_node_op = Node::get_child_node(&cur_node.nodes, c);
    |                                                     ---------------
    |                                                     |
    |                                                     immutable borrow occurs here
    |                                                     immutable borrow later used here
...
121 |                     cur_node.nodes.push(new_node);
    |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here

error[E0596]: cannot borrow `*v` as mutable, as it is behind a `&` reference
   --> src/main.rs:129:9
    |
128 |     fn get_child_node(v: &Vec<Node>, curr_c: char) -> Option<&mut Node> {
    |                          ---------- help: consider changing this to be a mutable reference: `&mut std::vec::Vec<Node>`
129 |         v.iter_mut().find(|n| n.val == curr_c)
    |         ^ `v` is a `&` reference, so the data it refers to cannot be borrowed as mutable

And my current code is:

    fn insert(&mut self, word: &str) {
        let mut cur_node = self;

        for c in word.chars() {
            let next_node_op = Node::get_child_node(&cur_node.nodes, c);

            match next_node_op {
                Some(node) => {cur_node = node;},
                None => {
                    let new_node = Node{val:c, nodes: Vec::new(), eow: false};
                    cur_node.nodes.push(new_node);
                    //cur_node = new_node;
                }
            }
        }
    }

    fn get_child_node(v: &Vec<Node>, curr_c: char) -> Option<&mut Node> {
        v.iter_mut().find(|n| n.val == curr_c)
    }

Thanks.

Lifetime annotations don't do anything. They don't generate any code, and they can't make anything live longer. They only explain to the compiler what the code actually does.

In generic code it's often necessary to have correct annotations to help compiler reason about hypothetical code. But in non-generic function the borrow checker can already see it won't work, so you can't annotate your way out of this. You need to change what the code does.

Your get_child_node can't work by Rust's aliasing rules. You can't take a shared/immutable borrow and return a mutable borrow out of it. Borrows are either shared or mutable, but you can never have two mutable borrows of the same thing. This rule is necessary to guarantee thread-safety.

let v = &vec_of_nodes;
let a = get_child_node(v, 'x');
let b = get_child_node(v, 'x');
// Now both a and b exist, and are both mutable. Rust's universe collapses.

That's why 2e71828 suggested using an index. Integers aren't restricted by borrowing rules. You can keep an integer and borrow from the vec later, in a smaller scope.

3 Likes