Implementing a very basic trie


#1

I’m trying to implement a very basic trie with a positive value stored at every node, -1 if no value is present.

Here is a simple insertion code that i wrote.


struct Node {
    chars: HashMap<char, Node>,
    val: i32,
}

struct Trie {
    root: Node,
}
 
impl Trie {

fn add_string(&mut self, string: String, val: i32) {
        let mut current_node = &mut self.root;
        for c in string.chars(){
            let new_node = & mut (current_node.chars.entry(c).or_insert(Node {
                chars: HashMap::new(),
                val: -1,
                });
            current_node = new_node;
         }
        current_node.val = val;
        }
    }
}

Obviously this doesn’t work because of borrow issues. What is the best way to do this trivial case of insertion into trie, without recusrion.


#2

Amusingly, the functional approach with Iterator::fold seems more amicable for the borrow checker:

fn add_string(&mut self, string: String, val: i32) {
    let mut last_node = string.chars().fold(&mut self.root, |current_node, c| {
        current_node.chars.entry(c).or_insert(Node {
            chars: HashMap::new(),
            val: -1,
        })
    });
    last_node.val = val;
}

#3

You can make this work with imperative code too, but you need to know about a trick. First, here’s the code:

use std::collections::HashMap;

#[derive(Debug)]
struct Node {
    chars: HashMap<char, Node>,
    val: Option<i32>,
}

#[derive(Debug)]
struct Trie {
    root: Node,
}

impl Trie {
    fn new() -> Trie {
        Trie {
            root: Node {
                chars: HashMap::new(),
                val: None,
            },
        }
    }
    
    fn add_string(&mut self, string: String, val: i32) {
        let mut current_node = &mut self.root;
        for c in string.chars() {
            current_node = moving(current_node).chars
                .entry(c)
                .or_insert(Node {
                    chars: HashMap::new(),
                    val: None,
                });
        }
        current_node.val = Some(val);
    }
}

fn moving<T>(t: T) -> T { t }

fn main() {
    let mut trie = Trie::new();
    trie.add_string("foo".to_string(), 1);
    trie.add_string("foobar".to_string(), 2);
    println!("{:#?}", trie);
}

Playground: https://is.gd/EHRe8Q

The trick is to force Rust to move current_node instead of reborrowing it. The moving function (which is just the identity function) does just that. @bluss describes it wonderfully here: http://bluss.github.io/rust/fun/2015/10/11/stuff-the-identity-function-does/


#4

(Unrelated to the question) A few suggestions to make your code better and more idiomatic:

You don’t have to do tricks like this in Rust, 'cause Rut has first-class optional values. You can use Option<u32> and be sure that you (a) never forget to check for a missing value (b) never accidentally pass a negative number where it has a special value.

Next, it is often super convenient to implement (or derive) the Default trait for your type. You can then just use Node::default() inside or_insert() call or in the new() method.

Finally, if a function just “reads” a string – does not need to modify it or take over the ownership – make the function take a &str (a string slice) instead of a String. If a function takes a String, each caller has to allocate a copy of the string on the heap (if it hasn’t an extra one already) and pass it to the function, only to have that function “read”/“look at” the string and discard (de-allocate) it. With &str, the function only gets a reference to an existing string buffer (or to a static string built into the executable file) – just enough to “read” it, and no need to allocate an extra copy for the caller.

With these changes, the code posted by @BurntSushi becomes (https://is.gd/3FBdGv):


use std::collections::HashMap;

#[derive(Debug, Default)]
struct Node {
    chars: HashMap<char, Node>,
    val: Option<u32>,
}

#[derive(Debug)]
struct Trie {
    root: Node,
}

impl Trie {
    fn new() -> Trie {
        Trie { root: Node::default() }
    }

    fn add_string(&mut self, string: &str, val: Option<u32>) {
        let mut node = &mut self.root;
        for c in string.chars() {
            node = moving(node).chars
                .entry(c)
                .or_insert(Node::default());
        }
        node.val = val;
    }
}

fn moving<T>(t: T) -> T { t }

fn main() {
    let mut trie = Trie::new();
    trie.add_string("foo", Some(1));
    trie.add_string("foobar", Some(2));
    println!("{:#?}", trie);
}