DAWG data structure - moving from C to Rust

I've been writing an algorithm for creating DAWG data structures in an online fashion (more about that in paper "Incremental Construction of Minimal Acyclic Finite-State Automata" by Jan Daciuk & al.).

I've implemented the algorithm in C and arriving at the end the problem I had is memory not getting properly freed because of joint nodes in my DAFSA. Anyway, I'm now trying to implement that in Rust because I know that I can have smart pointers like Rc that will do the trick.

So here are my structs:

type SharedDawgState = Rc<RefCell<DawgState>>;

struct DawgState {
    transitions: BTreeMap<char, SharedDawgState>
    //...
}

struct Dawg {
    start: SharedDawgState,
    //...
}

This works pretty much as a Trie, I start that Dawg.start and follow the transitions in DawgState. So, to get the longest prefix of a string that is already in the Trie, I do the following:

fn common_prefix(&self, word: &String) -> Option<String> {
    if (*self.start).as_ptr().is_null() {
        return None;
    }

    let mut current = self.start.as_ptr();
    let mut prefix = String::from("");

    for c in word.chars() {
        unsafe {
            if !(*current).transitions.contains_key(&c) {
                break;
            }

            current = (*current).transitions.get(&c)?.as_ptr();
        }

        prefix.push(c);
    }

    Some(prefix)
 }

But in my opinion this looks weird. I cannot say if I'm being influenced by the way I write C code (following a pointer through) or if this is a common way to use the Rc<RefCell<>>.

So, I wanted to know if there is a safe way to go down this Trie and access the ShareDawgState so that I can either do something like common_prefix or simply return a reference to the Trie node.

1 Like

You already have Rcs wrapping all of the elements so you can just clone them and borrow into the RefCell rather than falling back on raw pointers. That allows RefCell to do it's runtime validity checks, which as_ptr can't enforce. [1]

Playground

fn common_prefix(&self, word: &String) -> Option<String> {
    if (*self.start).as_ptr().is_null() {
        return None;
    }

    let mut current = self.start.clone();
    let mut prefix = String::from("");

    for c in word.chars() {
        if !current.borrow().transitions.contains_key(&c) {
            break;
        }

        let next = current.borrow().transitions.get(&c)?.clone();
        current = next;

        prefix.push(c);
    }

    Some(prefix)
}

  1. Hence the need for unsafe in that version ↩ī¸Ž

2 Likes

doesn't the cloning increment the Rc ref count ? will it decrement at the end of the scope/function ?

When the Rc is dropped the refcount will be decremented. The previous value of current will be dropped after a new value is assigned to current. So you shouldn't have any problems with unbalanced refcounts.

The final value in current will be dropped at the end of the function when the variable goes out of scope.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.