Use struct instance to create default value in case of error

I'm making an asset library that's supposed to hold a tree of nodes. A node can be either an Asset or a Directory:

struct Node {
    name: String,
    data: Content,
}

enum Content {
    Directory(Vec<Node>),
    Asset(Rc<dyn Asset>, usize),
}

I have a create_dir_all method that's supposed to take a path and create all directories. The way I want to do that is to iterate over the components of the path, search for the corresponding node, and if it is found, move on to the next, if I get a NotFound error, create the node and then move on.
My first idea was to do this:

let mut current_node = &mut self.root;
for component in path.components() {
    let name = component.as_os_str().to_str().unwrap();

    current_node = current_node.get_child(name).or_else(|error| match error {
        ErrorKind::NotFound => {
            current_node.create_child(Node::directory(String::from(name)))
        }
        error => Err(error),
    })?;
}

I believe I understand why that doesn't work: the closure has to borrow current_node mutably at the time it's created, so the create_child call is a second mutable borrow.

However, I can't seem to find any way around that. Here's another attempt:

let mut current_node = &mut self.root;
for component in path.components() {
    let name = component.as_os_str().to_str().unwrap();

    current_node = match current_node.get_child(name) {
        Ok(child) => Ok(child),
        Err(error) => match error {
            ErrorKind::NotFound => {
                current_node.create_child(Node::directory(String::from(name)))
            }
            error => Err(error),
        },
    }?;
}

Here's my understanding of this one:
The lifetime of current_node inside the inner match is the same one that's assigned on the get_child call, so the create_child call will be a second mutable borrow.

While researching this, I found that the lifetime of the result will be tied to the lifetime of current_node. If that's the case, is it impossible to match the error and generate a fallback value without using an Rc<RefCell<Node>>? That seems like a very overkill solution to a simple problem which leads me to believe I'm missing something.

Even with a RefCell I run into the same problem. I cannot get this to work even though it feels like it should be a very straightforward thing to do. Do I have some giant mistake in how I structured the data and it just won't be possible no matter what I try? Granted I haven't had a lot of time to work on this lately, but I am completely stuck on it for a month now.

It looks like you've hit one of the big limitations of the current iteration of the borrow checker: it does not recognize a borrow being continued on one arm of a match but terminated on another. There are plans to fix this issue eventually, but in the meantime, it can only be worked around with further refactoring, third-party crates, or unsafe code. Perhaps the most trivial solution is to check twice: call get_child() and discard the reference it returns, call create_child() if the child is not found, then call get_child().unwrap() since you know that a child will be there. Is this sufficient for your code?

3 Likes

Thank you very much for this; it's the answer I needed. I'm happy I had at least a vague idea of what was going on and that it's a limitation of the borrow checker, not me being completely wrong.

Then again, even though I was looking for a workaround, I tunnel-visioned so hard I couldn't figure it out. Your suggestion works perfectly.

This issue drained my motivation for a while, so I'll keep the workaround for now and look into writing some unsafe code to avoid the double call in the future.

let mut current_node = &mut self.root;
        for component in path.components() {
            let name = component.as_os_str().to_str().unwrap();

            if let Err(error) = current_node.get_child(name) {
                match error {
                    ErrorKind::NotFound => {
                        current_node.create_child(Node::directory(String::from(name)))
                    }
                    error => Err(error),
                }?;
            }

            current_node = current_node.get_child(name)?;
        }

Here's one way you might could use unsafe to work around the issue and encapsulates the unsafe inside a single new method get_child_or_else

impl Node {
    // Using an explicit lifetime annotation 'a so we can constrain the reference passed to the closure to have the same lifetime.
    fn get_child_or_else<'a, F, O>(&'a mut self, name: &str, or: F) -> Result<&'a mut Node, O>
    where
        F: FnOnce(&'a mut Self, ErrorKind) -> O,
    {
        // "Forget" the borrow via a raw pointer so we can use self in the return
        let result: Result<*mut Node, ErrorKind> = match &mut self.data {
            Content::Directory(nodes) => match nodes.iter_mut().find(|n| n.name == name) {
                Some(node) => Ok(node),
                None => Err(ErrorKind::NotFound),
            },
            Content::Asset(..) | Content::Sample(..) => Err(ErrorKind::NotFound),
        };

        match result {
            Ok(ptr) => unsafe {
                // Reconstruct the safe reference from the raw pointer.
                // This depends on the lifetime of the return value in the function signature being tied to self in order to be sound.
                Ok(&mut *ptr)
            },
            // The borrow of the child doesn't exist here since the Ok branch wasn't taken, so we can use self freely in this branch.
            Err(err) => Err(or(self, err)),
        }
    }

    fn create_dir_all(&mut self, path: &Path) -> Result<(), ErrorKind> {
        let mut current_node = self;
        for component in path.components() {
            let name = component.as_os_str().to_str().unwrap();

            current_node = match current_node.get_child_or_else(name, |current, err| match err {
                ErrorKind::NotFound => current.create_child(Node::directory(String::from(name))),
                err => Err(err),
            }) {
                Ok(child) => child,
                Err(error) => error?,
            };
        }

        Ok(())
    }
}

It passes miri, and isn't doing anything too exotic so I think it's sound as written.

Here's a full playground

2 Likes

Thank you so much for putting the time into this! It's exactly what I had in mind and I'll work on implementing it now.