Lifetime troubles with libgit2


#1

I’m attempting to use @alexcrichton’s libgit2 bindings as an introductory project, and my first goal was a classic depth-first traversal of a Git object tree. Since lifetimes in the library are tied in interesting ways to the root “Repository” object, I’m having a lot of trouble satisfying the borrow checker. The code I’m using is:

use git2::*;

// Iterates through all leaf nodes in a depth-first traversal.
pub struct TreeWalker<'t, 'r> {
    it_stack: Vec<TreeWithIter<'t>>, // Used as a stack for depth-first traversal
    repo : &'r Repository, // Needed for converting tree entries to objects
}

// For each subtree we're currently visiting, we need the subtree itself
// and an iterator to pass through it.
struct TreeWithIter<'t> {
    tree: Tree<'t>,
    it: TreeIter<'t>,
}


impl<'t, 'r> Iterator for TreeWalker<'t, 'r> {
    type Item = TreeEntry<'t>;

    fn next(&mut self) -> Option<TreeEntry<'t>> {
        // Perform a classic depth-first traversal of the tree.
        loop {
            let next;
            {
                // If we're out of iterators, we're done.
                let back_iterator = self.it_stack.last_mut();
                if back_iterator.is_none() { return None; }

                // Otherwise, grab the back one's next element
                let back_iterator = back_iterator.unwrap();
                next = back_iterator.it.next();
            }

            let next = match next {
                // If this iterator is empty, pop it.
                None => { self.it_stack.pop(); continue; },
                // Otherwise, continue with its element.
                Some(e) => e
            };

            if next.kind().unwrap() == ObjectType::Tree {
                // If it's a tree, push it onto the stack and start walking it.
                // HERE BE DRAGONS: TreeElement::to_object gives us an object
                // with the lifetime of the repo
                let subtree = match next.to_object(&self.repo)
                    .unwrap()
                    .into_tree() {
                    Ok(t) => t,
                    Err(_) => panic!("Couldn't unwrap TreeEntry to Tree")
                };

                self.it_stack.push(TreeWithIter{ tree: subtree, it: subtree.iter()});
                continue;
            }
            else {
                // Otherwise we can return it as usual.
                return Some(next);
            }
        }
    }
}

pub fn walk<'t, 'r>(t: Tree<'t>, r: &'r Repository) -> TreeWalker<'t, 'r> {
    TreeWalker{ it_stack: vec![TreeWithIter{ tree: t, it: t.iter()}], repo: r }
}

which yields the error:

src/tree.rs:47:42: 47:51 error: cannot infer an appropriate lifetime for lifetime parameter 'a in function call due to conflicting requirements [E0495]
src/tree.rs:47                 let subtree = match next.to_object(&self.repo)
                                                        ^~~~~~~~~
src/tree.rs:22:5: 62:6 help: consider using an explicit lifetime parameter as shown: fn next(&mut self) -> Option<TreeEntry<'r>>

What I think the compiler is telling me is that the returned tree entries must have the lifetime r since TreeEntry::to_object returns an object with the lifetime of the repository reference it’s given, but I’m not quite sure how to express that. I mostly understand how explicit lifetimes work, but they seem to be mixing in subtle and confusing ways here (or I’m just being thick).

Also, feel free to pick apart anything that isn’t idiomatic - this is my first non-trivial Rust project and I’m sure it has some blemishes.


#2

Unfortunately, your TreeWithIter cannot exist, because the TreeIter holds borrows to the Tree object, you have a potentially unbounded number of Tree objects with different lifetimes, and a single function activation can only create a finite number of lifetimes.

Generally, data structure operations using an explicit stack are painful in Rust, and often require subverting the borrow checker with mem::transmute; I recommend using a recursive depth-first traversal instead, it should be much easier (and given that you’re doing IO, the performance cost of recursion is probably negligible?)

Also, it would be clearer (and may have revealed the issue earlier, or maybe not) to use lifetime names consistent with the libgit2 documentation. The lifetime parameter of Tree in libgit2 is Tree<'repo>, so 'r not 't.


#3

I’m not sure I follow - while the sub Trees (and the associated TreeWithIters) will have different lifetimes, aren’t they well-expressed by moving them onto the stack at the start of their lives and popping them at the end? What part of this trips up the borrow checker? At the risk of gross oversimplifications, does the borrow checker “think” in terms of function inputs and outputs, and have more trouble reasoning with the transfer of lifetimes into and out of data structures?

Hmm, that would rule out creating an iterator, wouldn’t it? If I want to keep the traversal general-purpose, would the next best approach be calling some “visitor” function on every node?


#4

The part where you’re using a stack you manage yourself. The only stack the borrow checker understands is the stack of function calls; the borrow checker can’t see that it_stack is being managed as a stack, so it has to assume that all elements have the same lifetime which ends every time the variable is modified.

Yes. (Unless you use unsafe code, which might be a good idea for later…)

Yes.