Newbie learning how to deal with the borrow checker

Hi,

I'm new to rust. I've gone through most of the book and rust by example, and all of rustlings. For some hands-on practice, I've been doing problems from https://www.techiedelight.com/

Some of them are quite easy. Then I decided to up my rust abilities and try one that involved building a binary tree. I was proud of myself when I made a "class" that could add nodes and print the contents. But the problem (https://www.techiedelight.com/invert-alternate-levels-perfect-binary-tree/) basically requires me do to a level order search of a mutable tree.

Questions:

  1. Is there a way to do this procedurally? I'd be ok learning unsafe if necessary.
  2. I want a procedural implementation, but for fun, it would be interesting to also do a functional one if it isn't too difficult, but I can't think of how to do it (it's been years since I've written a large program in a functional language).

Some approaches attempted or I've thought of:

  • Do an in-place modification of the tree. I can't think of a way and keep the borrow-checker happy. The only way I can think of doing this is to have a mut ref to the nodes in a data structure queue/stack/vec/deque that is referenced in a loop. In my non-mut implementation (for printing the tree), I could use two vec's and it worked fine. For the mut version, it didn't like that. So I thought I would be clever and use a deque to keep everything in one place, but then it yelled at me when I referenced it one the second pass in the loop.
  • Do a "copy-constructor". This is probably the more "functional" solution to the problem. However, I can't think of a way to create a node in a given spot in the tree.

Ideas:

  • create a parent "pointer" to the parent node in the tree. Then I could easily walk the tree in a variety of ways. But, I don't know how to do this in Rust. Could I use an Rc and have essentially a doubly liked list between child and parent? I've heard rust Rc doesn't like loops.

Code:

// Solves: https://www.techiedelight.com/invert-alternate-levels-perfect-binary-tree/

// This is a fairly advanced Rust exercise (for me)

// Good reference for contstant tree data structures in Rust:
//   https://endler.dev/2017/boxes-and-trees/
// Good reference for dynamic data structures:
//   https://rust-unofficial.github.io/too-many-lists/

use std::mem;

struct BinTree {
    value:i32,
    left: Option<Box<BinTree>>,
    right: Option<Box<BinTree>>,
}

impl BinTree {

    //#![allow(dead_code)]
    fn new(value:i32) -> BinTree {
        BinTree {
            value: value,
            left: None,
            right: None, 
        }
    }

    // add a value to the BinTree.  The value will be added to the tree as a Binary Search Tree.
    // If all values are added using this method, then the tree will be a correct BST.
    fn add_sorted(&mut self, new_value:i32) {
        if new_value < self.value {
            // add left
            match self.left {
                Some(_) =>  {
                // https://rust-unofficial.github.io/too-many-lists/second-peek.html
                    self.left.as_mut().map(|node| {
                        node.add_sorted(new_value);
                    });
                }
                None => self.left = Some(Box::new(BinTree::new(new_value))),
            }
        } else {
            // add right
            match self.right {
                Some(_) => {
                    self.right.as_mut().map(|node| {
                        node.add_sorted(new_value);
                    });
                }
                None => self.right = Some(Box::new(BinTree::new(new_value))),
            }
        }
    }


    // print the tree in a nice level-order traversal.
    fn pretty_print(&self) {
        // TODO: implement Display? or the Debug display?
        let mut current_level = 0;
        println!("(note: nodes are not spaced and do not line up with levels)");
        print!("{}: ", current_level);
        for node in self {
            if node.level > current_level {
                current_level = node.level;
                print!("\n{}: ", node.level);
            }
            
            if node.has_left_child {
                print!("/");
            }
            print!("{}", node.value);
            if node.has_right_child {
                print!("\\");
            }
            print!(" ");
        }
        println!("");
    }

	//fn swap_children_nodes(&mut self) {
		//mem::swap(&mut self.left, &mut self.right);
	//}

	// WRONG implementation.  Doesn't solve problem correctly.
	//   But it compiles and runs well.
//	fn invert_alternating_levels(&mut self, level: u32) {
//		self.swap_children_nodes();
//
//		// Using Option to get a mutable reference so we can call
//		// our recursive function.
//		self.left.as_mut().map(|node| {
//			node.invert_alternating_levels(level+1);
//		});
//		self.right.as_mut().map(|node| {
//			node.invert_alternating_levels(level+1);
//		});
//	}

    // can I do a mutable level order traversal??  Attempt 5 or 6 now...
    // I think the trick is to put the data into a single data structure,
    // keep the data structure internal to the function, and use indexes
    // instead of iterators.  The indexes return Option, which allow
    // mutable access.
    #[deprecated()]
    #[allow(dead_code)]
    fn invert_alternating_levels(&mut self) {
        let mut _level = 0;

        let mut data: std::collections::VecDeque<&mut BinTree> = std::collections::VecDeque::new();
        let mut num_current: usize = 0;
        let mut num_next: usize = 0;

        data.push_front(self);
        num_current += 1;

        while num_current > 0 {
            // no iterators; this is intentional
            for i in (std::ops::Range {start:0, end: num_current}) {
                data[i].value = 1;
                data[i].left.as_mut().map(|_child| {

                    // Can't call due to borrow checker!!
                    //data.push_back(child);
                });

                unimplemented!();
            }

            while num_current > 0 {
                data.pop_front();
                unimplemented!();
            }

            num_current = num_next;
            num_next = 0;
            _level += 1;
        }
    }

    // attempt 6 or 7.  Do a non-mutable level order traversal just
    // like pretty_print() then return a new tree with the correct format.
    // I wish I could to a in-place change of the current tree, but I
    // couldn't figure out how.  This has the same O() runtime and only
    // a constant amount of extra memory, but it feels unnecessary.
    #[deprecated()]
    #[allow(dead_code)]
    fn create_tree_with_inverted_alternating_levels(&self) -> Self {
        let mut _level = 0;
        let mut current_level = Vec::new();
        let mut next_level = Vec::new();

        current_level.push(self);
        while current_level.len() > 0 {
            for node in current_level.iter() {
                // THE PROBLEM WITH THIS CODE:
                // I have no idea how to force the construction
                // of the new tree in the correct layout...

                if let Some(left_child) = &node.left {
                    next_level.push(&**left_child);
                }
                if let Some(right_child) = &node.right {
                    next_level.push(&**right_child);
                }
            }

            current_level.clear();
            mem::swap(&mut current_level, &mut next_level);
            _level += 1;
        }
        let b = BinTree::new(self.value);
        b
    }

}

#[allow(dead_code)]
#[derive(Debug)]
// This is the type currently returned by BinTree iterator.
// Maybe I should be returning actual BinTree, but this is fine for experimentation.
// A mut iterator should point to the actual thing.
struct LevelOrderSearcherData {
    level: u32,
    has_left_child: bool,
    has_right_child: bool,
    value: i32,
}

#[allow(dead_code)]
struct LevelOrderSearcher {
    data: Vec<LevelOrderSearcherData>,
}

#[allow(dead_code)]
impl LevelOrderSearcher {
    // perform level order search and put data into a vector which can be iterated through.
    fn new(root: &BinTree) -> Self {
        let mut data: Vec<LevelOrderSearcherData> = Vec::<LevelOrderSearcherData>::new();

        let mut level = 0;
        let mut current_level = Vec::new();
        let mut next_level = Vec::new();

        current_level.push(root);
        while current_level.len() > 0 {
            for node in current_level.iter() {
                
                let d = LevelOrderSearcherData {
                    level: level,
                    has_left_child: node.left.is_some(),
                    has_right_child: node.right.is_some(),
                    value: node.value,
                };
                data.push(d);

                if let Some(left_child) = &node.left {
                    next_level.push(&**left_child); // added * and & until compiler was happy.  I almost understand what is going on.
                }
                if let Some(right_child) = &node.right {
                    next_level.push(&**right_child);
                }
            }

            current_level.clear();
            mem::swap(&mut current_level, &mut next_level);
            level += 1;
        }

        Self {
            data: data,
        }
    }

}

impl IntoIterator for BinTree {
    type Item = LevelOrderSearcherData;
    type IntoIter = std::vec::IntoIter<Self::Item>;

    // iterates over the items, moving them into the new scope
    fn into_iter(self) -> Self::IntoIter {
        let level_order_searcher = LevelOrderSearcher::new(&self);
        level_order_searcher.data.into_iter()
    }
}

impl IntoIterator for &BinTree {
    type Item = LevelOrderSearcherData;
    type IntoIter = std::vec::IntoIter<Self::Item>;
    fn into_iter(self) -> Self::IntoIter {
        let level_order_searcher = LevelOrderSearcher::new(self);
        level_order_searcher.data.into_iter()
    }
}


fn main() {
    // create a perfect sorted tree
    let mut bin_tree = BinTree::new(8);
        bin_tree.add_sorted(4);
            bin_tree.add_sorted(2);
                bin_tree.add_sorted(1);
                bin_tree.add_sorted(3);
            bin_tree.add_sorted(6);
                bin_tree.add_sorted(5);
                bin_tree.add_sorted(7);
        bin_tree.add_sorted(12);
            bin_tree.add_sorted(10);
                bin_tree.add_sorted(9);
                bin_tree.add_sorted(11);
            bin_tree.add_sorted(14);
                bin_tree.add_sorted(13);
                bin_tree.add_sorted(15);

    bin_tree.pretty_print();

    println!("Hello, world!");
}

Generally linked lists and trees are among the most difficult data structures to implement in rust.

Which is one reason I picked this problem. : )

I've learned a lot so far in the process.

The https://rust-unofficial.github.io/too-many-lists/ link in your code pretty much contains the answers to all your questions. In particular, it covers the "functional" approach (if you mean what I think you mean, that's the "persistent queue" chapter), and and the "procedural" approach using unsafe. Generalizing the code from linked lists to trees should be straightforward.

In general, data structures are one of the very few kinds of code that simply aren't compatible with Rust's borrow checker. Which is fine. Part of the reason unsafe exists is that the various Rust language rules for enforcing memory safety are meant to be a tradeoff between allowing useful code, being simple enough to reason about in practice, and general enough to compose well with everything else in the language. Not all sound code can be proven sound at compile time anyway (must less would be worth re-running the proof every single time you wanted to compile it), and there's really no easy or general way to prove soundness of all the interesting data structures one might want to write in Rust (short of literally writing mathematical proofs in Coq or something).

I think all of the smaller questions in your post are covered by the above, so I'll only add one more thing:

I've heard rust Rc doesn't like loops.

Quite the opposite. Cyclic data structures are one of the primary reasons you might want an Rc.

2 Likes

Thanks. I took another look at the "Unsafe Queue" section and now I see how it relates to my implementation. Now I have a working implementation. Thank you for correcting my understanding of Rc, too. Maybe later I'll take a crack at making an Rc BinTree for practice too.

If anyone has any suggestions to improve my Rust for this code, please let me know.

    fn invert_alternating_levels(&mut self) {
        let mut level = 0;
        let mut current_level: Vec<*mut BinTree> = Vec::new();
        let mut next_level: Vec<*mut BinTree> = Vec::new();

        current_level.push(self);
        while current_level.len() > 0 {

            for i in (std::ops::Range{start: 0, end: current_level.len()}) {
                let i_ptr = current_level.get(i).unwrap();

                unsafe {
                    if ((level % 2) == 1) && (i < (current_level.len() / 2)) {
                        let swap_ptr = current_level.get(current_level.len() - 1 - i).unwrap();
                        let tmp_value = (*(*swap_ptr)).value;
                        (*(*swap_ptr)).value = (*(*i_ptr)).value;
                        (*(*i_ptr)).value = tmp_value;
                    }

                    (*(*i_ptr)).left.as_deref_mut().map(|child| {
                        next_level.push(child);
                    });
                    (*(*i_ptr)).right.as_deref_mut().map(|child| {
                        next_level.push(child);
                    });
                }
            }

            // Note: this is the only place where the len() of current_level is altered.
            current_level.clear();
            mem::swap(&mut current_level, &mut next_level);
            level += 1;
        }
    }

}

I very strongly disagree with this sentiment.

It's perfectly possible to write most useful data structures in entirely safe Rust, using the regular ownership and borrowing model, in a straightforward way, even as elegantly as it is possible to write other kind of Rust code.

Most of the problems pop up when one wants particular kinds of optimization that require clever trickery, owing to the internal invariants of a particular data structure. For example, one can easily write a doubly-linked list in Rust, provided you use Rc and Weak. If you insist on using the additional invariant that it's always a list, i.e. a node only ever points to the previous and next nodes, you can do it slightly more efficiently using unsafe, but it's not required. The same holds with regards to trees and general graph data structures as well.

In the light of this, let me provide an implementation (playground link) of OP's problem. Note @dv4512 that I have cleaned up the code:

  • I have removed many redundant, unidiomatic parts (e.g. checking if a pointer is Some immediately followed by a map which does the same)
  • I have simplified the pretty-printing code, so that it actually outputs the tree in a visually correct form, albeit from left to right (rather than top to bottom). This aids debugging.
  • I have removed everything except the solution of the problem, the pretty printing function, the construction of the tree, and the example in main().
  • I have renamed the types and functions so that they have more idiomatic names.
  • I have introduced a generic type parameter into the tree node struct, so you can abstract over any type in the future.

More importantly, my implementation contains zero unsafe , it has a time complexity of O(n) (it visits each node thrice) and a peak additional memory overhead of O(n) (the number of nodes in the last two levels times size_of::<&Node> , i.e., in practice a ~constant fraction of the memory required by the tree itself).

The insight is the following. When working with an algorithmic problem, the difficulty almost always arises because you are using the wrong kind of representation. The most elegant (and correct) implementations usually entail transforming the problem into a representation or "space" (so to say) where it's easy to solve, solving it there, and transforming the solution back. This is exactly what I have done here.

The problem as well as the solution requires operating on entire levels of the tree at once, which is quite an unnatural thing to do with a tree. The naturally-induced recursion structure on the tree is the depth-first traversal (ever wondered why DFS is so much simpler than BFS…?), and doing anything that breaks this pattern is going to cause pain. So, what I did consists of three steps:

  1. Transform the tree into a representation that favors explicit, flat levels over parent-child relationships. This required moving the nodes around, but that's not a big deal, because Option provides the take() method, so it's very natural to just remove child nodes from a parent node. This part of the algorithm is in the nodes_to_levels() function.
  2. Once the tree is gone, and we have the flattened level data structure, it's trivial to reverse the order of nodes in every 2nd level.
  3. When we have the levels in the correct shape, we can transform the nodes back to the actual tree structure. For this, we take the nodes of each level, and add them, pairwise, as children to the nodes in the previous level. This requires us to maintain a list of pointers to nodes in the previous level.

If you run the code in the playground, you can see the tree before and after the transformation. For the sake of sparing space, I only paste the two relevant functions here:

impl<T> Node<T> {
    fn invert_alternating_levels(&mut self) {
        let mut map = BTreeMap::new();
        self.nodes_to_levels(&mut map, 0);
        
        // Traverse each level.
        let mut prev_level = vec![self];
        
        for (level, mut nodes) in map {
            // Reverse every 2nd level
            if level % 2 == 0 {
                nodes.reverse();
            }

            // Transform back from flat level to parent-children relationships,
            // by observing that for a perfect binary tree, level N+1 contains
            // pairs of nodes which are the children of the respective nodes
            // at the previous level, level N, and the correspondence is pairwise.
            for (i, child) in nodes.into_iter().enumerate() {
                if i % 2 == 0 {
                    prev_level[i / 2].left = Some(child);
                } else {
                    prev_level[i / 2].right = Some(child);
                }
            }

            // Update `prev_level` so that its new value is an array of pointers
            // to the _children_ we just finished processing.
            prev_level = prev_level
                .into_iter()
                .flat_map(|node| vec![
                    node.left.as_deref_mut().expect("missing left child: imperfect BST"),
                    node.right.as_deref_mut().expect("missing right child: imperfect BST"),
                ])
                .collect();
        }
    }
    
    /// Collects the subtrees into a map of levels: for each level (integer),
    /// add, in the original left to right order, all nodes on that level.
    fn nodes_to_levels(&mut self, map: &mut BTreeMap<usize, Vec<Box<Node<T>>>>, current: usize) {
        // For both subtrees:
        // * recurse into subtrees of children, i.e. fill the levels from bottom up,
        //   while we have ownership of the child
        // * once that is done, we are ready to give up ownership of the child,
        //   adding it to the list representing the _current_ level
        if let Some(mut left) = self.left.take() {
            left.nodes_to_levels(map, current + 1);
            map.entry(current).or_insert_with(Vec::new).push(left);
        }
        if let Some(mut right) = self.right.take() {
            right.nodes_to_levels(map, current + 1);
            map.entry(current).or_insert_with(Vec::new).push(right);
        }
    }
}
9 Likes

Thank you. I learned a lot seeing an alternative approach to what I was working on.

  • Some(ref mut myvar) helps a lot. I couldn't figure out how to do that. Good to know.
  • Generics in rust don't look too difficult, so I didn't bother to mess with them in this example. However, having a base type in the implementation was handy so I could just assign a number to Node.value to experiment with the functions as I was writing them. I noticed with the generic version, I couldn't just assign an integer to Node.value because Rust didn't know how to do the type conversion.
  • Your pretty_print() is much prettier than mine. I was going for quick-n-dirty on that one, plus a chance to experiment BFS and iterators.
  • for _ in 1..level ok. I thought I had tried that before and it didn't like variables when constructing a range. Since then, I've been calling std::ops::range(). Obviously I had done some other kind mistake. Now I'll do this much more simple approach.
  • or_insert_with is wonderful! I don't know how many times I implemented that in python (maybe it is there and I never noticed).
  • You were more purposeful about when you were using refs in situations like if let Some(ref left) = self.left. In situations like that, I had done if let Some(left) = &self.left merely because that is what the compiler suggested.
  • I think the biggest take-away is your use of take(). I guess that is the difference between what you did and my many attempts. It seems to allow you to put all the nodes into a Vec<&mut Node> and loop and index against it.

I guess this shows how much I didn't understand ownership, but that was the reason I picked this exercise in the first place. The exercises and problems I had tried before this hadn't given me much trouble with the borrow-checker so I wanted something that would make me learn as much as possible.

I think one of the issues that gave me a bit of trouble was rust's helpfulness in inferring referencing and dereferencing. There were times when I really didn't know exactly what type I was dealing with (e.g. in from an iterator in a for loop). It might not be Rust idiomatic, but I will still be assigning types in my let statements because I like to see exactly what type I'm dealing with at all times. What I do like about rust, is that If I don't know what type to put into the let statement or return type of a function, I just put i32 and let the compiler correct me. : )

Regarding which solution is safer, the "unsafe" version, or the one you posted, I'm not convinced. Your code does a lot of indexing which is kind of a replacement for pointers in many ways. They both seem prone to error. In many languages, if the current datastructure isn't ideal to the problem, you can overlay another datastructure over the first to solve the problem without needing to destroy the connections already made by the current structure. At a minimum, I'd rather use your algorithm to create a copy of the tree in the new format. I suppose the difference between indexing and unsafe is that a problem with indexes could be caught and handled whereas misuse of a pointer could crash the program or reference memory elsewhere in the program. I've had to debug that in large C programs and it is a headache. However, in this program, the unsafe pointer feel very constrained and my C programmer background has trouble getting too worried about it. Maybe I need a slap on the wrist.

I am grateful for everyone's input!

Update to the unsafe version using iterators rather than indexing:

fn invert_alternating_levels(&mut self) {
    let mut level = 0;
    let mut current_level: Vec<*mut BinTree> = Vec::new();
    let mut next_level: Vec<*mut BinTree> = Vec::new();

    current_level.push(self);
    while current_level.len() > 0 {
        if level % 2 == 1 {
            for node_ptr in current_level.iter()
                            .skip(current_level.len()/2).rev()
                            .zip(current_level.iter()) {
                unsafe {
                    let tmpval = (*(*node_ptr.0)).value;
                    (*(*node_ptr.0)).value = (*(*node_ptr.1)).value;
                    (*(*node_ptr.1)).value = tmpval;
                }
            }
        }

        for node_ptr in current_level.iter() {
            unsafe {
                if let Some(child) = (*(*node_ptr)).left.as_deref_mut() {
                    next_level.push(child);
                }
                if let Some(child) = (*(*node_ptr)).right.as_deref_mut() {
                    next_level.push(child);
                }
            }
        }

        current_level.clear();
        mem::swap(&mut current_level, &mut next_level);

        level += 1;
    }
}

I usually use dict.setdefault() in Python, although that's not as powerful as Rust maps' Entry API.

Unfortunately, this recently-added feature affectionately known as "match ergonomics" (and officially called "default binding modes") conflates values and references badly. In my opinion, it doesn't make sense to match a reference and then act as if it were a non-reference type, then expect a reference to magically appear inside, which is exactly what the compiler accepts (and apparently suggests). The suggestion is probably there because the language team is heavily biased in favor of this feature, which I find very disturbing, and I personally refrain from any use of it. However, from the compiler's point of view, if let Some(left) = &self.left works equally.

That's unfortunate. If you haven't been convinced by this many people telling you that unsafe is really unsafe, and by a concrete implementation without unsafe, then I just can't help. The lack of unsafe guarantees memory safety, even in the presence of operations which may panic (e.g. indexing). But indexing is really a red herring here – I could as well have used .get() with if let, however the problem stated that the binary tree was perfect so if indexing goes wrong, that's a bug in the solution. Yet, if you don't believe this, then we can't prevent you from reaching for unsafe as a first resort even when it's unwarranted and non-idiomatic, and at this point, I'm personally not willing to try any more.

Isn't that what we are trying to avoid in the first place? This is somewhat theoretical and not necessarily related to this example. If we write complicated code to avoid using unsafe(), especially if it has the chance of messing up our datastructures, then I would think that at some point we would be better with a small unsafe section. At some point, there will be a bug in the solution. I am sure this has been debated ad nauseum on this forum, so there is probably no need to go down this rabbit hole here.

Perhaps I should have said "not entirely convinced."

You don't need to. You convinced me much more than you realize. (I've had to debug other people's C code and I'm quite familiar with what happens with bad pointers.) Old habits die hard and it takes time to get used to the idiomatic ways of any new language. I've only been playing with rust for a little while anyway. Besides, at this point I care more about attempting various ways of solving problems. I think my next task will be to take your solution and have it make a new tree rather than modify the current one.

Once again, thank you for your gracious and thorough replies!

1 Like

First, I'd like to congratulate you on this thread. I wish there were a way to mark it as recommended reading for all Rust nauplii (new Crustaceans) on their reveal of the wonders of Rust.

Relative to unsafe: If you are writing code only for your own isolated use, and it will never be exposed to an internal or external network, then there is no overwhelming risk in following your old C coding habits with pointer unsafety. However, if there is any possibility that your code 1) is important, and 2) could be accessible to an attacker even via multiple levels of indirection (think Stuxnet), then you need to avoid your tendency to revert to unsafe coding practices.

The Rust keyword unsafe has its uses, but should not be employed without full understanding of what it does, and more importantly does not, enable. unsafe tells the rustc compiler front-end that you are assuming some of the proof-of-correctness obligations of your code, in particular with respect to potential concurrent mutable access to data. It does not remove any of those "correctness" requirements. If you violate them, the compiler's highly-optimizing backend (e.g., LLVM) is free to compile code that does something other than what you expect, due to your having lied to it that you met all of its input constraints.

A potential attacker who reverse-engineers the actual compiler-produced code can observe the variance in what the code does from what you expected, then potentially exploit that divergence to "hack" your code. Those of us who work in the cybersecurity community understand this only too well. That's why, for many of us, the most important property of Rust code is that, in the absence of use of unsafe (or subtle compiler errors), such hackability is impossible.

5 Likes

To my mind, the point of writing safe code isn't to avoid bugs. Bugs are inevitable. There are certain kinds of buggy behavior (panics, memory leaks) about which Rust explicitly says "yep, that's fine". There are other more subtle bugs that Rust allows (like changing the hash of a value in a HashMap via internal mutability, or writing an asymmetric PartialEq impl) that are clearly bad but still can't lead to memory unsafety or undefined behavior.

What's special about UB is that it attacks your ability to find bugs, like a disease that attacks the immune system. Undefined behavior can have arbitrary, non-local and even non-causal effects that undermine the deterministic nature of programs. That's intolerable, and that's why it's so important that safe Rust rules out undefined behavior even if there are still classes of bugs that it doesn't eliminate.

I also think you are overestimating the complexity of the safe solution compared to the unsafe-using one; or, at least, you're not accounting for the difficulty of proving the unsafe solution correct because it's a cost you're used to paying. In most cases the difficulty of writing correct unsafe code is no more than that of convincing the compiler to accept a safe one, and in a lot of cases the unsafe route turns out to be harder. Of course, it is always easy to write unsafe code that may be wrong.

(Also I believe the program can be further simplified, but that is neither here nor there.)

5 Likes

For the sake of practice, I took H2CO3's example as a model and attempted to write my own as a 'copy-constructor' rather than an in-place solution. I have to admit that with my level of experience with rust, I never would have completed it if I hadn't known that it was possible by seeing with my own eyes. Even looking back and forth between my code and his there were quite a few times I didn't understand why my code wouldn't work. But I guess I got the experience with the borrow-checker I was looking for. : ) I intentionally tried to do approach the problem a bit differently than H2CO3 just so that I could use different constructs and not just blindly copy-and-paste. In the end there was a lot of similarity, which means I never would have been able to solve it without looking at his answer. I know I haven't done professional programming in a few years (I'm a bit 'rusty', ahem..), but as a fairly experienced programmer, I admit that got quite frustrating.

While H2CO3's solution was great for this problem, it bothered me when I thought about it for more general situations. I'm not real comfortable with the idea of taking large datastructures, throwing humpty-dumpty off the wall and putting him back together again just to change a little bit of him. But every language has advantages and disadvantages. I'm still waiting for the perfect programming language.

I also read through the Rustonomicon. Even if you never write unsafe code, it is a great reference to understanding a lot about ownership and lifetimes.

TomP: No, you don't want me writing unsafe code on a production system. At this point, I'm learning as a side-project for fun, but this is the time to get into good habits for the language I'm learning. Besides, in my case, if the problem I'm trying to solve is too difficult for me to do in rust, I can fall back to some other language. These days the most exciting thing I usually do is gen up some quick python to do something against some text files.

Here is my final working version based on H2CO3's solution. I do have one question in the code below, marked QUESTION, if anyone has a chance to take a peek. (full code in playground -- still newbie and experimentation code and does not include all of H2CO3's great suggestions)

// Depth-first-search; add nodes in left-to-right order
// map is (level->(vec<value>)
fn add_node_values_to_map(level: usize, root: &BinTree, map: &mut std::collections::BTreeMap<usize, Vec<i32>>) {
    map.entry(level).or_insert_with(Vec::new).push(root.value);
    if let Some(ref child) = root.left {
        BinTree::add_node_values_to_map(level+1, child, map);
    }
    if let Some(ref child) = root.right {
        BinTree::add_node_values_to_map(level+1, child, map);
    }
}

// If implemented with generics, requires T to have the Copy trait (trait bounds)
//     - T would be copied twice; once into map, once out of map.  For large T, this could be a problem.
// Much thanks to H2CO3
//   https://users.rust-lang.org/t/newbie-learning-how-to-deal-with-the-borrow-checker/40972/11
fn create_inverted_alternating_levels_bintree(& self) -> Result<Self, String> {

    // create a BST map of the values we want
    let mut tree_values: std::collections::BTreeMap<usize, Vec<i32>> =
            std::collections::BTreeMap::new();
    BinTree::add_node_values_to_map(0, self, &mut tree_values);

    let root_value:i32 = *(tree_values.get(&0).ok_or("Error indexing root level of tree")?
                            .get(0).ok_or("Error_indexing root value of tree")?);
    let mut root_node:BinTree = BinTree::new(root_value);

    let mut prev_level: Vec<&mut BinTree> = Vec::new();
    let mut cur_level: Vec<&mut BinTree> = Vec::new();
    prev_level.push(&mut root_node);

    for (level, mut node_values) in tree_values.into_iter().skip(1) {

        // Do some basic checking of tree size
        let prev_level_len = prev_level.len();
        let cur_level_len = node_values.len();
        if cur_level_len != (prev_level_len * 2) {
            return Err(format!("Level {} is size {}; previous level is size {}",
                    level, cur_level_len, prev_level_len));
        }

        // invert alternating levels
        if (level % 2) == 1 {
            // Note: this requires us to use either: iter_mut on tree_values
            // or use into_iter() and make node_values mut
            node_values.reverse();
        }

        // Transform back from flat level to parent-children relationships,
        // by observing that for a perfect binary tree, level N+1 contains
        // pairs of nodes which are the children of the respective nodes
        // at the previous level, level N, and the correspondence is pairwise.
        for (i, n_value) in node_values.into_iter().enumerate() {
            let node:BinTree = BinTree::new(n_value);
            let parent = prev_level.get_mut(i/2).ok_or(
                            format!("Error indexing parent for item {}, level {}", i, level))?;
            if (i % 2) == 0 {
                parent.left = Some(Box::new(node));
//                if let Some(ref mut child) = parent.left {
//                    cur_level.push(child);
//                }
                // QUESTION
                // What I don't understand in why I can't add to cur_level here
                // and even worse, when I uncomment the code, all the errors are
                // about prev_level.
            } else {
                parent.right = Some(Box::new(node));
            }
        }

        // Now, take ownership of prev_level and loop through it.
        // Move children of the previous level (i.e. the current level)
        // into cur_level.
        for prev in prev_level.drain(..) {
                                 // Use drain instead of into_iter()
                                 // This was one of my ah-ha moments in rust.
                                 // The compiler messages kept pointing me in
                                 // a different direction until I finally understood.
            if let Some(ref mut child) = prev.left {
                cur_level.push(child);
            }
            if let Some(ref mut child) = prev.right {
                cur_level.push(child);
            }
        }

        // Next iteration; next level
        //prev_level = Vec::new(); // prev_level is empty;  Realloc a new vector.
        mem::swap(&mut prev_level, &mut cur_level);
    }

    return Ok(root_node);
}

For the sake of learning, what issues do you see in this unsafe code? Obviously, as we established, there is a better way than using unsafe. I loop twice to use the safety bounds established by iterators. The pointers point to data whose lifetimes are longer than the function. The first issue I see is that if Generics were introduced instead of just using i32 as the type of value, then I as a newbie would not be certain of how the Copy trait functions to be certain of the safety of arbitrary types calling Copy. Also, The book states unsafe code should follow LLVM’s scoped noalias rules. So it appears (which it looks like I did) it is best to not make any changes to the tree using access to the self parameter so long as the pointers are in scope (or better yet, for the scope of the function) (or be extremely careful).

#[allow(dead_code)]
#[deprecated]
fn invert_alternating_levels_useingunsafe(&mut self) {
    let mut level = 0;
    let mut current_level: Vec<*mut BinTree> = Vec::new();
    let mut next_level: Vec<*mut BinTree> = Vec::new();

    current_level.push(self);
    while current_level.len() > 0 {
        if level % 2 == 1 {
            for node_ptr in current_level.iter()
                            .skip(current_level.len()/2).rev()
                            .zip(current_level.iter()) {
                unsafe {
                    let tmpval = (*(*node_ptr.0)).value;
                    (*(*node_ptr.0)).value = (*(*node_ptr.1)).value;
                    (*(*node_ptr.1)).value = tmpval;
                }
            }
        }

        for node_ptr in current_level.iter() {
            unsafe {
                if let Some(child) = (*(*node_ptr)).left.as_deref_mut() {
                    next_level.push(child);
                }
                if let Some(child) = (*(*node_ptr)).right.as_deref_mut() {
                    next_level.push(child);
                }
            }
        }

        current_level.clear();
        mem::swap(&mut current_level, &mut next_level);

        level += 1;
    }
}

I had a question in my last reply, but it was buried in the middle. I still have a question on how the borrow checker works. See the code below, and the section marked "QUESTION". (full code in playground -- still newbie and experimentation code and does not include all of H2CO3's great suggestions)

fn create_inverted_alternating_levels_bintree(& self) -> Result<Self, String> {

    // create a BST map of the values we want
    let mut tree_values: std::collections::BTreeMap<usize, Vec<i32>> =
            std::collections::BTreeMap::new();
    BinTree::add_node_values_to_map(0, self, &mut tree_values);

    let root_value:i32 = *(tree_values.get(&0).ok_or("Error indexing root level of tree")?
                            .get(0).ok_or("Error_indexing root value of tree")?);
    let mut root_node:BinTree = BinTree::new(root_value);

    let mut prev_level: Vec<&mut BinTree> = Vec::new();
    let mut cur_level: Vec<&mut BinTree> = Vec::new();
    prev_level.push(&mut root_node);

    for (level, mut node_values) in tree_values.into_iter().skip(1) {

        // Do some basic checking of tree size
        let prev_level_len = prev_level.len();
        let cur_level_len = node_values.len();
        if cur_level_len != (prev_level_len * 2) {
            return Err(format!("Level {} is size {}; previous level is size {}",
                    level, cur_level_len, prev_level_len));
        }

        // invert alternating levels
        if (level % 2) == 1 {
            // Note: this requires us to use either: iter_mut on tree_values
            // or use into_iter() and make node_values mut
            node_values.reverse();
        }

        // Transform back from flat level to parent-children relationships,
        // by observing that for a perfect binary tree, level N+1 contains
        // pairs of nodes which are the children of the respective nodes
        // at the previous level, level N, and the correspondence is pairwise.
        for (i, n_value) in node_values.into_iter().enumerate() {
            let node:BinTree = BinTree::new(n_value);
            let parent = prev_level.get_mut(i/2).ok_or(
                            format!("Error indexing parent for item {}, level {}", i, level))?;
            if (i % 2) == 0 {
                parent.left = Some(Box::new(node));
//                    if let Some(ref mut child) = parent.left {
//                        cur_level.push(child);
//                    }
                // QUESTION
                // What I don't understand in why I can't add to cur_level here
                // and even worse, when I uncomment the code, all the errors are
                // about prev_level.
            } else {
                parent.right = Some(Box::new(node));
            }
        }

That's the for loop in action. It can seem a bit counter-intuitive, but the key detail to remember is that the beginning of the loop body on the next iteration occurs after the end of the loop body on the previous iteration.

So, as a first-order approximation, if you borrow something at the end of the loop, and you store the borrow and re-use it at the beginning, then the compiler has to assume the value is borrowed even at the beginning of the body, because the borrow persists across iterations.

Specifically, what happens here is:

  • on iteration N, you borrow parent:

    if let Some(ref mut child) = parent.left {
        cur_level.push(child);
    }
    
  • But this borrows prev_level, too, because it points inside an element thereof:

    let parent = prev_level.get_mut(i/2).ok_or(…)?;
    
  • But then during iteration N + 1, i.e. later with regards to control flow (even if lexically earlier), you try to immutably borrow prev_level by taking its length:

    let prev_level_len = prev_level.len();
    

    while the previous borrow is still active only because you stored the resulting reference someplace that persists across iterations, in this case, the cur_level vector.

Actually, the compiler error explicitly tells you just that:

128 |                 let parent = prev_level.get_mut(i/2).ok_or(
    |                              ^^^^^^^^^^ mutable borrow starts here in previous iteration of loop

The compiler probably doesn't mention cur_level because of a "bad" (or, rather, unintuitive-in-the-presence-of-loops-and-complicated-variables) heuristic: I'm no expert with the specifics of rustc's error reporting, but I'm assuming that it tries to trace back the borrows as early as possible, based on something (maybe CFG analysis, maybe lexical things, I don't know). So, it traces back everything to the point of the borrow, instead of the storage, whereas the immediately-obvious culprit at a higher level is that the reference is stored across iterations. However, since the guilty storage, cur_level is technically not used in any invalid way (it just stores a bunch of references!), it doesn't get reported, and what remains is the double borrow of the prev_level -> parent -> child cycle of pointers.


By the way, note how my solution differs from yours in this regard. I only use a single vector that I completely recreate upon every iteration (based on pointers derived solely from its own contents), so there's no chance of ever trying to store spuriously-surviving mutable borrows anywhere else.

1 Like

Ok, Thanks! That helps a lot.

I think there could be some good rustlings problems based on that.

I don't blame the compiler for not being able to point to the exact problem. Your explanation makes sense as to why it said what it did. Perhaps the most impressive thing about Rust to me so far is how helpful the compiler error messages have been.

I notice your solution and muttered several curses as I was reading it. : ) I don't think I've ever seen anything like creating a vector from an iterator. Even if I had noticed the impl FromIter for Vec, I never would have thought of a reason of why it would be useful. I think I'm going to need to keep seeing other people's code to get a good exposure to some different rust-like coding styles to see some new solutions to problems.

I also think it would be nice if there was a documentation page that listed some of the "tricks" that help you when you have issues with the borrow checker. E.g. Option seems to allow you to do things you can't normally do (I think), std::mem::swap(), vec.split_mut(), your trick with the FromIter, etc. Or maybe someone already has a nice blog out there I haven't found yet.

I accidentally left out the end of my function in my last post. I think it does the same thing as your FromIter. For the practice, I tried to write the code without looking at yours (but I still ended up needing to reference yours for the algorithm):

        // Now, take ownership of prev_level and loop through it.  
        // Move children of the previous level (i.e. the current level)
        // into cur_level.
        for prev in prev_level.drain(..) {
                                 // Use drain instead of into_iter()
                                 // This was one of my ah-ha moments in rust.
                                 // The compiler messages kept pointing me in
                                 // a different direction until I finally understood.
            if let Some(ref mut child) = prev.left {
                cur_level.push(child);
            }
            if let Some(ref mut child) = prev.right {
                cur_level.push(child);
            }
        }

        // Next iteration; next level
        mem::swap(&mut prev_level, &mut cur_level);
    }

    return Ok(root_node);
}