Iterating over my BTreeMap

Ok. So to summarise the situation ( as I understand it ):

(1) Implementing everything to completely mimic std:collections::BTreeMap without using unsafe is going to be somewhat tricky, but may be possible, with quite a lot of code.

(2) My simple unsafe scheme ( get_ix_mut ), although it "works to some extent" (appears to run ok without segfaults on real machines, doesn't have any data races exactly) is not sound, in that it has aliased mutable references, which break on the abstract machine, and Miri is unhappy. I don't know if it can be fixed, but apparently not. It also isn't all that efficient.

My thought at the moment is not to bother with implementing Iterator at all, as I don't need it and I think it is pretty clear it would be inefficient compared to simply walking the tree recursively calling a closure, which is simple and works fine.

[ I just did some benchmarking, and my experimental BTree seems to be about 20% faster than std::collections::BTreeMap in a test that uses it fairly intensively, Likely because it is walking with a closure rather than iterating, although it could be for some other reason I guess ]

After a bit of struggle ( I am getting old! ) I think I got the DoubleEndedIterator working. The crucial code:

impl<'a, K, V> Iterator for IterNonLeafMut<'a, K, V> {
    type Item = (&'a mut K, &'a mut V);
    fn next(&mut self) -> Option<Self::Item> {
        if let Some(x) = self.current.next() {
            return Some(x);
        }

        if let Some(&mut (ref mut k, ref mut v)) = self.v.next() {
            if let Some(next) = self.c.next() {
                self.current = next.iter_mut();
            }
            Some((k, v))
        } else {
            self.current_back.next()
        }
    }
}

I couldn't get my head around how to get the stack version to work though. But the above is ok!

1 Like

I finally managed to implement the stack version with DoubleEnded iteration. The way I am thinking about it is the forward iteration has to "steal" stuff from the backward iteration ( and vice-versa of course ). There are two stacks. Code ( full code here ):

struct Stk<'a, K, V> {
    v: std::slice::Iter<'a, (K, V)>,
    c: std::slice::Iter<'a, Tree<K, V>>,
}

enum StealResult<'a, K, V> {
    Value((&'a K, &'a V)),
    Child(&'a Tree<K, V>),
    Nothing,
}

/// Iterator returned by [BTreeMap::iter], [BTreeMap::range].
pub struct Iter<'a, K, V> {
    fwd_leaf: Option<IterLeaf<'a, K, V>>,
    bck_leaf: Option<IterLeaf<'a, K, V>>,
    fwd_stk: StkVec<'a, K, V>,
    bck_stk: StkVec<'a, K, V>,
}
impl<'a, K, V> Iter<'a, K, V> {
    fn new() -> Self {
        Self {
            fwd_leaf: None,
            bck_leaf: None,
            fwd_stk: StkVec::new(),
            bck_stk: StkVec::new(),
        }
    }
    fn push_tree(&mut self, tree: &'a Tree<K, V>, both: bool) {
        match tree {
            Tree::L(x) => {
                self.fwd_leaf = Some(x.iter());
            }
            Tree::NL(x) => {
                let (v, mut c) = (x.v.iter(), x.c.iter());
                let child = c.next();
                let child_back = if both { c.next_back() } else { None };
                let both = both && child_back.is_none();
                self.fwd_stk.push(Stk { v, c });
                if let Some(child) = child {
                    self.push_tree(child, both);
                }
                if let Some(child_back) = child_back {
                    self.push_tree_back(child_back);
                }
            }
        }
    }
    fn push_range<T, R>(&mut self, tree: &'a Tree<K, V>, range: &R, both: bool)
    where
        T: Ord + ?Sized,
        K: Borrow<T> + Ord,
        R: RangeBounds<T>,
    {
        match tree {
            Tree::L(leaf) => {
                let (x, y) = leaf.get_xy(range);
                self.fwd_leaf = Some(IterLeaf(leaf.0[x..y].iter()));
            }
            Tree::NL(t) => {
                let (x, y) = t.get_xy(range);
                let (v, mut c) = (t.v[x..y].iter(), t.c[x..y + 1].iter());

                let child = c.next();
                let child_back = if both { c.next_back() } else { None };
                let both = both && child_back.is_none();

                self.fwd_stk.push(Stk { v, c });
                if let Some(child) = child {
                    self.push_range(child, range, both);
                }
                if let Some(child_back) = child_back {
                    self.push_range_back(child_back, range);
                }
            }
        }
    }
    fn push_range_back<T, R>(&mut self, tree: &'a Tree<K, V>, range: &R)
    where
        T: Ord + ?Sized,
        K: Borrow<T> + Ord,
        R: RangeBounds<T>,
    {
        match tree {
            Tree::L(leaf) => {
                let (x, y) = leaf.get_xy(range);
                self.bck_leaf = Some(IterLeaf(leaf.0[x..y].iter()));
            }
            Tree::NL(t) => {
                let (x, y) = t.get_xy(range);
                let (v, mut c) = (t.v[x..y].iter(), t.c[x..y + 1].iter());
                let child_back = c.next_back();
                self.bck_stk.push(Stk { v, c });
                if let Some(child_back) = child_back {
                    self.push_range_back(child_back, range);
                }
            }
        }
    }
    fn push_tree_back(&mut self, tree: &'a Tree<K, V>) {
        match tree {
            Tree::L(x) => {
                self.bck_leaf = Some(x.iter());
            }
            Tree::NL(x) => {
                let (v, mut c) = (x.v.iter(), x.c.iter());
                let child_back = c.next_back();
                self.bck_stk.push(Stk { v, c });
                if let Some(child_back) = child_back {
                    self.push_tree_back(child_back);
                }
            }
        }
    }
    fn steal_bck(&mut self) -> StealResult<'a, K, V> {
        for s in self.bck_stk.iter_mut() {
            if s.v.len() > s.c.len() {
                if let Some(kv) = s.v.next() {
                    return StealResult::Value((&kv.0, &kv.1));
                } else {
                    panic!()
                }
            } else if let Some(child) = s.c.next() {
                return StealResult::Child(child);
            }
        }
        StealResult::Nothing
    }
    fn steal_fwd(&mut self) -> StealResult<'a, K, V> {
        for s in self.fwd_stk.iter_mut() {
            if s.v.len() > s.c.len() {
                if let Some(kv) = s.v.next_back() {
                    return StealResult::Value((&kv.0, &kv.1));
                } else {
                    panic!()
                }
            } else if let Some(child) = s.c.next_back() {
                return StealResult::Child(child);
            }
        }
        StealResult::Nothing
    }
}
impl<'a, K, V> Iterator for Iter<'a, K, V> {
    type Item = (&'a K, &'a V);
    fn next(&mut self) -> Option<Self::Item> {
        loop {
            if let Some(f) = &mut self.fwd_leaf {
                if let Some(x) = f.next() {
                    return Some(x);
                } else {
                    self.fwd_leaf = None;
                }
            } else if let Some(s) = self.fwd_stk.last_mut() {
                if let Some(kv) = s.v.next() {
                    if let Some(child) = s.c.next() {
                        self.push_tree(child, false);
                    }
                    return Some((&kv.0, &kv.1));
                } else {
                    self.fwd_stk.pop();
                }
            } else {
                match self.steal_bck() {
                    StealResult::Value(v) => {
                        return Some(v);
                    }
                    StealResult::Child(c) => {
                        self.push_tree(c, false);
                    }
                    StealResult::Nothing => {
                        if let Some(f) = &mut self.bck_leaf {
                            if let Some(x) = f.next() {
                                return Some(x);
                            } else {
                                self.bck_leaf = None;
                                return None;
                            }
                        } else {
                            return None;
                        }
                    }
                }
            }
        }
    }
}
impl<'a, K, V> DoubleEndedIterator for Iter<'a, K, V> {
    fn next_back(&mut self) -> Option<Self::Item> {
        loop {
            if let Some(f) = &mut self.bck_leaf {
                if let Some(x) = f.next_back() {
                    return Some(x);
                } else {
                    self.bck_leaf = None;
                }
            } else if let Some(s) = self.bck_stk.last_mut() {
                if let Some(kv) = s.v.next_back() {
                    if let Some(child) = s.c.next_back() {
                        self.push_tree_back(child);
                    }
                    return Some((&kv.0, &kv.1));
                } else {
                    self.bck_stk.pop();
                }
            } else {
                match self.steal_fwd() {
                    StealResult::Value(v) => {
                        return Some(v);
                    }
                    StealResult::Child(c) => {
                        self.push_tree_back(c);
                    }
                    StealResult::Nothing => {
                        if let Some(f) = &mut self.fwd_leaf {
                            if let Some(x) = f.next_back() {
                                return Some(x);
                            } else {
                                self.fwd_leaf = None;
                                return None;
                            }
                        } else {
                            return None;
                        }
                    }
                }
            }
        }
    }
}
impl<'a, K, V> FusedIterator for Iter<'a, K, V> {}

Incidentally, the performance seems to be quite good, about 25% faster than std::collections::BTreeMap on a simple iteration test, although such tests should always be taken with a huge pinch of salt! Whereas the Box version was much slower, as each call to next was O(log N) rather than O(1) (amortised) where N is the number of key-value pairs stored in the map. Also, this version doesn't need any heap allocations to iterate as it uses smallvec crate for the stack vecs.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.