Improving efficiency of Entry API for my Btree implementation

The Entry API for BTreeMap returns an Entry struct. Currently I have this:

pub enum Entry<'a, K, V> {
    /// Vacant entry - map doesn't yet contain key.
    Vacant(VacantEntry<'a, K, V>),
    /// Occupied entry - map already contains key.
    Occupied(OccupiedEntry<'a, K, V>),
}

and

pub struct OccupiedEntry<'a, K, V> {
    map: &'a mut BTreeMap<K, V>,
    key: OccupiedEntryKey,
}

and

enum OccupiedEntryKey {
    First,
    Last,
    Some(Position),
}

and

struct Position {
    key_found: bool,
    leaf_full: bool,
    ix: PosVec,

and

type PosVec = smallvec::SmallVec<[u8; 8]>;

( All taken from lib.rs - source)

So here PosVec is used to relatively quickly find a place in the BTree using indexes. However it would be more efficient to store some kind of pointer to the Leaf. However I cannot store both a mutable reference to the whole BTreeMap and a mutable reference to the Leaf in the Entry, Rust's rules do not permit it ( correct me if I am wrong ). So what could I do instead, and still have sound code?

Maybe I can use two raw pointers, and use those to get mutable references as required. For example if OccupiedEntry::Remove is called, I can (efficiently) remove the (K,V) pair from the Leaf and then decrement the len field in BTreeMap, and the operation is complete. Would that be correct?

[ There are actually two cases, some (K,V) pairs are stored in non-Leaf structs, but I will ignore that here for simplicity ]

Note: the whole point here is to better understand what is allowed with raw pointers and what is not, how they can be used correctly.

The problem goes back to the instant/unconditional UB that would be caused by converting the raw pointers to mutable refs, such that the single ownership rule is violated. If we break that rule, the compiler isn't designed to produce sound results. I believe this is the sort of thing that will cause Miri to fail, or at least it did for me in similar circumstances.

1 Like

If the only reason to have a mutable reference to the root is to adjust the length, you could keep that as &mut usize instead, and an &mut Tree that points to the internal node which contains the key.

1 Like

It is a bit more complicated, it depends what methods of Entry are called, and also whether the Leaf ( or possibly NonLeaf ) node where the (K,V) pair is stored is full. If it is full, then "more complicated" things need to happen (the node needs to be split, and parent node updated etc), if the operation is Insert.

In the simplest case (and probably most common), the caller will typically just be given a mutable reference to the (K,V) pair (for an occupied Entry), and the mutable reference to the whole map is not needed at all.

I wonder if I only convert one of them at a time, if that is ok per the rules. So, for example, convert the one for the whole map to update the len, then (after dropping that mutable ref), convert the one to the Leaf node and do the Remove.

That can work, but you need to be careful to not invalidate any of the raw pointers during the construction of the others. I’m not too familiar with the exact rules but they go something like this:

  • You can keep pointers to both a parent/ancestor node and a child node at the same time, as long as the child pointer was created via the parent pointer that you’re storing.
  • Working with the child pointer doesn’t affect the parent pointer’s validity
  • Dereferencing the parent pointer invalidates the child pointer, meaning that future dereferences of the child pointer are UB

This generally mimics the semantics of &mut reborrowing, but you’re taking over the responsibility to check sequencing from the borrow checker.


You’ll need to do these operations in the other order, I think, because upgrading the root pointer to &mut will make it UB to upgrade the leaf pointer afterwards.

1 Like

Thanks! I am going to give it a try. Will probably post something later.

Well, I tried it, and it seems to work. Miri is not complaining. Here the code for VacantEntry::insert:

    /// Insert value into map returning reference to inserted value.
    pub fn insert(mut self, value: V) -> &'a mut V {
        if self.pos.leaf_has_space {
            let result = match self.pos.ptr {
                TreePtr::L(ptr, ix) => unsafe {
                    let x = &mut (*ptr).0;
                    x.insert(ix, (self.key, value));
                    &mut x[ix].1
                },
                _ => panic!(),
            };
            self.map.len += 1;
            result
        } else {
            &mut self.map.ins_pos(&mut self.pos, self.key, value).1
        }
    }

Code for OccupiedEntry::remove_entry:

    pub fn remove_entry(self) -> (K, V) {
        match &self.key {
            OccupiedEntryKey::Some(pos) => // self.map._remove_pos(pos),
            {
                let result = match pos.ptr {
                   TreePtr::L(ptr,ix) => unsafe { (*ptr).0.remove(ix) }
                   TreePtr::NL(ptr,ix) => unsafe { (*ptr).remove_at(ix) }
                   TreePtr::None => panic!()
                };
                self.map.len -= 1;
                result
            }
            OccupiedEntryKey::First => self.map.pop_first().unwrap(),
            OccupiedEntryKey::Last => self.map.pop_last().unwrap(),
        }
    }

Full source here

I wish I could say I am 100% sure this is sound, but really I am not sure at all.

Edit: I did some crude benchmarking, and it appears that my BTreeMap is now significantly faster than std::collections::BTreeMap for insertions using Entry API, which was a bit unexpected. I am getting fairly consistent times of 0.88 seconds versus 1.25 seconds. That is for a test of 100,000 insertions, repeated 100 times, inserting pair of usize,usize:

        for i in 0..n {
            t.entry(i).or_insert(i);
        }

( I think this is perhaps due to using a relatively large size for the leaf vec, I am using 39, I forget what the std BTreeMap uses, maybe as little as 5, so it needs to do more allocations. It also helps keep the tree shallow ( not so many levels ) )

It turns out miri IS complaining ( I had just failed to trigger it correctly ).

The test:

#[test]
fn is_this_ub_test()
{
    BTreeMap::new().entry(0).or_insert('a');
}

Miri output:

Preparing a sysroot for Miri (target: x86_64-pc-windows-msvc)... done
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running unittests src\lib.rs (target\miri\x86_64-pc-windows-msvc\debug\deps\btree_experiment-ba16e1da17f5b871.exe)

running 1 test
error: Undefined Behavior: trying to retag from <323007> for Unique permission at alloc119945[0x8], but that tag does not exist in the borrow stack for this location
    --> src\lib.rs:775:25
     |
775  |                 let x = &mut (*ptr).0;
     |                         ^^^^^^^^^^^^^
     |                         |
     |                         trying to retag from <323007> for Unique permission at alloc119945[0x8], but that tag does not exist in the borrow stack for this location
     |                         this error occurs as part of retag at alloc119945[0x8..0x18]
     |
     = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <323007> was created by a SharedReadWrite retag at offsets [0x8..0x18]
    --> src\lib.rs:1164:34
     |
1164 |             pos.ptr = TreePtr::L(self, i);
     |                                  ^^^^
help: <323007> was later invalidated at offsets [0x0..0x28] by a Unique retag
    --> src\lib.rs:120:22
     |
120  |                 map: self,
     |                      ^^^^
     = note: BACKTRACE (of the first span) on thread `is_this_ub_test`:
     = note: inside `VacantEntry::<'_, i32, char>::insert` at src\lib.rs:775:25: 775:38
note: inside `Entry::<'_, i32, char>::or_insert`
    --> src\lib.rs:656:33
     |
656  |             Entry::Vacant(e) => e.insert(value),
     |                                 ^^^^^^^^^^^^^^^
note: inside `is_this_ub_test`
    --> src\lib.rs:2106:5
     |
2106 |     BTreeMap::new().entry(0).or_insert('a');
     |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside closure
    --> src\lib.rs:2104:21
     |
2103 | #[test]
     | ------- in this procedural macro expansion
2104 | fn is_this_ub_test()
     |                     ^
     = note: this error originates in the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

The question now is whether it is a false positive or not, and if it is UB, is there a fix? I am way out of my depth here, I really don't understand this very well. I mean, I can see how there is an overlap of a mutable references, that is what is needed. I am trying to avoid re-computing the mutable reference to the leaf by storing a raw pointer to it. Umm... I guess I could try using a raw pointer for the map mutable reference. Does that make any difference?

[ Overall, it feels like the "rules" are stopping me from implementing the Entry API efficiently, unless I am mis-interpreting them. This seems "wrong" in some sense, I expect a systems programming language to be able to implement efficient algorithms. That said, the code is working just fine in practice, but there is some kind of uncertainty hanging over it. is this a valid use of unsafe / raw pointer, or not? ]

Hmm, so I had a look at what the std library is doing, and found this:

I experimented with it a bit, but got rather confused. The complications seem endless.

If I’m reading that Miri error correctly, the source of the problem that it’s detecting is that inside your entry types, you’re storing map as &mut instead of *mut and the move of self into map is invalidating any raw pointers that were created before that point.

This error should be fixable by changing the type of the map field and adjusting entry() to create that raw pointer first, and then use it instead of self to calculate the position. It’s entirely possible that doing this will reveal a problem somewhere else in the code, though— I’m a little out of my depth here.

Yes, because the optimizer is allowed a lot more leeway to change code involving &mut than *mut due to its stricter usage rules.

3 Likes

Oh wow, that seems to have done the trick. Miri is now happy. Amazing.

Code is here:

[ I don't understand PhantomData much yet, but compiler seems to demand it! ]

Very roughly, PhantomData<T> tells the compiler that, while you don't actually have a T in your structure, you want the compiler to apply all the restrictions it would apply to your structure if you did have a T in it.

In this case, you don't have a &'a mut BTreeMap<K, V>, and thus the compiler has nothing that restricts the lifetime 'a. By putting in a PhantomData<&'a mut BTreeMap<K, V>>, you tell the compiler that one of your other fields is being used as a disguised &'a mut BTreeMap<K, V>, and it should restrict 'a, K, V as-if you had &'a mut BTreeMap<K, V> in there.

3 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.