On-Disk B+ Tree


#1

I’m very new to Rust (written Hello World, that’s about it), but am an experienced programmer in C/C++, Java, and Python. I’m looking to create an on-disk B+ tree library. I’m struggling with the layout of the Node struct that will represent a Node on disk. I will really have 2 types of nodes: InternalNode and LeafNode. However, I think I can overload them by simply having the following:

enum Payload<V> {
        Value(V),
        Children([u64; 32]),
    }

struct Node<K: Ord, V> {
    key: K,
    parent: u64,
    payload: Payload<V>,
}

First, is overloading a field like this a good/smart idea? I’m going to need a way to tell the type of a node after I’ve read it off… thoughts on that?

Overall, does it seem like I’m heading in the right direction here? Critiques? Pointers? Thanks!


#2

I’m a beginner, too, but if you are interested in handling data types with rust, maybe you are interested on this tutorial:

http://cglab.ca/~abeinges/blah/too-many-lists/book/


#3

I think it is important to store the keys of the children in the parent node, otherwise you would have to read all child nodes to find out where to go next during search.

Nesting an enum inside a struct is perfectly OK though.


#4

@matklad good point! So I’d need to make my Children type something like: [(K, u64); 32].

As a slight aside, I had tried to make the number of children a constant: const NUM_CHILDREN: u32 = 32; but then I couldn’t use the NUM_CHILDREN value in the array definition. Thoughts on why that’s not possible?


#5

It is possible, though you’ll need to use usize type:

const N: usize = 32;

fn main() {
    let _ = [0; N];
}

#6

Right. With Rust it’s important to actually read what the error message is exactly saying :slight_smile:


#7

@matklad aah, OK. @leonardo the error is a bit mis-leading:

There was an error while evaluating the expression for the length of a fixed-
size array type.

Some examples of this error are:

// divide by zero in the length expression
const A: [u32; 1/0] = [];

// Rust currently will not evaluate the function `foo` at compile time
fn foo() -> usize { 12 }
const B: [u32; foo()] = [];

// it is an error to try to add `u8` and `f64`
use std::{f64, u8};
const C: [u32; u8::MAX + f64::EPSILON] = [];

#8

OK… still struggling. I might be making things too complex here. I’d like to read fix sized records from disk, and I’d like to leverage generics to properly type them when returned to the caller. However, these two are at odds because by definition, you don’t know the size of the generic. So instead I ask the caller to provide the size of K and V, but Rust is still unhappy about trying to transmute one into the other:

    pub fn new(keySize: usize, valueSize: usize) -> Result<BTree<K,V>, Box<Error>> {
        let file = try!(OpenOptions::new()
                                  .read(true)
                                  .write(true)
                                  .create(true)
                                  .open("btree.dat"));

        let totalSize = keySize +
                        size_of::<u64>() +
                        max(valueSize, (keySize+size_of::<u64>()) * NUM_CHILDREN);
        let mut buff = Vec::with_capacity(totalSize);
        file.read_exact(&mut buff);

        let root_node: Node<K,V> = unsafe { transmute(buff) };

        BTree{ fd: file, first: 0, root: root_node };
    }

I get the following error:

src/lib.rs:39:45: 39:54 error: transmute called with differently sized types: std::vec::Vec<u8> (192 bits) to Node<K, V> (size can vary because of K) [E0512]
src/lib.rs:39         let root_node: Node<K,V> = unsafe { transmute(buff) };
                                                          ^~~~~~~~~

I think I understand what it’s saying. It wants to make sure that buff and Node<K,V> are the same size before haphazardly converting one to the other, but it’s unsure of the size of Node<K,V> (though I also suspect the size of Vec<u8> isn’t what I really want).

Thoughts on how to elegantly accomplish what I’m trying to do? Read records from disk, but present them back to the user as Node<K,V>?

Edit: Full code if it’s helpful: https://github.com/wspeirs/btree/blob/master/src/lib.rs


#9

I don’t think that reading records via transmute is a good idea, and it’s actually nice that the compiler prevents you from doing so :slight_smile:

  • the format will not be compatible across different version of the compiler, operating system, CPU, etc (think little vs big endian, for example)

  • You can easily introduce memory unsafety if you accidentally roundtrip a structure with a pointer, like struct S { foo: Box<i32> }. Presumably the box will be written as a numeric pointer value, and on reading this back you’ll get a dangling pointer.

  • Without any data validation, you would be able to get invalid values back. That is, if you have enum E {A, B} and E::A and E::B are represented as 1 and 2, you could accidentally try to transmute 92 into E.

Here is an example of the last problem in python:

Python 3.4.5 (default, Jun 25 2016, 21:52:34) 
[GCC 5.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> x = bool.from_bytes(b'\2', 'big')
>>> isinstance(x, bool)
True
>>> x == True
False
>>> x == False
Falsey
>>> xuse
False
>>> 1 if x else 2
1
>>> 

I would say that the proper solution here is to use real binary serialization. I haven’t done this in Rust myself, but I think this should be build on top of serde ( https://github.com/TyOverby/bincode perhaps?)

But it looks like you are more keen to do this by hand! In that case, I would suggest looking into https://github.com/BurntSushi/byteorder and implementing encoding for standard types manually. You won’t be able to implement something derivable, like serde does, but a (macro) DSL for specifying serialization format per type should be doable.


#10

You probably don’t really want to transmute a Vec (a struct containing a pointer, length, and capacity) to a node.

If you transmute the buffer itself (&buff[0], of type &u8) to &Node<whatever> and then dereference that, it should work. (Rust does not have C-like strict aliasing rules.)

Also, you should mark the structs you’re writing to disk as #[repr(C)] for a better guarantee of layout consistency across compiler versions, as Rust does not currently define a stable ABI.


#11

@matklad I’m not looking to do this by hand, so thank you very much for the pointer to bincode!!! I also started looking at msgpack as they seem to have a Rust implementation: https://github.com/mneumann/rust-msgpack

@comex thanks for the #[repr(C)] pointer…


#12

If you need ideas, I wrote a Generalized Search Tree here which implements an RTree. It uses the same enum that you’ve chosen to use.


#13

Just saying that I have been thinking about the same problem, so I am very interested in what you are doing. In particular I wonder what the right approach would be to keep the in-memory tree and the on-disk tree work nicely together. Definitely you won’t read the whole thing and re-write it. I have starred the repo and I’ll be reading your code :wink: