How to parse an mmapped file and present references into it?


#1

I’d like to present a nice API into a memory-mapped binary file. My wrapper will start by parsing the &[u8] from the mmap. The file has fields which are offsets into other parts of the file, etc. My code will have to validate the file (e.g. ensure that offsets are valid) and then present the nice wrapper on it. It starts looking like this:

pub struct MyWrapper {
    file: memmap::Mmap,
    some_blob: &[u8],  // which lifetime? read on
    things: Vec<Thing>, // build a cache of things found inside the file
}

struct Thing {
    some_parsed_value: i32,
    blob: &[u8], // read on
}

impl MyWrapper {
    pub fn new(file: memmap::Mmap) -> Result<MyWrapper, MyError> {
        // validate the contents of the file,
        // build up our self.things array of Things with references to blobs
    }

    pub fn get_some_blob(&self) -> &[u8] {
        self.some_blob
    }

    pub fn iter_things(&self) -> ... {
        self.things.iter() // or something
    }
}

What’s the strategy here to deal with the lifetimes of the little blobs inside the mmapped data? For self-referential structs I’ve read a bit about the rental crate and owning_ref, but it’s not clear to me how to structure my code around them.

Does anyone know of examples of similar things being done elsewhere? Thanks!


#2

You can replace the self-referencing slices with a custom struct that stores the slice components (i.e. start and end) and use it to produce slices on demand, such as from the get_some_blob() method. std::ops::Range<usize> would probably work as the type as well.
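A minimal sketch of the range-based idea, under some loud assumptions: the wrapper is generic over any B: AsRef<[u8]> as a stand-in for memmap::Mmap, and the file layout (a u32 little-endian offset followed by a u32 little-endian length in the first 8 bytes) is made up purely for illustration. MyError's variants are hypothetical too.

```rust
use std::ops::Range;

#[derive(Debug, PartialEq)]
pub enum MyError {
    TooShort,
    OutOfBounds,
}

/// Owns the backing bytes and stores a validated range instead of a slice,
/// so the struct needs no lifetime parameter.
pub struct MyWrapper<B: AsRef<[u8]>> {
    file: B,
    some_blob: Range<usize>,
}

impl<B: AsRef<[u8]>> MyWrapper<B> {
    pub fn new(file: B) -> Result<Self, MyError> {
        let some_blob = {
            let data = file.as_ref();
            // Hypothetical header: u32 LE offset, then u32 LE length.
            let header = data.get(..8).ok_or(MyError::TooShort)?;
            let start =
                u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
            let len =
                u32::from_le_bytes([header[4], header[5], header[6], header[7]]) as usize;
            let end = start.checked_add(len).ok_or(MyError::OutOfBounds)?;
            if end > data.len() {
                return Err(MyError::OutOfBounds);
            }
            start..end
        };
        // The borrow of `file` ended with the block above, so it can be moved.
        Ok(MyWrapper { file, some_blob })
    }

    /// Re-slices on demand; the range was bounds-checked in `new`.
    pub fn get_some_blob(&self) -> &[u8] {
        &self.file.as_ref()[self.some_blob.clone()]
    }
}
```

The slice is rebuilt on every call, which costs one bounds check. This relies on as_ref() returning the same bytes each time, which should hold for a read-only mapping.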

Alternatively, MyWrapper can serve simply to keep the mmap alive, and you return structs from it that contain references into the memory mapping. MyWrapper (or something else) would store the required offsets and lengths of the slices.
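Here is one way that second suggestion might look, again as a sketch rather than a definitive design. The record layout (i32 little-endian value, u32 little-endian length, then the blob bytes, repeated to the end of the file) is invented for the example, ThingEntry is a hypothetical helper, and B: AsRef<[u8]> stands in for memmap::Mmap.

```rust
/// Offsets recorded during validation; holds no borrows.
struct ThingEntry {
    some_parsed_value: i32,
    offset: usize,
    len: usize,
}

/// Borrowed view handed out to callers; it cannot outlive the wrapper.
pub struct Thing<'a> {
    pub some_parsed_value: i32,
    pub blob: &'a [u8],
}

pub struct MyWrapper<B: AsRef<[u8]>> {
    file: B,
    entries: Vec<ThingEntry>,
}

impl<B: AsRef<[u8]>> MyWrapper<B> {
    /// Validates every record up front; returns None on a malformed file.
    pub fn new(file: B) -> Option<Self> {
        let mut entries = Vec::new();
        {
            let data = file.as_ref();
            let mut pos = 0usize;
            while pos < data.len() {
                // Hypothetical record header: i32 LE value, u32 LE blob length.
                let h = data.get(pos..pos.checked_add(8)?)?;
                let value = i32::from_le_bytes([h[0], h[1], h[2], h[3]]);
                let len = u32::from_le_bytes([h[4], h[5], h[6], h[7]]) as usize;
                let offset = pos + 8;
                let end = offset.checked_add(len)?;
                if end > data.len() {
                    return None;
                }
                entries.push(ThingEntry { some_parsed_value: value, offset, len });
                pos = end;
            }
        }
        Some(MyWrapper { file, entries })
    }

    /// Builds a borrowed Thing from the stored offsets.
    pub fn thing(&self, index: usize) -> Option<Thing<'_>> {
        let e = self.entries.get(index)?;
        Some(Thing {
            some_parsed_value: e.some_parsed_value,
            blob: &self.file.as_ref()[e.offset..e.offset + e.len],
        })
    }
}
```

Only the small view structs carry a lifetime, so MyWrapper itself can be stored and moved around freely.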


#3

I would write it as fn new(file: &'a Mmap) -> Result<MyWrapper<'a>>, which ties the lifetime of the struct to the Mmap instance. An alternative approach is to make MyWrapper lifetime-free: instead of storing some_blob as a slice, store the indices of the beginning and end of the blob, and return an iterator type whose lifetime is tied to MyWrapper, i.e. fn iter_things(&'b self) -> MyIterator<'b>. You can see this approach in action in the rosbag crate.
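A rough sketch of the first variant, where the wrapper borrows the mapping instead of owning it. It takes a plain &'a [u8] rather than &'a Mmap (Mmap dereferences to [u8], so a reference to the mapping should coerce), returns Option instead of a Result for brevity, and assumes a made-up layout: a u32 little-endian length, that many bytes of some_blob, then records of i32 value, u32 length, and blob. split_checked is a hypothetical helper.

```rust
pub struct Thing<'a> {
    pub some_parsed_value: i32,
    pub blob: &'a [u8],
}

/// Borrows from the mapping; the caller keeps the Mmap alive for 'a.
pub struct MyWrapper<'a> {
    some_blob: &'a [u8],
    things: Vec<Thing<'a>>,
}

/// Like split_at, but returns None instead of panicking when out of bounds.
fn split_checked(data: &[u8], mid: usize) -> Option<(&[u8], &[u8])> {
    if mid <= data.len() {
        Some(data.split_at(mid))
    } else {
        None
    }
}

impl<'a> MyWrapper<'a> {
    pub fn new(data: &'a [u8]) -> Option<MyWrapper<'a>> {
        // Hypothetical layout: u32 LE length of some_blob, then the blob itself.
        let (l, rest) = split_checked(data, 4)?;
        let blob_len = u32::from_le_bytes([l[0], l[1], l[2], l[3]]) as usize;
        let (some_blob, mut rest) = split_checked(rest, blob_len)?;

        // Remaining bytes: records of i32 LE value, u32 LE length, blob bytes.
        let mut things = Vec::new();
        while !rest.is_empty() {
            let (h, tail) = split_checked(rest, 8)?;
            let value = i32::from_le_bytes([h[0], h[1], h[2], h[3]]);
            let len = u32::from_le_bytes([h[4], h[5], h[6], h[7]]) as usize;
            let (blob, tail) = split_checked(tail, len)?;
            things.push(Thing { some_parsed_value: value, blob });
            rest = tail;
        }
        Some(MyWrapper { some_blob, things })
    }

    pub fn get_some_blob(&self) -> &'a [u8] {
        self.some_blob
    }

    pub fn iter_things(&self) -> std::slice::Iter<'_, Thing<'a>> {
        self.things.iter()
    }
}
```

The trade-off is that the Mmap has to live somewhere outside the wrapper, so the two cannot be returned together from one function without a self-referential struct.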


#4

Thanks - I’m now investigating the bytes crate.