Mem layout of structure -> &[u8] with 0 copying

Suppose we have a memory-mapped file. This is basically a *const u8.

Now, as long as things are aligned, it is straightforward to convert an i32 / u32 / f32 / i64 / u64 / f64 in the file, in a 0-copying manner, to a corresponding &i32, &u32, ...
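For illustration, here is a minimal sketch of that conversion for u32. The function name view_u32 is made up for this example, and the mapped region is assumed to already be available as a &[u8]. The value is interpreted in the machine's native byte order, so a file written on a machine with a different endianness would need extra handling.

```rust
use std::mem::{align_of, size_of};

/// Returns a `&u32` pointing directly into `bytes` at `offset` (no copy),
/// or `None` if the range is out of bounds or the address is misaligned.
/// `view_u32` is a hypothetical helper name, not part of any library.
fn view_u32(bytes: &[u8], offset: usize) -> Option<&u32> {
    let end = offset.checked_add(size_of::<u32>())?;
    let chunk = bytes.get(offset..end)?; // bounds check
    let ptr = chunk.as_ptr();
    if (ptr as usize) % align_of::<u32>() != 0 {
        return None; // a misaligned `&u32` would be undefined behavior
    }
    // SAFETY: `chunk` is exactly 4 readable bytes, the pointer is aligned
    // for u32, every bit pattern is a valid u32, and the returned reference
    // borrows from `bytes`, so it cannot outlive the underlying buffer.
    Some(unsafe { &*(ptr as *const u32) })
}
```

The same pattern applies to the other fixed-size numeric types, since every bit pattern is a valid value for them.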

Question: Is there a way to store a &[u8] in a memory-mapped file so that we can get a &[u8] back in a 0-copying manner?

Here by 0-copying, I really mean O(1) memory copying, where it's fine if we create some structure but I don't want to have to make a copy of the entire &[u8].

First, a small terminology fix: you can’t usefully store any reference &... inside a file, because a reference contains the memory address of its target. That address changes between program executions, so when you load the file again the old address could point at anything.


What you’re talking about is treating a portion of the memory mapped file as a slice [u8] and taking a reference to it. This is certainly possible, but will depend on the details of the API that you’re using to memory map the file. Assuming it gives you a raw pointer *mut u8, you can convert that into a mutable slice reference &mut [u8] via slice::from_raw_parts_mut. All you need to tell it is a pointer to the first element and the number of elements in the slice.

To generate a shared reference &[u8], there’s a corresponding function from_raw_parts.
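A minimal sketch of the shared-reference case, assuming the mapping API hands back a base pointer and a total length. The function mapped_bytes and its parameters are hypothetical names for this example; only slice::from_raw_parts comes from the standard library.

```rust
use std::slice;

/// Borrows `len` bytes starting `offset` bytes into a mapping, without
/// copying them. `mapped_bytes` is a hypothetical helper for illustration.
///
/// # Safety
/// `base` must point to `map_len` readable bytes that stay mapped and are
/// not modified for the whole lifetime `'a` of the returned slice.
unsafe fn mapped_bytes<'a>(
    base: *const u8,
    map_len: usize,
    offset: usize,
    len: usize,
) -> Option<&'a [u8]> {
    // Reject ranges that would run past the end of the mapping.
    let end = offset.checked_add(len)?;
    if end > map_len {
        return None;
    }
    // Only pointer arithmetic happens here; no bytes are copied.
    Some(unsafe { slice::from_raw_parts(base.add(offset), len) })
}
```

The lifetime 'a is chosen by the caller, so in practice it is worth tying it to whatever object owns the mapping, so that the slice cannot outlive the unmap.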

Thanks, you're absolutely right, my terminology above was inaccurate.

Inside the memory-mapped file, I want to store a contiguous block of u8's. Then, I want to be able to treat this block of u8's as a &[u8], without making a copy of it, which your recommendation of slice::from_raw_parts solves. Thanks!
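One possible way to lay out such a block, sketched here under an assumed format that is not from the thread: an 8-byte little-endian length followed by the payload. The function read_block is a hypothetical name, and the whole mapping is assumed to already be viewed as a &[u8]. Only the 8-byte header is copied; the payload is borrowed in place.

```rust
/// Reads a block stored as [u64 little-endian length][payload bytes]
/// starting at `offset`. This layout is an assumption for illustration.
/// Returns a sub-slice of `map`, so the payload itself is never copied.
fn read_block(map: &[u8], offset: usize) -> Option<&[u8]> {
    let header_end = offset.checked_add(8)?;
    // Copy just the 8 header bytes (O(1)) to decode the length.
    let mut header = [0u8; 8];
    header.copy_from_slice(map.get(offset..header_end)?);
    let len = u64::from_le_bytes(header);
    if len > usize::MAX as u64 {
        return None; // length does not fit in this platform's usize
    }
    let payload_end = header_end.checked_add(len as usize)?;
    // `get` returns None if the stored length runs past the end of the file.
    map.get(header_end..payload_end)
}
```

Because a length read from a file is untrusted input, every range is bounds-checked rather than passed straight to from_raw_parts.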

One slight warning here: if the OS updates your mapping while you hold a reference (vs. a raw pointer), you’ve entered UB-land, and there’s no way to reason about whether the new or old value is used in any given calculation, among other things. It may be safer to do the copy, which will likely just update entries in the process’s virtual memory table and not actually copy any bytes in main memory.

Is the zerocopy crate able to help here? It gives you a memory-safe way of transmuting a struct to a byte array.

Ryan Levick also did a stream on transmute (which is pretty much what you're trying to achieve) which explores the topic in a lot of detail.

Concretely, something like:

  1. prog db-reader mmaps "main.db" read only
  2. prog db-writer mmaps "main.db" read-write
  3. prog db-reader generates a &[u8] pointing to somewhere inside main.db
  4. prog db-writer writes to same location
  5. prog db-reader is now in UB

?
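A sketch of the "do the copy" alternative for a scenario like this: the reader keeps only a raw pointer to the mapping and copies the bytes it needs into an owned Vec, so no &[u8] into the shared file outlives the call. The function snapshot is a hypothetical name. This does not by itself make a racing write well-defined; the two programs would still need some agreed synchronization (a file lock, for instance) around the copy.

```rust
use std::ptr;

/// Copies `len` bytes starting at `base + offset` into an owned buffer.
/// `snapshot` is a hypothetical helper for illustration.
///
/// # Safety
/// `base + offset .. base + offset + len` must lie inside a live mapping,
/// and no other process may be writing to that range during the copy.
unsafe fn snapshot(base: *const u8, offset: usize, len: usize) -> Vec<u8> {
    let mut out: Vec<u8> = Vec::with_capacity(len);
    unsafe {
        // Raw-pointer copy: no reference into the mapping is ever created.
        ptr::copy_nonoverlapping(base.add(offset), out.as_mut_ptr(), len);
        // All `len` bytes were just initialized by the copy above.
        out.set_len(len);
    }
    out
}
```

The returned Vec is independent of the file, so later writes by another program cannot affect it, at the cost of an O(len) copy.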

Thanks. I did not consider this until now.

@Michael-F-Bryan: the zerocopy crate looks very cool; thanks! Given the potential mmap UB that @2e71828 just pointed out, I may just go with zerocopy instead. Thanks!
