Splitting a mmap'd region into chunks

Hi,

I've been playing around with some ideas on chunking a memory mapped region and was wondering if one of my recent iterations makes any sense. I've posted some code up on the playground here (apologies for the length), but the general gist is:

  1. Create num mutable disjoint slices pointing into the mmap'd region, by offsetting the region's raw pointer.
  2. Pack these disjoint slices into Chunk structs, along with an Arc<Mmap> - this is to ensure that the slice won't outlive the mmap'd region.
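
For illustration, step 1 could look roughly like the sketch below. It is not the playground code: it assumes an ordinary `&mut [u8]` as a stand-in for the mapped region, and `split_disjoint` is a hypothetical name. The last slice absorbs any remainder when the length does not divide evenly.

```rust
use std::slice;

/// Splits `region` into `num` disjoint mutable slices by offsetting its base pointer.
/// Hypothetical helper; a plain byte slice stands in for the mmap'd region.
fn split_disjoint(region: &mut [u8], num: usize) -> Vec<&mut [u8]> {
    assert!(num > 0, "need at least one chunk");
    let total = region.len();
    let base = region.as_mut_ptr();
    let chunk_len = total / num;
    (0..num)
        .map(|i| {
            let offset = i * chunk_len;
            // The last chunk takes whatever is left over.
            let len = if i == num - 1 { total - offset } else { chunk_len };
            // SAFETY: `offset + len <= total`, the ranges [offset, offset + len)
            // are pairwise disjoint, and the returned slices borrow from `region`,
            // so `region` cannot be used while they are alive.
            unsafe { slice::from_raw_parts_mut(base.add(offset), len) }
        })
        .collect()
}
```

Because every returned slice covers a different byte range, writing through one of them never touches memory reachable through another.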

I'm wondering if:
a) This is safe to do (and sensible)
b) Supposing it is ok to do - would using a 'static reference for the slice in Chunk also work, since we have the mmap'd region along with it in an Arc? E.g. Chunk<'mmap> would become:

pub struct Chunk {
    // `slice` is a unique portion of `_mmap`.
    slice: &'static mut [u8],
    _mmap: Arc<Mmap>,
}

Any and all feedback is appreciated, and thanks in advance!

You probably want to ask whether it's sound. Code is safe if you don't write the unsafe keyword yourself. Code is sound if it does contain unsafe but can still be exposed as a safe API, because you've checked every possible case.

As for (b): actually, no. Once a &'static T or &'static mut T has been constructed, the T behind it must stay valid for the rest of the process's lifetime, so in this case you would never be allowed to munmap the region.

Generally it's not a good idea to store references or smart pointer types when you are unsafely juggling pointers to the value behind them. They come with a lot of rules that you have to satisfy manually in the unsafe context. Instead you can store raw pointers, the indirection type with the fewest guarantees and therefore the fewest restrictions, and produce the &mut [u8] in a DerefMut impl or in other methods. The Arc<Mmap> here is OK unless you're juggling the *mut Mmap itself.
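
A minimal sketch of that raw-pointer approach, under some loud assumptions: `Region` is a hypothetical stand-in for `Mmap` that owns a heap buffer rather than a real mapping (so the example needs no libc), and `split` is a hypothetical constructor. The references are only produced on demand in `Deref`/`DerefMut`.

```rust
use std::ops::{Deref, DerefMut};
use std::ptr::{self, NonNull};
use std::slice;
use std::sync::Arc;

/// Stand-in for `Mmap` (assumption): owns a heap buffer instead of a mapping.
pub struct Region {
    ptr: NonNull<u8>,
    len: usize,
}

impl Region {
    pub fn new(len: usize) -> Self {
        let raw: *mut [u8] = Box::into_raw(vec![0u8; len].into_boxed_slice());
        let ptr = NonNull::new(raw as *mut u8).expect("Box pointers are never null");
        Region { ptr, len }
    }
}

impl Drop for Region {
    fn drop(&mut self) {
        // SAFETY: `ptr`/`len` came from `Box::into_raw` in `new`, and every
        // `Chunk` holds an `Arc<Region>`, so no chunk is alive at this point.
        unsafe {
            drop(Box::from_raw(ptr::slice_from_raw_parts_mut(
                self.ptr.as_ptr(),
                self.len,
            )));
        }
    }
}

/// A chunk stores a raw pointer and a length, never a reference.
pub struct Chunk {
    ptr: NonNull<u8>,
    len: usize,
    _region: Arc<Region>,
}

/// Consumes the region and hands out `num` non-overlapping chunks.
pub fn split(region: Region, num: usize) -> Vec<Chunk> {
    assert!(num > 0, "need at least one chunk");
    let region = Arc::new(region);
    let chunk_len = region.len / num;
    (0..num)
        .map(|i| {
            let offset = i * chunk_len;
            let len = if i == num - 1 { region.len - offset } else { chunk_len };
            // SAFETY: `offset <= region.len`, so the pointer stays in bounds.
            let ptr = unsafe { NonNull::new_unchecked(region.ptr.as_ptr().add(offset)) };
            Chunk { ptr, len, _region: Arc::clone(&region) }
        })
        .collect()
}

impl Deref for Chunk {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        // SAFETY: the range is in bounds and kept alive by `_region`.
        unsafe { slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }
}

impl DerefMut for Chunk {
    fn deref_mut(&mut self) -> &mut [u8] {
        // SAFETY: as above, and no other chunk covers this range.
        unsafe { slice::from_raw_parts_mut(self.ptr.as_ptr(), self.len) }
    }
}
```

Taking the `Region` by value in `split` is a design choice in this sketch: it means no other handle to the buffer exists, so the chunks are the only way to reach the memory.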

And of course, never forget to run it under Miri if it contains any unsafe code! It may detect UB that human eyes missed.

# install nightly toolchain
rustup toolchain install nightly
# install miri component
rustup +nightly component add miri
# enable raw pointer tracking, assuming unix shell env
export MIRIFLAGS="-Zmiri-track-raw-pointers"
# run cargo test but on miri
cargo miri test

I believe this would be OK if you used raw pointers. There may still be trouble if you allow overlapping mutable access to the mmap, but that depends on the details.

Thank you for the replies!

@Hyeonu Ah yes, indeed it was unsoundness that I was concerned about. Regarding the 'static lifetime on the reference, iiuc having the Arc<Mmap> stored alongside it (where the Drop impl for Mmap calls munmap) would guarantee that the memory it points at remains valid at least as long as the reference does - does that sound correct?

And I didn't think of using miri, thanks for the suggestion! I'll rewrite the tests for a few different impls and see what miri thinks of my abominations.

Out of interest, and this is something @alice also touched upon - if I'm using raw pointers which I eventually turn into a &mut [u8] on demand (via slice::from_raw_parts_mut), why would that be different (with regards to rules I need to satisfy in the unsafe context) to storing the &mut [u8] slice in the struct itself?

But following both your suggestions, it sounds like I should go with something like the below (where all [addr, addr + len) memory regions are disjoint), thanks for the help!

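use std::{ptr::NonNull, slice, sync::Arc};

/// A non-overlapping chunk of mapped memory.
struct Chunk {
    addr: NonNull<libc::c_void>,
    len: usize,
    // Keeps the mapping alive for as long as this chunk exists.
    mmap: Arc<Mmap>,
}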

impl Chunk {
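    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: relies on `addr`/`len` lying inside the mapping held by `mmap`.
        unsafe { slice::from_raw_parts(self.addr.as_ptr() as *const u8, self.len) }
    }

    pub fn as_slice_mut(&mut self) -> &mut [u8] {
        // SAFETY: as above, and relies on no other Chunk covering this range.
        unsafe { slice::from_raw_parts_mut(self.addr.as_ptr() as *mut u8, self.len) }
    }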
}

/// A region of mapped memory.
pub struct Mmap {
    addr: NonNull<libc::c_void>,
    len: usize,
}

impl Drop for Mmap {
    fn drop(&mut self) {
        unsafe {
            libc::munmap(self.addr.as_ptr(), self.len);
        }
    }
}

The main rule you need to follow when you turn a raw pointer into a &mut [u8] is this: between any two uses of that mutable reference, nothing other than that mutable reference may access the region of memory it covers. Creating the mutable reference counts as a use, but destroying it does not.

Hence, if you ever end up with two Chunk objects that could access the same region of the mmap, you are in trouble because they could each call as_slice_mut and violate the above rule via them.
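
A small sketch of the rule, assuming a plain byte buffer rather than an mmap; `sequential_uses` is a hypothetical name. Two mutable slices over the same memory are created one after the other, which is fine as long as the first is never used again once the second exists.

```rust
use std::slice;

/// Writes through two mutable slices derived from the same raw pointer,
/// strictly one after the other.
fn sequential_uses(buf: &mut [u8]) {
    assert!(!buf.is_empty(), "need at least one byte");
    let len = buf.len();
    let ptr = buf.as_mut_ptr();

    // First mutable reference: its creation and the write are both uses.
    let first = unsafe { slice::from_raw_parts_mut(ptr, len) };
    first[0] = 1;

    // A second reference to the same memory is fine here, because `first`
    // is never used after this point.
    let second = unsafe { slice::from_raw_parts_mut(ptr, len) };
    second[0] += 1;

    // Using `first` again now would place an access made through `second`
    // between two uses of `first`, which breaks the rule:
    // first[0] = 3; // would be undefined behaviour
}
```

The commented-out line still compiles if enabled, since the lifetimes produced by `slice::from_raw_parts_mut` are unconstrained; this is the kind of mistake Miri is meant to catch.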

Gotcha, that makes sense. Thank you for clarifying
