Soundly representing simultaneously mutable non-overlapping sub-slices of the same slice

I've realized that my understanding of the rules for mutating data that exists behind a shared reference is very confused when it comes to slices. I believe my confusion stems from thinking of the slice as an object in itself, vs thinking of its entries as objects in themselves, vs thinking of various sub-slices as objects in themselves (vs all three things).

To this end, I hacked together a small example to test my understanding, and according to MIRI something is definitely not right. Maybe someone can point me in the right direction.

In the code below, Foo<T> is just an owned contiguous slice of data. A FooMut<T> represents a mutable sub-slice of a Foo<T> (the mutability is not the point here, in real life there'd be both a mutable and a non-mutable version, but I'm trying to limit the amount of code in the example). Creation of a FooMut<T> directly from a Foo<T> using Foo::mut_view is governed by normal ownership rules, so only a single one can exist at any time. Nothing strange there. Instead of creating a FooMut<T> directly from a Foo<T>, one can alternatively create a DynMut<T> using Foo::dyn_mut. A DynMut<T> allows handing out multiple FooMut<T> to the same Foo<T> as long as the sub-slices are non-overlapping (tracked at run-time, see DynMut::mut_view for handing out sub-slices at runtime, and DynMut::reclaim for giving them up).

use core::marker::{PhantomData};
use std::collections::{HashSet};

struct Foo<T> {
    data: Box<[T]>
}

impl<T> Foo<T> {
    fn dyn_mut<'a>(self: &'a mut Self) -> DynMut<'a, T> {
        DynMut {
            foo: self,
            in_use: HashSet::new()
        }
    }
    
    fn mut_view<'a>(self: &'a mut Self, offset: usize, len: usize) -> FooMut<'a, T> {
        if offset + len > self.data.len() {
            panic!("OOB")
        }
        if isize::try_from((offset + len)*core::mem::size_of::<T>()).is_err() {
            panic!("Stated safety requirement for ptr::add violated")
        }

        FooMut {
            data: self.data.as_mut().as_mut_ptr(),
            offset: offset,
            len: len,
            phantom: PhantomData
        }
    }

    fn new<V: Into<Box<[T]>>>(data: V) -> Self {
        Self {
            data: data.into()
        }
    }
}

struct DynMut<'a, T> {
    foo: &'a mut Foo<T>,
    in_use: HashSet<(usize, usize)>
}

impl<'a, T> DynMut<'a, T> {
    fn reclaim<'b>(self: &'b mut Self, x: FooMut<'a, T>) {
        let interval: (usize, usize) = (x.offset, x.offset + x.len);
        self.in_use.remove(& interval);
    }
    
    fn mut_view<'b>(self: &'b mut Self, offset: usize, len: usize) -> Option<FooMut<'a, T>> {
        if offset + len > self.foo.data.len() {
            panic!("OOB")
        }
        if isize::try_from((offset + len)*core::mem::size_of::<T>()).is_err() {
            panic!("Stated safety requirement for ptr::add violated")
        }

        let x: (usize, usize) = (offset, offset + len);

        if (& self.in_use).into_iter().any(|y| (x.0 >= y.0 && x.0 <= y.1) || (x.1 >= y.0 && x.1 <= y.1)) {
            None
        }
        else {
            self.in_use.insert(x);
            Some(FooMut {
                data: self.foo.data.as_mut().as_mut_ptr(),
                offset: offset, 
                len: len,
                phantom: PhantomData
            })
        }
    }
}

struct FooMut<'a, T> {
    data: *mut T,
    offset: usize,
    len: usize,
    phantom: PhantomData<&'a T>
}

impl<'a, T> FooMut<'a, T> {
    fn mut_entry(self: & mut Self, i: usize) -> &'a mut T {
        if i >= self.len {
            panic!("OOB")
        }
        unsafe { &mut *(self.data.add(self.offset + i)) }
    }
}

fn main() {
    let v: Vec<u64> = (0..10).collect();

    let mut foo: Foo<u64> = Foo::new(v);

    let mut dynmut: DynMut<u64> = foo.dyn_mut();

    let mut fdm1: FooMut<u64> = dynmut.mut_view(1, 3).unwrap();
    let mut fdm2: FooMut<u64> = dynmut.mut_view(5, 3).unwrap();

    *fdm1.mut_entry(1) = 42;
    *fdm2.mut_entry(1) = 4242;

    let mut foomut: FooMut<u64> = foo.mut_view(0, 10);

    for i in 0..10 {
        println!("{}", foomut.mut_entry(i));
    }
}

I'm guessing I need to protect some data with an UnsafeCell, but I'm not sure which. The entire contents of Foo<T>'s data, or each individual entry? Or am I completely misunderstanding something else here?

I don't believe that UnsafeCell should be required here because you're getting the mutability permission from &mut. Instead, I suspect the problem is that this line is asserting exclusivity over the entire slice, making future accesses of previously made pointers illegal:

(inside DynMut::mut_view)

                data: self.foo.data.as_mut().as_mut_ptr(),
1 Like

Interesting. Would it be sound if I did self.foo.data[offset..offset+len].as_mut().as_mut_ptr() instead?

if you read the miri error message carefully, it's complaining about this line. the problem is, each time you create a new view, you call self.data.as_mut(), which reclaims the exclusivity of the data, so the previous borrows became invalid.

you just store the raw pointer instead of &mut Foo<T> in DynMut:

struct DynMut<'a, T> {
    data: *mut T,
    len: usize, // len is used for the pre-condition assertion
    in_use: HashSet<(usize, usize)>
}
6 Likes

I was able to get your test case to pass MIRI by storing raw pointers inside DynMut instead of a reference. There may be other valid ways to go about it, though.

2 Likes

Thanks a lot! But just to aid my understanding of what's going on here: Would it, alternatively, work to do data: (&mut self.foo.data[offset..offset+len]).as_mut().as_mut_ptr() when handing out a new FooMut from DynMut::mut_view?

Hmm, do you really mean the quoted line (which is from Foo::mut_view), or do you instead mean data: self.foo.data.as_mut().as_mut_ptr() from DynMut::mut_view?

I'm not sure, but I suspect it's not allowed. A good rule of thumb is that any time you touch a reference, all previously-created raw pointers may become invalid (this is significantly stricter than the actual rules, but easier to understand).

1 Like

This sounded really helpful at first, but then I realized it pokes right at my confusion about "the object that is a slice" and "the objects that live within a slice".

FooMut::mut_entry hands out references to single elements within the subslice represented by a FooMut. Would such a reference then invalidate the raw pointer that FooMut holds? Of course, mut_entry doesn't hand out a reference to the slice itself, rather to a single element within it, and FooMut holds a pointer to a slice, not to such a single element. But in terms of soundness I easily confuse the two: if I hold a reference to a slice, do I (for the purposes of the rule you sketch) also hold a reference to each and every element within said slice?

Sort of, but it's the best assumption to make. It's not so much about what you hold, and more about the actions you take. If you hold a reference to a slice, you are always allowed to access any element of the slice from that reference. If you do, any previous *mut to that element will be invalidated.

If you call a method that takes &mut [T], it may invalidate any of those pointers based on what it does internally. In the absence of documentation about that, you have to assume that it will.


I did some experimenting, and discovered that this version passes MIRI as well, and it's much closer to what you originally wrote: It just stores &'a mut [T] instead of &'a Foo<T>. So, it appears that what was really happening in your original code is that calling Box::as_mut() was the part asserting exclusivity over all of the slice elements by generating a new &'a mut [T] that covers the whole slice, which invalidated any pointers derived from the old &mut [T].

I'm not sure if I'd trust this version in my own codebase, though, because I don't fully understand exactly the points that you're asking about...

1 Like

Thanks for all your help. This has been very enlighetning. I'll still leave the question marked as unsolved for a bit, since even though your understanding is clearly better than mine, there seems to be a common core of something we don't understand. Hopefully someone comes along to educate us both.

To bring up one thing you wrote in particular:

If you hold a reference to a slice, you are always allowed to access any element of the slice from that reference. If you do, any previous *mut to that element will be invalidated.

I thought it was OK by provenance rules to use a pointer to a slice to access any of the elements within said slice. Thus, a pointer to a slice is – in essence – a pointer to any element within the slice. Or at least it can become one in a sound way. Does that not mean that accessing any slice element using a reference to the slice invalidates any *mut to any element, including the one to the beginning of the slice (i.e. a *mut to the slice itself)?

Mostly, it does mean that. The only exception that I know of is if the &mut [T] was somehow derived from a particular *mut T or *mut [T], then that particular raw pointer isn't invalidated by using the reference. But messing with that parent pointer in certain ways (exactly which ones I'm a bit unclear on...) before the last use of the reference is also problematic.

1 Like

In case it helps, when working with slices I like to build things around split_at_mut as a kind of "primitive" because I trust the std developers to not make as many stupid mistakes as myself.

1 Like

reclaim at least is still unsound. You could have two DynMut from different Foos and return a FooMut from the first to the second, confusing its state. You need to check that the returned FooMut is in fact covered by the DynMut.

(Based on @2e71828's last playground. I didn't do a full audit.)

3 Likes

Thanks. It won't really help me here though, because the example program isn't really trying to solve the goal that it solves. It's an exercise for me to uncover things I don't know about soundness when fiddling with raw pointers and references at the same time (and boy did you folks help me uncover a lot! :slight_smile: )

Very good point! Thanks!

For future reference: I just realized that there's a major design flaw in this setup. My focus was on soundness relating to raw-pointer-reference interactions, so I wasn't paying as much attention to other things, but of course the lifetime of the FooMuts handed out by DynMut is wrong: as it currently stands, one can have a FooMut that can never be reclaimed because the DynMut that handed it out no longer exists.