Is it fine to have multiple mutable pointers to something if I don't dereference more than one?

Is this OK to do?

fn main() {
    let test = &mut 77f32;
    
    {
        let ptr1 = test as *mut f32;
        let ptr2 = test as *mut f32;
        
        unsafe {
            *ptr2 = 32f32;
        };
    }
    
    assert_eq!(*test, 32f32);
}

In the actual app, I have an API that needs to return a pointer to an object. The consumer of that API knows whether it is safe to mutate through the pointer because it has exclusive access, or whether it is only safe to read through it because all of the other pointers will only read as well. In other words, the client is doing the borrow checking, so my API just returns mutable pointers and expects the client to cast them to what it actually needs and knows is safe. (This is for an ECS scripting API.)

Does this work? Is having two mutable pointers UB if I don't dereference them? I assume just having them can't be UB because I can actually create 100 mutable pointers to the same object in 100% safe Rust, as long as I don't dereference them.

fn main() {
    let test = &mut 77f32;
    
    let mut v = vec![];
    
    for _ in 0..100 {
        v.push(test as *mut f32);
    }
}

I just realized that I'm pretty sure that proves it, but I'll leave the post up anyway in case there is more wisdom to be gained. :owl:

Oh, like would it be better to leave them *const pointers and then the client can decide that it's safe to cast them to *mut pointers if it needs to mutate?


PS: I just realized that I do know the size and the lifetime of the pointer I have, even if I don't know the mutability, so it's probably just safer to return a &mut [u8] and let the client handle casting that as it needs. Still, maybe returning a &[u8] that the client can cast to a &mut [u8] is better than always returning a &mut [u8] and expecting the client not to mutate it if it isn't allowed to.

Except it seems like Rust doesn't let you transmute from &[u8] to &mut [u8] without an (overridable) compile error, so maybe just returning &mut [u8] does make sense?

Sorry for the sort of rambling post :roll_eyes:


Except that this cast is insta-UB.


The Nomicon dedicates four lines of kind description to this case.

Transmuting an & to &mut is UB.

  • Transmuting an & to &mut is always UB.
  • No you can't do it.
  • No you're not special.

The problem is that this let binding creates a temporary variable, inaccessible to the programmer, that owns the number, and the mutable borrow is tied to the same lifetime as the owned value. That means it is impossible to soundly dereference the same value through a raw pointer from anywhere else. In order to safely access the same value through a raw pointer, you need to drop test, and thereby revoke the exclusive access, before dereferencing either ptr1 or ptr2. However, doing so also drops the hidden variable the exclusive borrow was referring to, so dereferencing ptr1 or ptr2 after dropping test would lead to a use-after-free, which is UB.

Regarding returning &mut or &, just provide 2 methods with the same name, except one of them with _mut suffix. All you have to do is call the non-_mut version in the _mut version and the &mut will be coerced to & just fine. However, the value you created (in your examples) won't live long enough, because it resides on the stack and only within the function. You'll have to return the owned value, in that case.
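
For anyone following along, here is a minimal sketch of the paired-accessor pattern mentioned above. The `Component` type, its `bytes` field, and `accessor_demo` are hypothetical names made up for illustration; the point is only that the `&self` method hands out shared access and the `&mut self` method hands out exclusive access, so the borrow checker enforces the distinction.

```rust
/// Hypothetical component storage; the name and layout are illustrative only.
pub struct Component {
    bytes: Vec<u8>,
}

impl Component {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }

    /// Shared access: any number of callers may hold this at once.
    pub fn bytes(&self) -> &[u8] {
        &self.bytes
    }

    /// Exclusive access: the borrow checker guarantees no other borrow is live.
    pub fn bytes_mut(&mut self) -> &mut [u8] {
        &mut self.bytes
    }
}

/// Small usage example: write through `bytes_mut`, then read through `bytes`.
pub fn accessor_demo() -> Vec<u8> {
    let mut c = Component::new(vec![1, 2, 3]);
    c.bytes_mut()[0] = 9;
    c.bytes().to_vec()
}
```

Note that the struct owns its data here, which sidesteps the "won't live long enough" problem described in the same post.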

EDIT:
P.S.: To be pedantic, it is technically not UB with the current compiler version, because Rust does not emit noalias to LLVM, due to unresolved bugs in LLVM leading to incorrect optimizations. AFAIK, that means every &mut T currently behaves like &UnsafeCell<T> until the bug in LLVM is fixed and Rust emits the optimization hint again. Rust still enforces correct behavior in the safe subset, though, i.e. it is as strict as it has to be to keep working once aliasing optimizations are turned on again. I'd strongly advise against exploiting that knowledge just because you can; that is like planting a ticking time bomb with an opaque countdown.


I don't believe that's correct. Both ptr1 and ptr2 are derived from test, so this code should be fine. If the raw pointers were derived from another borrow expression, like in the following code:

// THIS IS HORRIBLY WRONG:
let mut value = 88.0;
let ptr1 = &mut value as *mut f32;  // use of loan 1
let ptr2 = ptr1;
let test = &mut value;  // use of loan 2

unsafe {
    *ptr2 = 32.0;  // use of loan 1
}

assert_eq!(*test, 32.0);  // use of loan 2

then you would be right.

It's not proof obviously but I submit as evidence the fact that Miri accepts @zicklag's original, but rejects my modified version.

(I'm not commenting on the "transmute-&-to-&mut" idea, which I agree is UB -- this just pertains to the code in the post.)

Huh, that's weird and quite smart. That would imply that Miri observes a reborrow-like construct when performing test as *mut f32, except that it creates a raw pointer rather than a new borrow. In that case, creating the raw pointer twice from the mutable borrow should be either disallowed or UB (you'd have to copy the existing raw pointer), because the outer mutable borrow will be disabled for the lifetime of the reborrow. Except in this case, I guess, ptr1 can be immediately dropped, re-enabling the mutable borrow before it is disabled again when reborrowed for ptr2. The outer mutable borrow becoming disabled is the only explanation I can think of for why this is not UB. Either that or Miri has a bug.

@alice Sorry for pinging, but I know you have a better understanding of Stacked Borrows than I do. Do you know whether reborrowing into a *mut T is what's happening here under the hood, and whether the fact that ptr1 can be dropped immediately is important to not cause UB?

Yes, this is ok, but it wouldn't be if you accessed ptr1 in the unsafe block instead of ptr2, because the creation of ptr2 accessed the mutable reference, invalidating ptr1. That said, Miri doesn't catch this particular case due to this.

Raw pointers have no destructors, so destructors do not come into play here.


Ah, OK.

I'd have thought the compiler error message when attempting that would have been a tad more forceful then:

error: mutating transmuted &mut T from &T may cause undefined behavior, consider instead using an UnsafeCell

"May cause undefined behavior" is a bit different than "Will always cause undefined behavior".

The "may" verbiage seemed a little weird to me also because I was thinking, "I'm already in an unsafe block where I can do all kinds of things that may cause undefined behavior."
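
Since the lint message points at UnsafeCell, here is a rough sketch of what that alternative can look like. `Slot`, `read`, `write`, and `slot_demo` are hypothetical names, and the sketch assumes some external party (a scheduler, say) guarantees that a write never overlaps with any other access. No & is ever turned into a &mut; the mutation goes through the raw pointer that `UnsafeCell::get` returns.

```rust
use std::cell::UnsafeCell;

/// Hypothetical wrapper; the caller is assumed to guarantee that writes
/// never overlap with other accesses.
pub struct Slot {
    value: UnsafeCell<f32>,
}

impl Slot {
    pub fn new(value: f32) -> Self {
        Self { value: UnsafeCell::new(value) }
    }

    /// Reads the value through a shared reference.
    ///
    /// # Safety
    /// No write to this slot may happen while the read is in progress.
    pub unsafe fn read(&self) -> f32 {
        unsafe { *self.value.get() }
    }

    /// Writes the value through a shared reference.
    ///
    /// # Safety
    /// The caller must have exclusive access for the duration of the call.
    pub unsafe fn write(&self, new: f32) {
        unsafe { *self.value.get() = new; }
    }
}

/// Usage example: mutate through `&Slot` without any `&` to `&mut` transmute.
pub fn slot_demo() -> f32 {
    let slot = Slot::new(77.0);
    let shared: &Slot = &slot;
    // SAFETY: single-threaded, and no other access overlaps these calls.
    unsafe {
        shared.write(32.0);
        shared.read()
    }
}
```

For plain `Copy` data in single-threaded code, `std::cell::Cell` gives the same effect with a fully safe API.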


I'm actually in a situation where I have to (or currently am, because I can't find another way to do it) return all of the references, whether actually mutable or not, in an array, in which case they all have to have the same mutability.

So the return type for my function is: [Option<&'a mut [u8]>; COMPONENT_QUERY_SIZE].

So every element in that array is marked as mutable, but only some of them, or potentially none of them, should actually be mutated, because they are potentially borrowed somewhere else. The client to this function has indicated whether it needs mutable or immutable access to each element in the array, and the scheduler in the ECS will have made sure that the borrowing rules are respected.

Here is my actual code. I have a struct like this:

pub struct DynamicFetch {
    datas: [Option<NonNull<[u8]>>; COMPONENT_QUERY_SIZE],
}

It obtains a bunch of NonNull<[u8]> from the ECS Archetype which uses UnsafeCell internally:

        let mut fetch = Self {
            datas: [None; COMPONENT_QUERY_SIZE],
        };

        let mut matches_any = false;
        for (component_index, component_access) in state
            .iter()
            .enumerate()
            .filter_map(|(i, &x)| x.map(|y| (i, y)))
        {
            let ptr = archetype.get_dynamic(
                component_access.info.id,
                component_access.info.size,
                // FIXME: Is this right for the index?
                0,
            );

            if ptr.is_some() {
                matches_any = true;
            }

            fetch.datas[component_index] = ptr.map(|x| {
                NonNull::new_unchecked(slice_from_raw_parts_mut(
                    x.as_ptr().add(offset),
                    component_access.info.size,
                ))
            });
        }

So when doing this I'm looping over state which contains the list of components that the client wants to access. I use this list to query the archetype and get a bunch of pointers to the components that it needs. I then create slices from the pointers with the size info I have about the components that are being requested.

When I return this to the client it's in a sort of iterator:

    unsafe fn next(&mut self, state: &Self::State) -> Self::Item {
        const INIT: Option<&mut [u8]> = None;
        let mut components = [INIT; COMPONENT_QUERY_SIZE];

        for (component_index, component_access) in state
            .iter()
            .enumerate()
            .filter_map(|(i, &x)| x.map(|y| (i, y)))
        {
            if let Some(nonnull) = &mut self.datas[component_index] {
                components[component_index] = {
                    let x = nonnull.as_ptr();
                    *nonnull = NonNull::new_unchecked(slice_from_raw_parts_mut(
                        (x as *mut u8).add(component_access.info.size),
                        component_access.info.size,
                    ));
                    Some(&mut *x)
                };
            }
        }

        components
    }

I add component_access.info.size to the pointer to get the next slice of bytes for that pointer. Then I create a mutable reference to *x and put it in the array that gets returned.
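
To make the pointer stepping concrete, here is a self-contained sketch of the same idea outside the ECS. `ComponentCursor`, `next_component`, and `cursor_demo` are hypothetical names, and unlike the real code the sketch assumes the cursor borrows the whole buffer exclusively, so every slice it hands out really is unique. It does not answer the "shared data behind &mut" question, it only shows the advance-by-size mechanics.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

/// Hypothetical cursor over a packed buffer of fixed-size components.
pub struct ComponentCursor<'a> {
    next: NonNull<u8>,
    remaining: usize,
    size: usize,
    // Ties the cursor to the exclusive borrow of the underlying buffer.
    _borrow: PhantomData<&'a mut [u8]>,
}

impl<'a> ComponentCursor<'a> {
    /// `size` must be non-zero; trailing bytes that do not fill a whole
    /// component are ignored.
    pub fn new(buf: &'a mut [u8], size: usize) -> Self {
        assert!(size > 0);
        Self {
            remaining: buf.len() / size,
            // A slice pointer is never null, so this cannot fail.
            next: NonNull::new(buf.as_mut_ptr()).unwrap(),
            size,
            _borrow: PhantomData,
        }
    }

    /// Returns the next component's bytes and advances by `size`.
    pub fn next_component(&mut self) -> Option<&'a mut [u8]> {
        if self.remaining == 0 {
            return None;
        }
        let start = self.next.as_ptr();
        self.remaining -= 1;
        // SAFETY: `start..start + size` lies inside the original buffer, and
        // each range is handed out exactly once, so the slices never overlap.
        unsafe {
            self.next = NonNull::new_unchecked(start.add(self.size));
            Some(std::slice::from_raw_parts_mut(start, self.size))
        }
    }
}

/// Usage example: bump the first byte of every 2-byte component.
pub fn cursor_demo() -> Vec<Vec<u8>> {
    let mut buf = [1u8, 2, 3, 4, 5, 6];
    let mut cursor = ComponentCursor::new(&mut buf, 2);
    let mut out = Vec::new();
    while let Some(chunk) = cursor.next_component() {
        chunk[0] += 10;
        out.push(chunk.to_vec());
    }
    out
}
```

When the buffer is available as a plain `&mut [u8]`, the standard `chunks_exact_mut` method does the same job without any unsafe code.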

I'm not sure of a better way to return a list of potentially mutable references. I could just return a bunch of NonNull<u8>'s and then have the client cast them, but in the DynamicFetch I do know the length and the lifetime of the pointer, so it seems a shame not to make it that little bit safer.

Still, if I create a &mut [u8] to some data which should only be used as a &[u8], is that fine? Say that I dished out a bunch of these &mut [u8]'s which were only actually safe to read from, would that work, or would the only valid &mut [u8] be the last one that was created because of what @alice mentioned:

Sufficiently complex control flow can block the compiler from code optimizations that would constitute UB. Thus it's "may" rather than "will". Which is little consolation, as there's no way to ensure that compiler improvements won't surface that UB in future rustc releases.


You can store an enum in the returned array to make the choice element-by-element:

enum BufAccess<'a> {
    Shared(&'a [u8]),
    Exclusive(&'a mut [u8]),
    None,
}

fn query(&self, args) -> [BufAccess<'_>; COMPONENT_QUERY_SIZE];
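
A slightly fuller sketch of how a client might consume such an array. The value of `COMPONENT_QUERY_SIZE`, the `process` function, and `buf_access_demo` are assumptions made up for this example and are not part of the actual ECS API.

```rust
// Assumed value for the sketch; the real constant lives in the ECS.
const COMPONENT_QUERY_SIZE: usize = 3;

#[derive(Debug)]
pub enum BufAccess<'a> {
    Shared(&'a [u8]),
    Exclusive(&'a mut [u8]),
    None,
}

/// Hypothetical consumer: increments the first byte of every exclusive
/// buffer and sums the first byte of every shared one.
pub fn process(items: [BufAccess<'_>; COMPONENT_QUERY_SIZE]) -> u32 {
    let mut sum = 0;
    for item in items {
        match item {
            BufAccess::Shared(bytes) => sum += u32::from(bytes[0]),
            BufAccess::Exclusive(bytes) => bytes[0] += 1,
            BufAccess::None => {}
        }
    }
    sum
}

/// Usage example: one shared buffer, one exclusive buffer, one empty slot.
pub fn buf_access_demo() -> (u32, u8) {
    let a = [5u8, 0];
    let mut b = [7u8, 0];
    let sum = process([
        BufAccess::Shared(&a),
        BufAccess::Exclusive(&mut b),
        BufAccess::None,
    ]);
    (sum, b[0])
}
```

The benefit over handing out &mut [u8] everywhere is that a client cannot write through a `Shared` element without an explicit (and clearly wrong) cast.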

No, it will always invoke UB, but UB allows the compiler to do anything, and it behaving as you wanted it to is included in everything.

Miscompilation is just a symptom of UB.

The act of creating an &mut reference asserts exclusive access to the pointee. If you don't have exclusive access, it is unsound.


It's quite difficult to say without knowing everything about your project (and I don't intend to dive that deep, even if I could), because there is the simple alternative of using an enum like:

enum MaybeMutableByteSlice<'a> {
    Mutable(&'a mut [u8]),
    Immutable(&'a [u8])
}

The problem with using an enum is that if the pattern of mutable and immutable references is random (which I suspect is true), this will incur a branch prediction failure on each enum. However, if the client mutates data that is supposed to be immutable, it becomes even worse than just a performance problem.

EDIT: The better alternative performance-wise would be to keep 2 vectors, one for mutable and one for immutable byte slices, i.e. you hand out (&mut [&mut [u8]], &[&[u8]]), instead.
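
A minimal sketch of that split layout, with made-up names (`apply`, `split_demo`) and a deliberately trivial workload. It only illustrates the shape of the signature: everything in the first slice may be written, everything in the second may only be read.

```rust
/// Hypothetical consumer of the split layout: every buffer in `writable`
/// may be mutated, every buffer in `readable` may only be read.
pub fn apply(writable: &mut [&mut [u8]], readable: &[&[u8]]) -> usize {
    // Total number of readable bytes, used as the fill value below.
    let total: usize = readable.iter().map(|r| r.len()).sum();
    for buf in writable.iter_mut() {
        buf.fill(total as u8);
    }
    total
}

/// Usage example: two read-only buffers and one writable buffer.
pub fn split_demo() -> ([u8; 2], usize) {
    let a = [1u8, 2, 3];
    let b = [4u8];
    let mut c = [0u8; 2];
    let total = apply(&mut [&mut c[..]], &[&a[..], &b[..]]);
    (c, total)
}
```

The trade-off is that the client loses the original component order unless it is tracked separately.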


Is that not the case with NonNull<u8>'s? Like, can you have multiple NonNull<u8>'s soundly pointing to the same data? The standard library docs say that NonNull<T> is "*mut T but non-zero and covariant".

I don't know what covariant means, but the ECS I'm contributing to (which is not my code), as far as I can tell, uses NonNull<u8>'s as pointers even to data which must be read-only and which therefore gets cast to read-only & references.

The data it references, though, is in an UnsafeCell. That makes a difference, doesn't it?

I didn't think of that. That's great. :+1:

Oh, that's even better. Thanks!

My (admittedly weak) understanding is that only &mut requires exclusive access, and *mut is unrestricted. If you use that *mut to produce an &'a mut, however, you must ensure that there are no reads or writes through any of the pointers while the 'a lifetime is valid.


Neither NonNull<T> nor *mut T is exclusive, so they can safely alias each other, or even a live &T or &mut T. The problem is how you get that pointer and whether there is or is not an interpretation of events where only one &mut T is live at a time.
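
As an illustration of the point about aliasing raw pointers, here is a small sketch with UnsafeCell-backed storage, as in the ECS discussed above. The function name `nonnull_alias_demo` and the three-byte array are made up for the example. Two `NonNull<u8>` copies alias the same byte, several shared references coexist, and a `&mut` is created only once those shared references are no longer used.

```rust
use std::cell::UnsafeCell;
use std::ptr::NonNull;

pub fn nonnull_alias_demo() -> (u8, u8, u8) {
    // The storage lives in an `UnsafeCell`, so pointers obtained from a
    // shared reference to it may legally be used for writes.
    let storage = UnsafeCell::new([1u8, 2, 3]);

    // Two aliasing pointers to the first byte; creating and copying them is safe.
    let p1: NonNull<u8> = NonNull::new(storage.get().cast::<u8>()).unwrap();
    let p2 = p1;

    // SAFETY: the pointers are valid for the whole function, and at no point
    // is a `&mut` live at the same time as another reference to the byte.
    unsafe {
        // Several shared references may be live at the same time.
        let r1: &u8 = p1.as_ref();
        let r2: &u8 = p2.as_ref();
        let before = (*r1, *r2);
        // `r1` and `r2` are not used past this point.

        // Now exactly one `&mut` is live, so exclusivity holds.
        let m: &mut u8 = &mut *p2.as_ptr();
        *m = 9;

        (before.0, before.1, *p1.as_ptr())
    }
}
```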

I encourage you to read about the Stacked Borrows aliasing model if you want to have a deeper understanding of how aliasing relates to UB in Rust. This model is the closest thing we currently have to a definition of aliasing in Rust. It's a little more conservative than the C memory model (which is what LLVM uses) but it still lets you reason about the soundness of unsafe code that the compiler can't check.

This is accurate to my understanding. There's another concern, though: if you get the *mut by casting a &mut reference, that &mut reference now has to be valid and exclusive throughout the entire time between when you create the *mut and the last time you dereference the *mut. Copying the *mut is fine, even interleaving access between different *muts is allowable (as long as you maintain the rules of references internally) -- but all those *muts have to be derived from the same "loan" of the value that the original &mut took.


Yes, you can have multiple raw pointers pointing to the same data. It's just that you should create any extra raw pointers by copying the one you created initially, rather than casting the reference multiple times, because each time you touch the reference, the reference's rules trigger.
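
In code, the advice above might look like the sketch below: cast the reference once, copy the resulting raw pointer, and leave the reference alone until the raw pointers are done. The function name `copy_pointer_demo` is made up for the example.

```rust
pub fn copy_pointer_demo() -> f32 {
    let mut value = 77.0_f32;
    let test = &mut value;

    // Cast the reference exactly once...
    let ptr1 = test as *mut f32;
    // ...and copy the raw pointer; both copies stem from the same cast.
    let ptr2 = ptr1;

    // SAFETY: `test` is not touched while the raw pointers are in use.
    unsafe {
        // Interleaved use of the copies is fine.
        *ptr1 = 1.0;
        *ptr2 += 31.0;
    }

    // Using `test` again ends the raw pointers' validity; they are not
    // used afterwards.
    *test
}
```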


Ahhhh. That makes more sense.

So in my original example:

fn main() {
    let test = &mut 77f32;
    
    {
        let ptr1 = test as *mut f32;
        let ptr2 = test as *mut f32;
        
        unsafe {
            *ptr2 = 32f32;
        };
    }
    
    assert_eq!(*test, 32f32);
}

@alice, you said dereferencing ptr1 inside the unsafe block would be UB, but if I did this it wouldn't be, right?

fn main() {
    let test = &mut 77f32;
    
    {
        let ptr1 = test as *mut f32;
        let ptr2 = ptr1;
        
        unsafe {
            *ptr1 = 32f32;
        };
    }
    
    assert_eq!(*test, 32f32);
}

I think it's slightly more subtle than this. It's ok for the &mut reference to go out of scope before the last use of the raw pointer. The focus should be on what other references are used.

I prefer to think of it as "the raw pointer must not have been invalidated by a use of any other pointer/reference", plus a list of cases where using something invalidates other stuff. For example, any use of a mutable reference invalidates all other references that overlap with it, except for the reference you originally got the mutable reference from, and the reference you originally got that one from, and so on.
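
The same invalidation rule can be watched in safe Rust, where the borrow checker enforces it at compile time. This is only an analogy for the raw-pointer case, where nothing is checked for you, and the names (`reborrow_demo`, `parent`, `child`) are made up for the sketch.

```rust
pub fn reborrow_demo() -> i32 {
    let mut x = 1;
    let parent = &mut x;      // derived from `x`
    let child = &mut *parent; // derived from `parent`

    *child += 1;   // using `child` leaves `parent` usable afterwards
    *parent += 10; // using `parent` invalidates `child`
    // *child += 1; // would not compile: `child` was invalidated above

    x // using `x` directly invalidates `parent` as well
}
```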

@zicklag Yes, your latter code block is perfectly fine.


Agreed.

I think we're on the same page here, but just to check, do you agree that this code is sound and guaranteed to print 16?

let mut a = 10;
let p1 = {
    let ra = &mut a;
    ra as *mut _
}; // ra goes out of scope here but the borrow used to initialize it must be valid...
let p2 = p1;
unsafe {
    *p1 += 2;
    *p2 += 2;
    *p1 += 2;
    // ... until here, when it's no longer in use
}
println!("{}", a);
// Further use of `p1` or `p2` here would obviously be wrong

Each += 2 expression creates a new &mut reference, but none of them overlap, and p1 and p2 may be aliased since they are not &mut references.

That particular part of the code is what let p1 = &mut a as *mut _; desugars to (at least currently).