Mut aliasing safety and requiring acyclic data structures

At this point I've gotten used to assuming that, in safe Rust, I generally need to keep my data structures acyclic. I'm trying to better understand what I can get away with in unsafe code.

I understand that it's definitely undefined behavior if I ever have two &mut referring to the same object. I also understand that if I have a &mut referring to some object it is possible for me to turn that into a &mut to anything contained directly or indirectly by that object so long as fields are all public.

My question is: is it okay (defined behavior) if I write unsafe code that lets the user have access to a &mut for an object a at the same time as a &mut for an object b, where b is somehow reachable from a, but not via any public safe interface, so that in practice two &mut to b never coexist?

In other words if I use other features of the language to make sure that in practice users writing safe code can't turn the &mut to a into a &mut to b, is it ok that b is technically reachable from a?

The first way I can imagine doing this is simply that a contains a private pointer to b. Pointers can't be dereferenced in safe code anyway, so I assume it's then up to me and my implementation to make sure that I never mutably dereference the private pointer while another dereference of it is still alive, and to make sure I haven't exposed any safe functions that would trigger this.

The second way I could imagine it is direct ownership, where a privately contains a b, but where the interface for a doesn't expose b directly. If the public methods of a only let you get &mut to components of a other than b, I assume it is safe for &mut to a to exist in the program at the same time as a &mut to b?

Check out this document:


I had to go reread the discussion from last time about that article, but I think the takeaway was: when you form a pointer from a mutable reference, that pointer and any copies made of it form what we could call a pointer group. The next time a new pointer is made "from scratch" by converting the same mutable reference, it starts a new pointer group and all the pointers in the old pointer group are invalidated.

For my two examples in this thread, I think that can be avoided? In the case that a directly contains b, and a's interface only lets you get &mut to components of a other than b, those getters likely don't use pointers at all and just return references to a field. However to get access to a &mut b at the same time we will at some point in the past need to have stored a pointer to b. As long as we then only ever use that pointer or copies of it, and don't recreate it from scratch, then everything is fine right?


I don't have a confident answer regarding immediate UB in this situation, but have you considered that if I have a foo: &mut Foo, I can make all pointers to anything within foo (public or private) dangle by doing *foo = Foo::new()?
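
To make that concern concrete, here is a minimal sketch in entirely safe code. The names `Tracked`, `clobber` and `drops_after_overwrite` are made up for illustration, and `Foo::new` takes a drop counter here only so the effect can be observed. Overwriting through the `&mut Foo` drops the old private contents, so any raw pointer stashed earlier would be left dangling.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical stand-in for a type with a private, heap-allocated part.
pub struct Foo {
    private: Box<Tracked>,
}

// Counts how many times a private part has been dropped.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

impl Foo {
    pub fn new(drops: Rc<Cell<u32>>) -> Foo {
        Foo { private: Box::new(Tracked { drops }) }
    }
}

// Entirely safe "user" code: it only needs a `&mut Foo` and a constructor.
pub fn clobber(foo: &mut Foo, drops: Rc<Cell<u32>>) {
    *foo = Foo::new(drops);
}

// Returns how many `Tracked` values were dropped by a single overwrite.
pub fn drops_after_overwrite() -> u32 {
    let drops = Rc::new(Cell::new(0));
    let mut foo = Foo::new(drops.clone());
    // A raw pointer stashed here points into the first Box allocation...
    let stashed: *const Tracked = &*foo.private;
    clobber(&mut foo, drops.clone());
    // ...which has now been freed, so `stashed` dangles and must not be read.
    let _ = stashed;
    drops.get()
}

fn main() {
    println!("dropped during overwrite: {}", drops_after_overwrite());
}
```

The sketch never dereferences `stashed`; it only shows that the allocation it pointed at is gone after the assignment.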

That's a good point, and it's an extra requirement. So in this example, we can only prevent a &mut to b if b is private, and we can only be sure it stays valid if users have no way to construct another instance of a to swap with. b being private already means they can't directly construct an instance, but we will also have to be sure that we don't expose a new-like method. Is that sufficient?

Well, your users got a &mut Foo from somewhere, right? So they can probably get a second one in a similar fashion, and then instead of *foo = do std::mem::swap(), even if they can't construct Foos themselves. You'd still be pointing at something in that case, at least. But you can't assume it's the same thing any more. Also, it would be a data race, and those are considered UB.
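
A rough sketch of the swap case, with a hypothetical `Foo` that has no public constructor (`user_code` and `observe_swap` are invented names). The address of the private field stays the same, but what lives there afterwards is the other `Foo`'s data, so a stashed pointer would still point at something, just not at what you expected.

```rust
// Hypothetical type: no public constructor is needed for the swap below.
pub struct Foo {
    private: u32,
}

// Safe "user" code that only needs two `&mut Foo`s.
pub fn user_code(a: &mut Foo, b: &mut Foo) {
    std::mem::swap(a, b);
}

// Returns (address unchanged?, value before, value after) for `a.private`.
pub fn observe_swap() -> (bool, u32, u32) {
    let mut a = Foo { private: 1 };
    let mut b = Foo { private: 2 };
    let addr_before = &a.private as *const u32 as usize;
    let value_before = a.private;
    user_code(&mut a, &mut b);
    let addr_after = &a.private as *const u32 as usize;
    (addr_before == addr_after, value_before, a.private)
}

fn main() {
    println!("{:?}", observe_swap());
}
```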

And even if they can't make the thing directly, they can probably make them indirectly (e.g. as a private struct of something else they can make), right? That's true of most useful things anyway. Then they could still clobber your pointer with an assignment of those (though they couldn't use their mutable reference afterwards).

So you'd have to solve the singleton problem so they can't make two of something. (There are some run-time approaches to this in, e.g., the embedded space; I'm unaware of a compile-time way to do so.) But I imagine that's not actually what you want to create.


I'm pretty sure you'll still be violating stacked borrows though, with any of this, after thinking a bit.

To be more concrete, you're thinking of something like the following, right?

pub struct ThingTheyCannotMake {
    pub thing_they_can_mutate: String,
    thing_they_cannot_mutate: String,
}

pub struct Container {
    ttcm: ThingTheyCannotMake,
}

// You'd want a lifetime on this presumably...
pub struct Snoop {
    pointer: *mut String,
}

impl Container {
    pub fn accessor(&mut self) -> (&mut ThingTheyCannotMake, Snoop) {
        let ttcm = &mut self.ttcm;
        let pointer = &mut ttcm.thing_they_cannot_mutate as *mut _;
        (ttcm, Snoop { pointer })
    }
}

Any use of the returned ttcm is going to invalidate Snoop::pointer (since that's where the pointer was derived from), so it would be UB to use it.

And independently of that, I'm pretty sure they're never going to say having a &mut Container and &mut Field both usable at the same time is anything but UB at the language level. Even if there were a tractable way to show that it wasn't being exploited on your side in a program-behavior-wrecking way, it would violate the rule that a mutable reference means exclusive access to everything within -- a rule that other unsafe code may rely on. (Or safe code for that matter, à la std::mem::swap.)

Here's a very long IRLO thread on the topic.


Have you considered something more like this?

pub struct UserStuff;
struct PrivateStuff;

pub struct UsefulThing {
    user_stuff: UserStuff,
    private_stuff: PrivateStuff,
}

pub struct Snoop<'a> {
    stuff: &'a mut PrivateStuff,
}

impl UsefulThing {
    // It's definitely ok for these two to co-exist
    pub fn accessor(&mut self) -> (&mut UserStuff, Snoop<'_>) {
        let snoop = Snoop { stuff: &mut self.private_stuff };
        (&mut self.user_stuff, snoop)
    }
}

(It might not be applicable to your use case at all. Probably you need reference counting or some other interior mutability of some sort.)
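
For what it's worth, here is how that split-borrow pattern might look with some data in it. The `count` and `log` fields and the `record`, `log` and `demo_split_borrow` helpers are made up for the sketch; the point is only that both handles are usable at the same time without any unsafe code.

```rust
pub struct UserStuff {
    pub count: u32,
}

struct PrivateStuff {
    log: Vec<u32>,
}

pub struct UsefulThing {
    user_stuff: UserStuff,
    private_stuff: PrivateStuff,
}

pub struct Snoop<'a> {
    stuff: &'a mut PrivateStuff,
}

impl Snoop<'_> {
    // Hypothetical operation on the private part.
    pub fn record(&mut self, value: u32) {
        self.stuff.log.push(value);
    }
}

impl UsefulThing {
    pub fn new() -> Self {
        UsefulThing {
            user_stuff: UserStuff { count: 0 },
            private_stuff: PrivateStuff { log: Vec::new() },
        }
    }

    // Borrows two disjoint fields, so both results can be live together.
    pub fn accessor(&mut self) -> (&mut UserStuff, Snoop<'_>) {
        let snoop = Snoop { stuff: &mut self.private_stuff };
        (&mut self.user_stuff, snoop)
    }

    pub fn log(&self) -> &[u32] {
        &self.private_stuff.log
    }
}

// Interleaves use of both handles, then inspects the result.
pub fn demo_split_borrow() -> (u32, Vec<u32>) {
    let mut thing = UsefulThing::new();
    {
        let (user, mut snoop) = thing.accessor();
        user.count += 1;
        snoop.record(user.count);
        user.count += 1;
        snoop.record(user.count);
    }
    (thing.user_stuff.count, thing.log().to_vec())
}

fn main() {
    println!("{:?}", demo_split_borrow());
}
```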

In this instance Foo is a type from my crate with no public constructor. Users can only ever get a &mut Foo via callback from me, and I never pass two into the same callback at once. The borrow checker should stop them from holding onto the &mut Foo for longer than the callback I think?

Well on this point I'm confused. In the last thread about the linked article it sounded as if pointer group invalidation occurs when making a new pointer from a mutable reference, not when dereferencing a mutable reference. This would mean dereferencing ttcm is fine as long as you don't also have a live mutable reference at the same time from dereferencing Snoop::pointer.

Well, that's the question. Do we usually just assume those things go together because a mutable reference to Container usually implies it's possible to get mutable access to ThingTheyCannotMake, or does Rust actually require it? If it does, why doesn't the language used in the docs talk about which objects are reachable from other objects?

(Sorry for the brevity in this reply; on mobile.)

Callbacks are an interesting approach, but not enough to protect you on their own. Playground
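
The playground isn't reproduced here, so the following is only a guess at one way callbacks can fall short, not necessarily what the link showed. Assume a hypothetical `with_foo` that hands out one `&mut Foo` per callback: nothing stops a user from nesting two calls and ending up with two `&mut Foo` at once, which is enough for `std::mem::swap`.

```rust
// Hypothetical library type with no public constructor.
pub struct Foo {
    private: u32,
}

// Hypothetical library API: exactly one `&mut Foo` is passed per callback.
pub fn with_foo<R>(seed: u32, f: impl FnOnce(&mut Foo) -> R) -> R {
    let mut foo = Foo { private: seed };
    f(&mut foo)
}

// "User" code: nesting the callbacks yields two `&mut Foo` at the same time.
// (It reads the private field only so the effect is visible in this sketch.)
pub fn nested_swap() -> (u32, u32) {
    with_foo(1, |a| {
        with_foo(2, |b| {
            std::mem::swap(a, b);
            (a.private, b.private)
        })
    })
}

fn main() {
    println!("{:?}", nested_swap());
}
```

After the swap, the `Foo` handed to the outer callback holds what was created for the inner one, so the library can't assume its contents are the ones it set up.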

The way this should be interpreted when raw pointers are involved is that whenever you create or use a mutable reference to some value, all raw pointers to that value are invalidated (excluding the raw pointer the reference was created from).

It's any use, including but not limited to creating new references / pointers.
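
As a small sketch of that ordering rule as I understand it under Stacked Borrows (the function name `ordered_accesses` is invented): a raw pointer derived from a mutable reference is usable until the reference itself is used again, and the commented-out line marks the access that would be UB under that model.

```rust
pub fn ordered_accesses() -> u32 {
    let mut x = 0u32;
    let r = &mut x;
    // Raw pointer derived from `r` via an explicit reborrow.
    let p = &mut *r as *mut u32;
    // Fine: `r` has not been used since `p` was created.
    unsafe { *p += 1; }
    // Using `r` again invalidates `p` under Stacked Borrows.
    *r += 10;
    // unsafe { *p += 1; } // would be UB under that model: `p` is no longer valid
    *r
}

fn main() {
    println!("{}", ordered_accesses());
}
```

Running a variant with the commented line enabled under Miri is one way to check how a given aliasing model treats it.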

As for uniqueness, it is assumed, which is how you can swap without a data race in safe code despite private fields, say. This article, linked to by the other one, covers it well. As it notes, spelling mut as uniq was even considered.

I agree it would be good for whatever docs you're referring to to point this out. (Docs about private fields, I assume.)

To be explicit about the things from the document I linked, having a &mut to an object along with a &mut to a field of that object (stored in the object directly without indirection) is definitely UB unless one of the references was created from the other.

If the field is behind a raw pointer, then it's fine as the compiler doesn't look through raw pointers.
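
A minimal sketch of the raw-pointer variant, assuming `A` owns a separately allocated `B` through a private `*mut B` (all names here are hypothetical). Because `B` lives in its own allocation, a `&mut A` and a `&mut B` cover disjoint memory. The earlier caveats still apply: if safe code could replace or swap the `A`, the `B` could be freed or exchanged while a `&mut B` is alive, so this only holds as long as nothing exposes that.

```rust
pub struct B {
    pub value: u32,
}

pub struct A {
    pub own: u32,
    // Private raw pointer to a heap allocation owned by this `A`.
    b: *mut B,
}

impl A {
    pub fn new() -> A {
        A { own: 0, b: Box::into_raw(Box::new(B { value: 0 })) }
    }
}

impl Drop for A {
    fn drop(&mut self) {
        // SAFETY: `b` came from `Box::into_raw` in `new` and is freed only here.
        unsafe { drop(Box::from_raw(self.b)); }
    }
}

// Holds `&mut A` and `&mut B` at the same time and interleaves their use.
pub fn demo_behind_raw_pointer() -> (u32, u32) {
    let mut a = A::new();
    let b_ptr = a.b;
    let a_ref = &mut a;
    // SAFETY: the allocation outlives this function body, and no other
    // reference to the `B` exists while `b_ref` is alive.
    let b_ref = unsafe { &mut *b_ptr };
    a_ref.own += 1;
    b_ref.value += 2;
    a_ref.own += 1;
    (a_ref.own, b_ref.value)
}

fn main() {
    println!("{:?}", demo_behind_raw_pointer());
}
```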

Is the right mental model for this that in that case both references are considered to be actually pointing at the same "object" from the point of view of the aliasing model? I'm inferring based on the docs for pointer's wrapping_add and similar ops.

Well, the aliasing model generally guarantees no overlap between mutable references.
