Trying to understand mutable reference safety

The Tokio project has this article discussing the safety of their intrusive link lists, which contains this interesting quote:

The way this should be interpreted when raw pointers are involved is that whenever you create or use a mutable reference to some value, all raw pointers to that value are invalidated (excluding the raw pointer the reference was created from).

I find this claim very confusing, because it seems to me that if true Rust can't express a regular doubly linked list let alone an intrusive doubly linked list.

Assuming my nodes look something like this:

struct Node<T> {
    data: T,
    next: *mut Node<T>,
    prev: *mut Node<T>,
}

In order for my list to present an interface that lets users modify the data inside the list, I am going to need to give them some way to get a mutable reference to data. In order to do that, at some point I need to form a mutable reference to Node<T>. If that invalidates all other pointers to the node, then the node before this one and node after this one now contain invalid pointers. You could reach into those nodes and reset their pointers, but doing so would require mutable references to them, which would then invalidate the pointers held by the nodes before and after them, which then contagiously spreads so as soon as you form a mutable reference you need to touch the entire list to update it... which seems wrong.

Prior to reading this article my assumption had been that I just had to make sure that there was only one live mutable reference at a time, but that multiple pointers simultaneously are fine. Is this wrong?

1 Like

I don't think there is a contradiction here. Given a *mut Node<T>, you can get a *mut T pointing at its data field using raw pointers, then turn just that data pointer into a &mut T. In the process we aren't touching any pointers to adjacent nodes so they shouldn't be invalidated.

Also keep in mind that the precise subtleties of these things are still an active area of research. If you want to find out more, Ralf Jung is one of the people working on the Stacked Borrows model and regularly posts articles about it on his blog. This is part of the Unsafe Code Guidelines working group, who's goal is to produce a reference outlining what unsafe code is and isn't valid in Rust.

It is important to remember the created-from exception, because it is quite central. The way I think of it is that copying a raw pointer yields two raw pointers that are "the same", so if you use one of them to create a mutable reference, they both count as its parent, and hence neither is invalidated.

It would be nice to flesh out these details in the article, but we run into the problem that @Michael-F-Bryan mentions that figuring out what the precise semantics should be is an active research area.

The main place where you can run into trouble is if the raw pointers you are storing are obtained from a mutable reference, because then they get detached from each other and are no longer the same.

Note: I'm the author of the linked article.

3 Likes

Given a *mut Node<T>, how do you make &mut T referring to data without forming an intermediate &mut Node<T>? I concretely don't know how to express that.

In the process we aren't touching any pointers to adjacent nodes so they shouldn't be invalidated.

The quote says all pointers other than the one that created the reference though, that is what is confusing. I'm saying the untouched pointers that adjacent nodes have pointing to this node (not to be confused with pointers the current node has to adjacent nodes) seem to be invalidated under this model.

In this case, it's not actually necessary to go through that. The parent property is recursive, so going from *mut Node<T> → &mut Node<T> → &mut T with an ordinary field access is fine and doesn't invalidate the original raw pointer.

Of course, the other part of the uniqueness rule does still apply, so you can't access the data field through the original raw pointer between two uses of the same mutable reference created in this manner.

To answer your question though, you put #[repr(C)] on the struct, guaranteeing that the data field is at the start, then just cast the *mut Node<T> to a *mut T.

1 Like

Yes, and you need the created-from exception to handle this problem.

I don't see how it helps. Where in this reasoning chain am I going wrong?:

  • I want users to be able to mutate T stored in my linked list
  • That requires giving them a way to get &mut T (assuming I am not forcing them to use Cell types)
  • To get a &mut T I need a &mut Node<T>
  • Forming a &mut Node<T> invalidates all other pointers to that node except the one I dereference.
  • Say I have a list A<->B<->C
  • I call mylist.iter_mut()
  • I start at head (A) and follow next to B. I take A's next ptr and coerce to mutable reference, &mut *a.next. This doesn't invalidate a.next, however, it does invalidate c.prev (because it was not used to form the reference). My linked list is now broken?

Recall this quote, whose point I admit is somewhat missing from the post.

So the answer to your question is that a.next and c.prev need to be copies of the same raw pointer, which will cause both to be considered a parent of your mutable reference.

In general, any copies of the raw pointer you originally got from the allocator cannot be invalidated because such a raw pointer would be the parent (maybe recursively) of all references to the allocation.

I see, if all copies of the dereferenced pointer stay valid, that covers a lot of use cases. But then there are questions about what constitutes a copy vs a "fresh" pointer.

  • So if I have a usize U, I coerce it to pointer A and deref it to form a mutable reference X, then drop X, then I coerce U again to a pointer B and deref it to form a mutable reference Y, does the creation of Y invalidate A?

  • I have two mutable pointers A and B, that point into the same slice. What element inside the slice they point to is a result of computation. Sometimes, that computation may cause A and B to point at the same element. Does making a mutable reference from A invalidate B? This would lead to absurd workarounds, like checking if they are equal before derefing, then assigning one to the other only if they are equal! :stuck_out_tongue:

Things get very complicated once you involve integers (see e.g. this). I generally consider all integer to pointer casts sketchy.

As for your other question, unless the parent rule kicks in, then yes that would invalidate the other of them. Or at least, it would invalidate its access to that index.

@RalfJung Just to double check, do you have any input on the original question in this thread?

See the raw_mut!() macro and associated RFC for &raw T references. They let you do the necessary pointer arithmetic without creating an intermediate &mut reference.

1 Like

If both pointers are computed by offsetting from the pointer that was originally returned the slice, then the parent rule applies? They are in some sense the 'same' pointer?

So... would the workaround be this?

let deref_p1 = &mut *(if(p1 == p2) { let tmp = &mut *p1; p2 = p1; tmp } else { *p1 })

That seems like design smell.

The thing you're doing with comparing them certainly sounds like something that shouldn't be necessary.

Yes, that is exactly right. Each time you turn a reference to a raw pointer, that's a "new" raw pointer that is distinguished by the aliasing model, but if you just copy an existing raw pointer or do pointer offset operations, those are all "the same".

So you can implement a doubly-linked list just fine as long as you make sure you use "the same" raw pointer(s) throughout and never create new ones.

(I'm afraid I don't have time right now to read this entire thread, I hope that was the key question that needed clarification. :slight_smile: If not, please let me know, and sorry for responding in such a hurry.)

On Tuesdays it does... no seriously though, you are asking exactly the right question, and the answer is that we don't know. The interaction of memory models with "provenance" and integer-pointer casts is a thorny one.

Invalidation is an entirely dynamic concept; what matters is whether pointers overlap at runtime in the execution under consideration. Whether that overlap can be predicated statically or not is irrelevant.

If that doesn't answer the question, then I think I misunderstood it... could you write a short piece of self-contained code that demonstrates the problem?

8 Likes

As I understand it, making a pointer from a reference causes invalidation, not creating a reference from a pointer.

There's a distinction here in how A and B were created. If there was some &mut that got turned into a pointer and then arithmetic on that pointer created both A and B, they're the "same" pointer.
Making a mutable reference from one doesn't invalidate the other. You still need to make sure the new &muts don't overlap, though.

On the other hand, if they were each independently created from an &mut at different times, one of them is already invalid at this point. That happened when the second pointer was created.

Since this is a common question that has come up several times, I have added a link to this thread in the document.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.