How to pass mutable reference through raw pointer

I'd like to store a mutable reference in thread local storage (see tokio_executor::with_default for the basic idea), which can then be used "deeper" in the stack.

std::future::set_task_context and std::future::get_task_context do the same thing.
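To make the question concrete, here is a minimal sketch of the pattern I have in mind. The names `CURRENT`, `with_default` and `with_current` are made up for this post and the payload is a plain `u32`; this is not the actual tokio or std implementation, just the general shape of it as I understand it.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Hypothetical slot; a null pointer means "nothing installed".
    static CURRENT: Cell<*mut u32> = Cell::new(ptr::null_mut());
}

/// Puts a saved pointer back into the slot when dropped (also on unwind).
struct Restore(*mut u32);

impl Drop for Restore {
    fn drop(&mut self) {
        CURRENT.with(|c| c.set(self.0));
    }
}

/// Installs `value` for the duration of `f`, then restores the previous entry.
fn with_default<R>(value: &mut u32, f: impl FnOnce() -> R) -> R {
    let prev = CURRENT.with(|c| c.replace(value as *mut u32));
    let _restore = Restore(prev);
    f()
}

/// Hands the installed value to `f`, or returns `None` if nothing is installed.
/// The slot is emptied while `f` runs, so a nested call cannot create a
/// second `&mut` to the same value.
fn with_current<R>(f: impl FnOnce(&mut u32) -> R) -> Option<R> {
    let raw = CURRENT.with(|c| c.replace(ptr::null_mut()));
    if raw.is_null() {
        return None;
    }
    let _restore = Restore(raw);
    // SAFETY (my assumption, and the thing this thread asks about): `raw` was
    // derived from a `&mut u32` that `with_default` keeps borrowed until its
    // closure returns, and the slot is empty while this reference is live.
    Some(f(unsafe { &mut *raw }))
}
```

Usage would look like `with_default(&mut x, || with_current(|v| *v += 1))`, with the inner call possibly many frames deeper in the stack.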

I wonder if this is actually safe: when converting the mutable reference to a raw pointer the compiler has no way of knowing when this "borrow" (through raw pointer) ends. Is there anything preventing the compiler from reordering memory access (after inlining the callbacks of course)? Are there implicit fences surrounding unsafe blocks?

For example:

let mut x: u32 = 0;
let ptr: *mut u32 = &mut x;
unsafe { *ptr = 5; }  // maybe hidden in a closure
x = 10;
x

Is the compiler allowed to execute x = 10; before the unsafe block and to return 5 instead of 10?

If yes: would it help to store the (borrowed) lifetime of the mutable reference in a PhantomData (for the scope of the callback) to prevent reordering?

I'm not certain, but if you're worried about order of operations, use write_volatile. I personally never use *ptr = value or rvalue *ptr because it is too difficult to find documentation on whether these require alignment or if they are volatile, etc.
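For comparison, a small sketch of both spellings (the function name `store_both_ways` is just for illustration). As far as I know, both forms require a valid, properly aligned pointer; the volatile write additionally tells the compiler not to elide it or reorder it relative to other volatile operations.

```rust
use std::ptr;

fn store_both_ways() -> (u32, u32) {
    let mut a: u32 = 0;
    let mut b: u32 = 0;
    let pa: *mut u32 = &mut a;
    let pb: *mut u32 = &mut b;
    // SAFETY: both pointers come from live, aligned locals that are not
    // accessed by name until after the writes.
    unsafe {
        *pa = 5; // plain store through the raw pointer
        ptr::write_volatile(pb, 5); // volatile store
    }
    (a, b)
}
```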

I'm fairly certain that this would not help. To my knowledge, lifetimes play absolutely no role during the lowering of code to LLVM IR; the borrow checker only solves them in order to reject bad programs.

To clarify, I only see two possibilities:

  • The code works as intended, and always writes in order.
  • The code invokes undefined behavior, and any attempt to properly propagate the lifetime information will result in a borrow-check error.

In that sense, carrying around a phantom lifetime is at least a good idea when possible. (because it demonstrates correctness)


One additional thing: If you have any code where a &T (for any type T containing the pointer) can be used to mutate the data, then you must store the value inside an UnsafeCell.
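A minimal sketch of what I mean, using a hypothetical `Counter` type that mutates its contents through `&self`:

```rust
use std::cell::UnsafeCell;

// Hypothetical type for illustration: mutation happens behind `&self`,
// so the field has to live in an UnsafeCell.
struct Counter {
    value: UnsafeCell<u32>,
}

impl Counter {
    fn new() -> Self {
        Counter { value: UnsafeCell::new(0) }
    }

    fn bump(&self) -> u32 {
        // SAFETY: UnsafeCell makes Counter !Sync, and no reference to the
        // inner value escapes this method, so no other access overlaps.
        unsafe {
            let p: *mut u32 = self.value.get();
            *p += 1;
            *p
        }
    }
}
```

Without the `UnsafeCell`, writing through a pointer derived from `&self` would be undefined behavior, since the compiler may assume data behind a shared reference does not change.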

This code is not UB and will always end with x == 10. As @ExpHP said, lifetimes don't affect code generation, and once lifetimes are thrown out (fairly early in the process), raw pointers and unique references behave very similarly. Your code is identical to the following (ignoring lifetimes):

let mut x: u32 = 0;
let ptr: &mut u32 = &mut x;
*ptr = 5;  // maybe hidden in a closure
x = 10;
x

It will have the same behavior, ending with x == 10. Note that in both cases LLVM may optimize out the dead store *ptr = 5.
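Here are the two snippets above wrapped into functions so they can be compared directly. The function names are mine; the bodies are unchanged apart from a safety comment.

```rust
fn write_through_raw() -> u32 {
    let mut x: u32 = 0;
    let ptr: *mut u32 = &mut x;
    // SAFETY: ptr points at the live local x, and x is not touched by name
    // until after this write.
    unsafe { *ptr = 5; } // maybe hidden in a closure
    x = 10;
    x
}

fn write_through_ref() -> u32 {
    let mut x: u32 = 0;
    let ptr: &mut u32 = &mut x;
    *ptr = 5; // the borrow ends here, so the next line is accepted
    x = 10;
    x
}
```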

Actually the context I'm interested in (same as the existing examples I linked) doesn't directly modify the data in an unsafe block, but creates a mutable reference to it and passes it on (given it is usually a trait object). write_volatile can't be used there, and afaict it also does more than a fence would, and fences should be enough.

Just because the compiler right now doesn't use it doesn't mean it never will. After all that is the idea of shared and mutable references (my "simplified" view): if I've got a shared reference to something it must not change (unless it is wrapped in UnsafeCell, as you pointed out - but it will require unsafe to actually access the data), and if I've got a (non-borrowed) mutable reference no one else can write or read it.

Especially if I'm given two (non-borrowed) mutable references I can assume they don't alias.

Example 2 (with "manually inlined" closures):

let mut x: u32 = 0;
let r1 = &mut x;
let ptr: *mut u32 = r1;
{
    let r2 = unsafe { &mut *ptr }; // this really should be UB: created aliased mutable references
    *r2 = 5;
}
*r1 = 10;
x

Behavior considered undefined - The Rust Reference mentions LLVMs noalias attribute, so I'm guessing rustc right now only uses the non-aliasing of references when entering a function? Is there anything saying rustc won't take it further than that?

I'm pretty sure my new example is UB, and while the spec is not specific about it, I'd assume unsafe { *ptr = ... } creates a temporary mutable reference regarding any alias discussion, so it should be UB too. Is there any reference claiming otherwise?

I recommend that you run your code through MIRI; this can be done on the playground. MIRI will detect aliasing violations in single-threaded code and many other sources of UB.

For example, here is an aliasing violation that is caught by MIRI

Note how the two reference uses are interleaved, that is what caused the UB. In your example, the references form a strict hierarchy, and Rust can reason about that. This is how reborrowing works.
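To spell out the difference, here is a sketch of both shapes. The nested function is your Example 2 with a name I made up. The interleaved variant is my own reconstruction (not necessarily the exact code I ran earlier) and is left commented out on purpose, because I would expect MIRI to reject it.

```rust
fn nested_reborrow() -> u32 {
    let mut x: u32 = 0;
    let r1 = &mut x;
    let ptr: *mut u32 = r1; // raw pointer derived from r1
    {
        // SAFETY: r1 is not used while r2 is live, so r2 nests inside r1.
        let r2 = unsafe { &mut *ptr };
        *r2 = 5;
    } // r2 is never used again after this point
    *r1 = 10; // back to the parent reference
    x
}

// Interleaved variant: two unique references from the same raw pointer,
// with the first one used again after the second was created.
//
// let mut x: u32 = 0;
// let ptr: *mut u32 = &mut x;
// let r1 = unsafe { &mut *ptr };
// let r2 = unsafe { &mut *ptr };
// *r2 = 5;
// *r1 = 10; // r1 used after r2 was created: the uses are not nested
```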

From https://github.com/rust-lang/miri/:

It can run binaries and test suites of cargo projects and detect certain classes of undefined behavior, for example:

  • [...]
  • WIP: Violations of the rules governing aliasing for reference types

Some violations (still a nice tool, got to take a deeper look some time). Afaict MIRI doesn't know much about how the reference was created, so it reinterprets the second reference as a reborrow as long as it fits the pattern. But that wouldn't actually prevent the compiler from reordering (although MIRI should detect it, if it worked on the actual binary).

That second reference is a reborrow, even if it is kinda indirect about it. MIRI can track where references came from (even through raw pointers!). If you want to read about how MIRI works, you can read these posts by the maker of MIRI.

new version that is how the current version of MIRI works

old version that explains the concept of stacked borrows

Thanks for the links! Very interesting reads.

But afaict MIRI is just a (WIP) proposal for how the Rust memory model might be defined, and "reborrow" is a term from it.

I.e. saying code is safe just because it passed MIRI seems a bit risky to me.

I'm guessing one might say that although the current rust specification doesn't explicitly allow it ("it" being the examples I linked), it works for now and future specifications will allow it with a high probability.

cc @RalfJung

I am not saying that it is safe just because it passes MIRI. Your code is just doing a reborrow, so it is safe, and MIRI seems to agree, which is a plus. If you replaced all of the raw pointers with references, your code would still compile, and that is proof that it is sound.
Also, MIRI works perfectly fine for these small cases; it's only for more complex programs that I would start to doubt a passing MIRI run.

No matter whether you use references or raw pointers, a compiler is not allowed to reorder instructions if it means changing the semantics of the program (thus, for instance, your example program will always output 10).

The only reordering issue is when there are "multi-threaded data races", since they involve a thread's view of the world / memory not always being in sync with another's: that memory decoherence is the only time visible reordering-like behavior may seem to happen.

That's why, as long as you don't break Rust aliasing guarantees, you are free to use raw pointers; it's just that without lifetimes you won't have Rust checking your back that those guarantees are respected.
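For the multi-threaded case, a minimal sketch of how ordering is usually made explicit, using atomics with release/acquire ordering instead of plain writes. The function name `publish_and_read` is only for illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

fn publish_and_read() -> u32 {
    let data = AtomicU32::new(0);
    let ready = AtomicBool::new(false);
    thread::scope(|s| {
        // Writer: store the payload, then publish it with a Release store.
        s.spawn(|| {
            data.store(42, Ordering::Relaxed);
            ready.store(true, Ordering::Release);
        });
        // Reader: once the Acquire load sees the flag, the payload store
        // is guaranteed to be visible as well.
        let reader = s.spawn(|| {
            while !ready.load(Ordering::Acquire) {
                std::hint::spin_loop();
            }
            data.load(Ordering::Relaxed)
        });
        reader.join().unwrap()
    })
}
```

In single-threaded code none of this is needed: the compiler has to preserve the observable behavior of the program either way.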

Sorry, but you didn't get it - if it is UB, the compiler can reorder. And so far nobody had any reference that claims it isn't UB. @RustyYato keeps talking about MIRI (and "reborrow" like MIRI defines it) like it is the reference, but I can't find any evidence this is actually the case.

There currently isn't an official reference, but work is being done to make one in the Unsafe Code Guidelines Working Group.

For now MIRI is the best way to verify things other than asking an expert, like RalfJung, the maker of MIRI.

Also, your code is identical to

let mut x: u32 = 0;
let r1 = &mut x;
let ptr: &mut u32 = r1;
{
    let r2 = &mut *ptr;
    *r2 = 5;
}
*r1 = 10;
x

And this compiles, so it is sound. Replacing references with pointers does not change if a program is sound.

Disclaimer: As mentioned above, there is no decision yet on what the memory model should be. So I can basically only answer this by saying what Stacked Borrows has to say here, which as you noted is WIP, un-RFCed, and basically just my "hobby project". But it's also the best thing we got currently.

No. Lifetimes don't matter for the behavior of the program. They cannot, we erase lifetimes aggressively during compilation.

Very good question! The answer that I am proposing with Stacked Borrows is that the "lifetime" of the "borrow through the raw pointer" ends the moment the pointer from which it got created (x in your case) gets "used" again. So in your example:

let mut x: u32 = 0;
let ptr: *mut u32 = &mut x; // "lifetime" of raw ptr starts
unsafe { *ptr = 5; }
x = 10; // "lifetime" of raw ptr ends
x

This is why the compiler is not allowed to reorder here.

Note that "use" is a fuzzy term, and currently Stacked Borrows considers many things a use. Hence this is UB:

let mut x: u32 = 0;
let ptr: *mut u32 = &mut x; // "lifetime" of raw ptr starts
let y = x; // x gets "used", so the lifetime of the raw ptr ends
unsafe { *ptr = 5; } // UB
x
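If the read of x really does end the raw pointer's "lifetime" as described above, one way to stay clear of the question is to derive the pointer only after the last direct use of x. A sketch (the function name is made up for illustration):

```rust
fn read_then_write() -> u32 {
    let mut x: u32 = 0;
    let y = x; // use x directly first
    let ptr: *mut u32 = &mut x; // "lifetime" of raw ptr starts after that use
    // SAFETY: ptr points at the live local x, and x is not touched by name
    // between deriving ptr and this write.
    unsafe { *ptr = 5 + y; }
    x // using x again ends the raw pointer's "lifetime"
}
```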

It is. And certainly, in general, Miri is not in the state where we can guarantee anything like it detecting all UB.

However, your particular pattern seems to me like something any "reasonable" model would allow. I think a lot of code would break otherwise.

While I am one of the maintainers of Miri these days, I wouldn't consider myself its "maker"; see the README for some more historic details.

Also, Miri is certainly a helpful tool, but shouldn't be used without consideration. So if after carefully thinking about your code you think you followed all the rules, and then you run it in Miri and Miri gives you a green light, then that's a good sign. But if you just do random stuff until Miri no longer complains, that's not verifying much of anything.

This is a great observation! I think it needs to be bold and italic:

Replacing references with pointers does not change if a program is sound.

I think this is a minimal standard we should require for any model. That's in fact kind of what I meant above by saying "any reasonable model would accept this code".

Thanks @RalfJung and @RustyYato for your answers!

"Lifetimes don't matter for the behavior of the program." - usually I understood that as "no overloading for 'static".

Saying that the compiler must not use lifetimes for optimizations seems a quite different restriction - and I'm guessing this is right now more of an implementation detail than specification.

This does sound like a very easy and obvious rule, but I fear it will limit a lot what future compiler versions can do (hard to prove without having such a compiler though; maybe good alias analysis is all we need).

Anyway, thanks again for all your input!

I hope it does! It should limit the compiler to not break all the code. :slight_smile:

Ok, my bad.

That is what I meant, I just didn't word it correctly. Sorry about that.


Thanks for commenting on this thread!
