Unsafe code, flush or barriers to avoid undefined behavior from multiple *mut T?

I've been told that the Rust compiler can perform optimizations which assume that there is only one &mut T in the world, so if you write unsafe code that violates this, you can get bad undefined behavior, nasal demons, etc.

Let's say I have multiple raw pointers (*mut T) floating around, and I use one to perform a mutation on the T. Is there something I can do to limit the possibility of this kind of UB bug? E.g., something like:

let raw_ptr: *mut Thing = ...
let mut_ref: &mut Thing = unsafe { &mut *raw_ptr };
mut_ref.num = 2;
flush_writes_and_dont_optimize_past_this();
...

I think these things have names, but I can't remember what they are. Maybe a "memory barrier"?

Memory barriers are a runtime-only concept, and mutable aliasing is a compile-time-only concept. They don't interact.

There's nothing supported for containing Undefined Behavior. You're not allowed to have it anywhere at any time at all.

However, Rust doesn't mind having multiple *mut pointers existing in memory. Rust is pretty relaxed about raw pointers, and you can treat them the same way as C pointers (as long as you don't create references from them, see std::ptr for useful methods).

Raw *mut pointers can co-exist with &mut, as long as you don't use them both at the same time. Rust also has a concept of re-borrowing, which means there can even be multiple &mut references to the same object existing in memory at the same time, but only one of them is allowed to be used at a time. This is enforced with lifetimes and borrow checking at compile time; there's no code for it at run time. The exact rules for how multiple mut pointers and references can be used with unsafe code are a complex topic, and not all cases are fully specified. See "stacked borrows"/"tree borrows", and Miri.
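As a small sketch of re-borrowing in safe code (Thing, bump and reborrow_demo are made-up names for illustration, not something from this thread), two &mut references to the same object exist here, but only one is usable at any point:

```rust
// Hypothetical type, used only for illustration.
struct Thing {
    num: i32,
}

fn bump(thing: &mut Thing) {
    thing.num += 1;
}

fn reborrow_demo() -> i32 {
    let mut thing = Thing { num: 0 };
    let outer: &mut Thing = &mut thing;

    // Explicit reborrow: `inner` is a second `&mut` to the same object.
    // While `inner` is still in use, the borrow checker rejects any use of `outer`.
    let inner: &mut Thing = &mut *outer;
    inner.num = 1;
    // `inner` is not used after this point, so `outer` is usable again.

    // Passing `outer` to a function reborrows it implicitly for the call.
    bump(outer);

    // `outer` is still usable after the call returns.
    outer.num += 1;
    outer.num
}
```

All of this is checked at compile time. Moving the write through inner below the call to bump(outer) should be rejected by the borrow checker.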

The problems with UB and aliasing don't come from run-time behavior or explicit checks. They come from assumptions hardcoded in the compiler and the optimizer. The compiler "knows" that, while a &mut is live, the data behind it can only be written through that reference, so it doesn't take any other possibility into account, even if code with UB breaks that assumption. There is nothing you can put in your code to change assumptions built into the design of the compiler. You can prevent code optimizations, which may make mismatched assumptions less of a problem, but that's not the same thing.

If you need multiple pointers to the same memory, first try using Rust's types for it, like Cell, Atomic*, or Mutex. If you have a unique case, see UnsafeCell.
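For instance, here is a minimal sketch of the Cell option, assuming the field is a small Copy type (Thing and cell_demo are hypothetical names). Two shared references both mutate the same value, with no unsafe code at all:

```rust
use std::cell::Cell;

// Hypothetical type, used only for illustration.
struct Thing {
    num: Cell<i32>,
}

fn cell_demo() -> i32 {
    let thing = Thing { num: Cell::new(0) };

    // Two shared references to the same object, both live at once.
    let a: &Thing = &thing;
    let b: &Thing = &thing;

    // Cell allows mutation through `&`, so the compiler does not assume
    // the value is unchanged between these accesses.
    a.num.set(2);
    b.num.set(b.num.get() + 1);

    a.num.get()
}
```

Cell only works within a single thread; Atomic* and Mutex cover the multi-threaded case.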


Practically, your flush_writes() is written drop(mut_ref). This does not actually have any side effects, nor establish an optimization barrier, but it does prevent you from continuing to use mut_ref incorrectly and signals your intent to readers of the code.


Just use the raw pointers. Avoid creating references unless you really need to.

In your example, you can use your raw pointer to do the write; there's no need to create a &mut Thing from it.

let raw_ptr: *mut Thing = ...
unsafe {
    (*raw_ptr).num = 2;
}
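A slightly fuller sketch of the same idea, with two raw pointers to one value and no references created while they are in use. The Thing definition and the function name are assumptions made for the example:

```rust
// Hypothetical type, used only for illustration.
struct Thing {
    num: i32,
}

fn raw_pointer_demo() -> i32 {
    let mut thing = Thing { num: 0 };

    // Both raw pointers refer to the same object; raw pointers are Copy.
    let p1: *mut Thing = &mut thing;
    let p2: *mut Thing = p1;

    unsafe {
        // Field reads and writes go directly through the raw pointers.
        (*p1).num = 2;
        (*p2).num += 1;
        (*p1).num
    }
}
```

The unsafe block is still a promise that both pointers are valid and that nothing else, such as another thread, accesses the value at the same time.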

I know this isn't really the point of the thread but: The way to contain undefined behavior is to run code you trust sufficiently little in a separate process.

I think my only motive there is syntactic aesthetics. foo.baz() is nicer to read than (*foo).baz(). But this affects my use of shared refs more, since those tend to be used repeatedly in a short block of code. E.g., reading a few properties from a struct.

let t: &Thing = unsafe { &*raw_ptr };
println!("x, y = {}, {}", t.x, t.y);

If you're using unsafe a lot, consider trying to make an abstraction that enforces the rules to guarantee safety, and allows accessing the references using safe code alone. But you haven't said why you're using unsafe, so I don't know whether that would be practical.


I'm just starting to experiment with it, so I can't tell you yet exactly how I may end up using unsafe code. I'm trying different ways to write retained UI (widget trees, event handling) which so far has been awkward.

In a situation like that, you definitely should be seeking a safe abstraction — whether one that already exists or one that you design to solve your particular problem. You don't have to do it up front for a prototype, but it is highly likely that if you don’t do it at all, your code will be unsound. UI code and data structures tend to have complex details, and interact with multiple parties (the application data model it presents, incoming user actions, and the UI framework’s own mechanisms), so it’s easy to miss a bad interaction if nothing is either statically or dynamically enforcing only valid accesses.

One place to start may be to use Rc<RefCell<T>> (or, depending on how you view the problem, RefCell<Rc<T>>) in places where you find uses for raw pointers. This means that your accesses are always checked, and the overhead of RefCell is almost certainly completely insignificant for the operations of a retained-mode widget system.
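A rough sketch of what that could look like, with a made-up Widget type standing in for whatever a real widget tree needs. The point is that a conflicting access is detected at run time instead of being undefined behavior:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical widget type, used only for illustration.
struct Widget {
    label: String,
    children: Vec<Rc<RefCell<Widget>>>,
}

fn widget_demo() -> (String, bool) {
    let child = Rc::new(RefCell::new(Widget {
        label: "old".to_string(),
        children: Vec::new(),
    }));
    let root = Widget {
        label: "root".to_string(),
        children: vec![Rc::clone(&child)],
    };

    // A second handle to the same widget, e.g. one held by an event handler.
    let handle = Rc::clone(&child);
    handle.borrow_mut().label = "new".to_string();

    // While one mutable borrow is active, a second one is refused.
    let first = child.borrow_mut();
    let conflict = handle.try_borrow_mut().is_err();
    drop(first);

    // The change made through `handle` is visible through the tree.
    let label = root.children[0].borrow().label.clone();
    (label, conflict)
}
```

Calling borrow_mut() instead of try_borrow_mut() in the conflicting case would panic, which is noisy but well-defined.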


Beware, using objects by dereferencing raw pointers can create references! Calling a method through (*ptr) creates a temporary & or &mut wherever the method takes its receiver by reference. If you overwrite a field whose type implements Drop, the old value is dropped too.

There's ptr::write for just writing memory through a pointer without the rest of Rust's semantics.
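A minimal sketch of where the difference matters, assuming the destination is uninitialized memory (write_demo is a hypothetical name). A plain assignment would try to drop the old value first, while ptr::write does not:

```rust
use std::mem::MaybeUninit;
use std::ptr;

fn write_demo() -> String {
    // Uninitialized storage for a String; there is no valid old value in it.
    let mut slot: MaybeUninit<String> = MaybeUninit::uninit();
    let p: *mut String = slot.as_mut_ptr();

    unsafe {
        // `*p = String::from("hello")` would drop the old, uninitialized
        // String first, which is undefined behavior.
        // ptr::write stores the new value without reading or dropping the old one.
        ptr::write(p, String::from("hello"));

        // The slot is initialized now, so it can be taken out as a String.
        slot.assume_init()
    }
}
```

The flip side is that if the destination did hold a live value, ptr::write would overwrite it without running its destructor, so the old value's resources would leak.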
