Casting borrows from immutable to mutable - how bad is it?

When you're casting away immutability/mutability from borrows with something like this:

fn some_method(&self, something: &Thing) {
  let mut now_mutable = as_mutable(something);
}

fn as_mutable<T: ?Sized>(val: &T) -> &mut T {
    unsafe { (val as *const T as *mut T).as_mut().unwrap_unchecked() }
}

How badly does the compiler mangle this code behind the scenes? (It seems to work fine in debug, haven't tried with release mode)

Considering that the path from something to now_mutable is essentially a straight line with no real indirection or branches, I'd assume it just silently disables optimizations it might've been able to do if something was provably immutable.

Tangentially, I'd also be curious about the same for having multiple mutable aliases out at once. Considering you can mutate through pointers at will and borrows are just pointers with special meaning, I'd hope it doesn't savage things too badly.

In the documentation both of the above are described as undefined behavior but are things you can do happily by just using raw pointers. So is it actually undefined behavior or just 'do this at your own risk'?

I'd hope it'd still compile and run correctly, albeit slightly slower.

(I'd love to see where in the compiler this causes issues if it does too!)

Why do you want to do this? Any code that actually uses this is almost certainly exhibiting undefined behavior.


To quote from the nomicon:

  • Transmuting an & to &mut is Undefined Behavior. While certain usages may appear safe, note that the Rust optimizer is free to assume that a shared reference won't change through its lifetime and thus such transmutation will run afoul of those assumptions. So:

    • Transmuting an & to &mut is always Undefined Behavior.
    • No you can't do it.
    • No you're not special.

Raw pointer casts and unions do not magically avoid the above rules.

So your function is always undefined behaviour, because it is changing the restrictions on a reference in a way that breaks those restrictions.


Miri says that similar code has Undefined Behaviour, which is shorthand for "the compiler's output for this code is no longer predictable".

The reason UB is considered a problem is that the compiler doesn't know which optimizations will do the wrong thing in the face of any given instance of UB; as a result, you're going to be fine until such point as the compiler invokes an optimization that is always correct in the absence of UB, but incorrect if your code contains this specific UB.

For example, the compiler is allowed to optimize your code under the assumption that there is at most one exclusive/mutable borrow of a variable, and that if there is an exclusive/mutable borrow, there are no shared borrows. There may be no optimizations that apply to your code today for the specific target platform you have in mind; but that's not guaranteed to stay the same as the compiler evolves or as you change either target platforms or your code.

And remember that the compiler doesn't know which optimizations it has that depend on those assumptions - it just knows that it won't let you break those assumptions without writing the magic word unsafe, and that it can assume that if you wrote the magic word, you've taken care to not break the assumptions the compiler depends upon.
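To make that assumption concrete, here is a minimal sketch in safe Rust. The function name write_then_read is hypothetical and not from this thread; it only illustrates the kind of code where the "exclusive borrows never alias shared borrows" rule gives the optimizer room to move things around.

```rust
// Hypothetical helper for illustration only.
// `a` is an exclusive borrow and `b` is a shared borrow, so in valid Rust
// they can never refer to the same i32.
fn write_then_read(a: &mut i32, b: &i32) -> i32 {
    *a = 1;
    // Since `a` and `b` cannot alias, the compiler is allowed to read `*b`
    // before the store above, or to reuse a value it loaded earlier.
    *b
}

fn main() {
    let mut x = 0;
    let y = 7;
    // Safe code cannot pass overlapping borrows here; the borrow checker
    // rejects `write_then_read(&mut x, &x)`.
    println!("{}", write_then_read(&mut x, &y));
}
```

If unsafe code forged a &mut that aliased b, any such reordering could silently change the result, which is the kind of breakage described above.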


That's one of the worst kinds of UB, where you easily get code that blatantly does not do what the source code says.

This prints 5 in release mode:

fn as_mutable<T: ?Sized>(val: &T) -> &mut T {
    unsafe { (val as *const T as *mut T).as_mut().unwrap_unchecked() }
}

#[inline(never)]
fn set_to_10(val: &i32) {
    *as_mutable(val) = 10;
}

#[inline(never)]
fn get_val() -> i32 {
    let val = 5;
    set_to_10(&val);
    val
}

fn main() {
    println!("{}", get_val());
}


Do you want to debug programs with this kind of bug? I certainly don't. If I were debugging code and found out that the bug was due to something like the above, I would get pretty annoyed at the author of the code, since it would probably have taken me a long time to track down the issue. So don't do it.


If the documentation says it is undefined behavior, then what's the question? If it's explicitly described to be UB, then it absolutely, definitely, 100% is.

(You might be confusing UB with "crash". That's a common misconception — code with UB is not guaranteed to crash or do anything obviously wrong.)


When Rustaceans see this cast in a project, they immediately realize the canary in the coal mine died and it’s time to leave or they’ll run out of oxygen.


It gets worse. If I change your main to:

fn main() {
    let inlined_get_val = 5;
    set_to_10(&inlined_get_val);
    println!("{}, {}", get_val(), inlined_get_val);
}

then I get "5, 10" back. So it's "obvious" that set_to_10 does what you expect, since in the tiny test case in main, it does set inlined_get_val to 10. Some debugging later, and you'd realise that it does something different in get_val…


Yeah, I would've only used it if it actually worked. I was more interested in whether this was more nuanced than just 'yeah, broken', but that looks to be the case. Seems Rust cleaves real close to its rules with little deviation allowed.

Thanks either way.

FYI, if you want to achieve a similar effect, i.e. have mutable data behind a shared reference, using unsafe code but without getting UB, you might want to try UnsafeCell, or perhaps one of the safe abstractions built on top of it (in which case you can even avoid the need for unsafe, if one of those fits your use case). The standard library has Cell, RefCell, Mutex, RwLock, and soon also OnceCell and OnceLock.
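As a rough sketch of what that looks like, here is the earlier set_to_10 / get_val example rewritten with Cell, plus a small RefCell example. The push_item helper is a made-up name for illustration; Cell and RefCell themselves are the standard library types mentioned above.

```rust
use std::cell::{Cell, RefCell};

// Mutation through a shared reference, with no unsafe and no UB:
// Cell is built on UnsafeCell, so the compiler knows the value may change.
fn set_to_10(val: &Cell<i32>) {
    val.set(10);
}

fn get_val() -> i32 {
    let val = Cell::new(5);
    set_to_10(&val);
    val.get()
}

// Hypothetical helper: RefCell checks the borrow rules at run time instead,
// which allows handing out a temporary &mut to non-Copy data.
fn push_item(log: &RefCell<Vec<String>>, item: &str) {
    log.borrow_mut().push(item.to_string());
}

fn main() {
    println!("{}", get_val());

    let log = RefCell::new(Vec::new());
    push_item(&log, "first");
    push_item(&log, "second");
    println!("{:?}", log.borrow());
}
```

Unlike the as_mutable version, get_val returns 10 here in both debug and release builds, because the write goes through a type that is allowed to be mutated behind a shared reference.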


You cannot actually mutate through pointers entirely at will. On one hand, in Rust, a pointer must be created in a way that allowed mutable access in the first place. On the other hand, even if you copy mutable pointers, you still must not (in the sense that doing so anyway will be UB) create any of the following conflicting kinds of concurrent access:

  • A non-exclusive mutable reference, i. e. any mutable reference that exists while the pointer's target is accessed in any other way at the same time,
  • a mutation while an immutable reference exists, i. e. while an immutable reference you created from your pointer is alive, the pointer may not be used for any mutation (ignoring mutation that happens safely contained inside an interior mutability primitive that uses UnsafeCell),
  • a data race, i. e. a write operation happening in parallel (unsynchronized, in a different thread) to a read operation or another write operation.
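For contrast with the as_mutable cast, here is a minimal sketch of raw-pointer mutation that stays within the rules listed above, as far as I understand them. The function name increment_twice_via_raw is hypothetical. The pointer is derived from a &mut (so it was created with permission to write), and no reference to the target is used while the raw pointers are.

```rust
// Hypothetical helper for illustration only.
fn increment_twice_via_raw(value: &mut i32) {
    // Derived from an exclusive borrow, so writing through it is permitted.
    let p: *mut i32 = value;
    // Raw pointers are Copy; both copies carry the same permission to write.
    let q = p;
    unsafe {
        // No reference to the target is used between these writes, and
        // everything happens on one thread, so none of the conflicts
        // from the list above occur.
        *p += 1;
        *q += 1;
    }
}

fn main() {
    let mut x = 1;
    increment_twice_via_raw(&mut x);
    println!("{}", x);
}
```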

Especially the last point is a good example of requirements that don't involve any usage of references at all; data races in Rust are UB, and if you want unsynchronized mutable access to a value from multiple threads, you need to use atomic operations, e. g. via the various Atomic… types in the standard library.
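A small sketch of the atomic route, assuming a simple shared counter (the function name count_in_parallel and its parameters are invented for this example). Several threads update the same value through a shared reference, and because every access is an atomic operation there is no data race.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical helper: spawns `threads` scoped threads that each add
// `per_thread` to a shared counter.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = AtomicUsize::new(0);
    // thread::scope joins all spawned threads before it returns, so the
    // threads may borrow `counter` from this stack frame.
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    // Atomic read-modify-write: safe to call from many
                    // threads through a shared reference.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```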

The above does not, for example, address the pattern of “splitting borrows”: if a pointer refers to a larger struct, an array, etc., then conflicting access can only occur if the same (or overlapping) parts of the struct or entries of the array are accessed in one of the conflicting manners listed above.
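Splitting borrows is also available in safe Rust, which may be the easiest way to see the idea. A minimal sketch follows; bump_both_halves, Point and swap_fields are names made up for this example, while split_at_mut is the standard slice method.

```rust
// Hypothetical helper: two exclusive borrows into the same slice are fine
// because split_at_mut guarantees the halves do not overlap.
fn bump_both_halves(data: &mut [i32]) {
    let mid = data.len() / 2;
    let (left, right) = data.split_at_mut(mid);
    for x in left.iter_mut() {
        *x += 1;
    }
    for x in right.iter_mut() {
        *x *= 2;
    }
}

struct Point {
    x: i32,
    y: i32,
}

// The borrow checker tracks struct fields separately, so two &mut to
// different fields of the same struct can be alive at once.
fn swap_fields(p: &mut Point) {
    let (x, y) = (&mut p.x, &mut p.y);
    std::mem::swap(x, y);
}

fn main() {
    let mut v = [1, 2, 3, 4];
    bump_both_halves(&mut v);
    println!("{:?}", v);

    let mut p = Point { x: 1, y: 2 };
    swap_fields(&mut p);
    println!("{} {}", p.x, p.y);
}
```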

(Also, as always with such listings off the top of my head, I might have possibly forgotten cases or not explained relevant details.)


The Rust community has culturally rejected "if it compiles ok it's good", and the teams have expressed that they're against creating language dialects by providing ways to change the exclusivity guarantee.


That's actually pretty interesting about the pointer creation rules, thanks! I guess I haven't used a raw pointer outside of my various smart-pointers so got that mixed up with using them naturally.

Yeah, I've been delving around those structs a little bit, including Rc, Arc, Cow (great name) etc., trying to find something that fits what I needed to do. I was tinkering around with using the above to selectively dis-apply certain compiler rules where needed but wasn't convinced that it was safe enough, even if I saw a pathway to it maybe being OK. At least that got answered, though I might delve around the compiler a little to see if there's a technical reason due to how it views the world or if it's just a conceptual/philosophical decision.

I'll have a looksee at OnceCell and OnceLock too actually, even if they may not be usable yet. Might be neat.

They are in the process of being stabilized, due to be part of the release 1.70 that comes out in about 17 days.

You can also check out the popular crate these were adapted from, once_cell. OnceLock is called sync::OnceCell there, and the Lazy types are also sometimes very useful, in particular for initializing static variables at run-time.
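For reference, a minimal sketch of run-time initialization of a static using the standard library's OnceLock (this needs Rust 1.70 or later). The static name CONFIG, the function config and the string value are placeholders invented for the example.

```rust
use std::sync::OnceLock;

// Placeholder static: starts empty and is filled in on first use.
static CONFIG: OnceLock<String> = OnceLock::new();

// Hypothetical accessor: the closure runs at most once, even if several
// threads call config() at the same time; later calls return the stored value.
fn config() -> &'static str {
    CONFIG
        .get_or_init(|| String::from("default-config"))
        .as_str()
}

fn main() {
    println!("{}", config());
    println!("{}", config());
}
```

With the once_cell crate the same pattern would use its sync::OnceCell type instead, as noted above.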

There are both technical and political reasons. You may not believe that, but the whole story goes as far back as to 1988 (and, most likely, earlier).

Compiler writers have always wanted a guarantee that the reference they observe is the one and only active reference pointing to a particular place. They invented many things like noalias (which was shot down as being unusable), then restrict, pointer provenance and many, many, MANY other things to [try to] get that property.

Now, when Rust has finally found a way to hand them that “holy grail” on a silver platter… do you think they wouldn't use it?

In a sense, you were asking whether it's OK to pull out a single brick which underpins the whole 100-story construct… is it really a big deal… it's just such a tiny thing…

But, well, we are not talking about some simple rule here, but about the culmination of decades of research… in some sense the whole point of Rust is that it's a language where the thing you were trying to do is impossible (albeit in normal, “safe” Rust the compiler ensures this doesn't happen, while in unsafe Rust it's the responsibility of the software developer).

Of course its violation wouldn't be taken lightly (a cautionary tale which made me actually start playing with Rust in real projects… ironically enough, it was precisely the cavalier attitude of Nikolay Kim to things like these that prompted the community to kick him out). Some other rules may be debatable, but this… nope. Not open to debate.


Related fun fact, since normally you don't have such a guarantee in C++:

Of course Rust solved this problem. Voilà!


And there's a subtle exception-safety issue in that screenshot. If condition( i ) or process( i ) throws an exception, then outValues is left empty, and everything it contained is destroyed. Doing C++ well is surprisingly hard…

