Isn't a pointer cast just a more dangerous transmute?

Consider the following program:

use std::num::NonZeroI32;

fn main() {
    let mut x = NonZeroI32::new(5).unwrap();
    let p = &mut x as *mut NonZeroI32;

    let wrong_ptr = p as *mut [u8; 4];

    // Safety:
    // - `wrong_ptr` is valid for writes
    // - `wrong_ptr` is properly aligned
    unsafe { std::ptr::write(wrong_ptr, [0; 4]) };
    // ^-- This line is UB that is impossible to detect with local reasoning

    println!("{}", x);
}

The safety documentation for std::ptr::write reads

Behavior is undefined if any of the following conditions are violated:

  • dst must be valid for writes.
  • dst must be properly aligned. Use write_unaligned if this is not the case.

Note that even if T has size 0, the pointer must be non-NULL and properly aligned.

None of these conditions is violated in this program, yet UB is introduced, because x now holds a zero value.

I understand that pointer casts cannot cause any problems without unsafe code, but to me they seem like a delayed transmute that happens to be possible in safe code, even though transmutes are "the most horribly unsafe thing you can do in Rust".

Am I missing something? Is there a way to detect the UB in the above program given the local reasoning from the safety documentation?

I guess the problem in this program is that wrong_ptr and x alias each other but disagree about which constraints the pointed-to data must satisfy.
Maybe something like that should be added to the safety documentation of std::ptr::write?

I mean, you can transmute it instead of casting it and have the same issue. Transmute is considered more dangerous because it doesn't verify that you are actually transmuting between two pointer types.

Pointers have inherently nonlocal properties.

In this case, the pointer is to a memory location typed at NonZeroI32. This means that the memory target of the pointer cannot store the bit value 0x0000_0000.

This "extra information" attached to the pointer is folded into the "provenance" of the pointer. The actual act of writing 0 through the pointer is not UB itself, it's the writing 0 to a memory location typed at NonZeroI32 that is the UB.

For example, if the memory location were Option<NonZeroI32>, it would be perfectly legal to write 0 into a pointer to that memory slot, even though nothing is different about the pointer's bits. (Whether writing 0 to the NonZeroI32 from inside the option is UB is an open question; I say UB, and that's the safer assumption ofc.)
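To make the Option<NonZeroI32> case concrete, here is a minimal sketch. It relies on the documented guarantee that all-zero bytes are a valid Option<NonZeroI32> and represent None. The helper name zero_option_slot is made up for illustration.

```rust
use std::num::NonZeroI32;

// Hypothetical helper: overwrites an `Option<NonZeroI32>` slot with zero bytes
// through a cast pointer and returns what the slot holds afterwards.
fn zero_option_slot() -> Option<NonZeroI32> {
    let mut slot: Option<NonZeroI32> = NonZeroI32::new(5);
    let p = &mut slot as *mut Option<NonZeroI32> as *mut [u8; 4];

    // Safety:
    // - `p` is valid for writes of 4 bytes, and `[u8; 4]` has alignment 1
    // - all-zero bytes are a valid `Option<NonZeroI32>` (they encode `None`)
    unsafe { std::ptr::write(p, [0; 4]) };

    slot
}

fn main() {
    // Same cast and same write as in the original program, but the target
    // type allows the zero bit pattern, so the slot simply becomes `None`.
    assert_eq!(zero_option_slot(), None);
}
```

The pointer bits and the write are the same as in the original program; only the type of the memory location differs.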

Also of note is that this is only instant UB because the written memory is a typed stack slot. If it were a heap memory location, then it's only UB once the value is accessed at a type the bits are no longer valid for*.

* Well, it's an open question whether references have a validity requirement for their target pointee to also be valid, but the UWG is leaning towards it being a safety invariant on references (that's active even on things like a typed copy of the reference).

Indeed, but transmute is an unsafe function advertised as "incredibly unsafe", while a pointer cast is a supposedly safe operation.

How would someone who just looked at the safety documentation of the only unsafe function used in this example find this out then?
Should there be a "dst must not have non-local properties which are being violated by this write" condition in the safety documentation of std::ptr::write and similar functions?

"Producing" is the doc section you need to refer to.

Given that you're using a literal, it is theoretically possible for a compiler to deduce the UB.

I thought MaybeUninit was (or is, or may be; pick a verb) still required for heap memory as well. It is certainly better to use it, to be on the safe side.

This doc section states that

"Producing" a value[1] happens any time a value[2] is assigned, passed to a function/primitive operation or returned from a function/primitive operation.

I've marked the two mentions of "a value" with numbers, because it's not clear whether they refer to the same value.

In this example (assuming the println statement didn't exist) x is not assigned, passed to a function/primitive operation or returned from a function/primitive operation after the line that caused the UB, so [2] cannot refer to x.

Does that mean that

  • this code would be safe, if the println statement didn't exist? What about a NonZeroI32Wrapper that implements Drop and uses the value in its Drop implementation then?
  • the "producing" a value does refer to [2], but it can "time-travel" back before the UB occurred? Would that mean that the UB is technically the original assignment to x and not the std::ptr::write call?
  • [1] and [2] may refer to different values? So technically any time any value is assigned, passed to a function/primitive or returned from a function/primitive operation every other value that exists in the program must be valid or else it would be UB?

Which of these is the actual cause of the UB in this case? I'd suspect it'd be the third option, also because of what @CAD97 wrote about the nonlocal properties of pointers.

The code would still be UB even if you didn't read from the nonzero integer after assigning it zero.

Yes, UB can time travel. As long as it is guaranteed that your program will eventually invoke UB, miscompilation can cause issues before the UB "happens".

And yes, all values must be valid at all times.

The cause of UB is in this case the ptr::write call.

The pointer cast is absolutely a safe operation. It's the later pointer dereference that is also incredibly unsafe -- 123 as *const usize is also safe code.

Pointer casts when combined with ptr::read/write or deref are capable of doing everything that transmute can do, so in this sense they are as powerful (and as dangerous) as transmute. For example, you can implement transmute (except for the static size check) using a pointer cast plus a read:

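use std::{mem, ptr};

unsafe fn transmute<T, U>(t: T) -> U {
    // Reinterpret the address of `t` as a pointer to `U`; the cast itself is safe.
    let p = &t as *const T as *const U;
    // Skip `t`'s destructor so the value is not dropped twice.
    mem::forget(t);
    // The only operation here that requires `unsafe`.
    ptr::read(p)
}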

However, this is really a consequence of the basic property of raw pointers: The compiler does not check or guarantee their validity in any way, and therefore anything you do with them is incredibly unsafe. For example, raw pointers also allow you to implement transmute without any pointer casts!

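use std::mem::{self, MaybeUninit};
use std::ptr;

#[repr(u8)]
enum Either<T, U> {
    T(T),
    U(MaybeUninit<U>),
}

unsafe fn transmute2<T, U>(t: T) -> U {
    let mut e = Either::U(MaybeUninit::uninit());
    // Obtain a pointer to the payload of the `U` variant.
    let p = match &e {
        Either::T(_) => unreachable!(),
        Either::U(u) => u.as_ptr(),
    };
    // Overwrite the enum with the `T` variant; `p` still points at the payload.
    e = Either::T(t);
    mem::forget(e);
    ptr::read(p)
}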

Note: In both cases, ptr::read is the only line that requires unsafe. Everything else can be done in safe Rust.

[Playground with all the code from this post.]

I just noticed that my transmute2 function has undefined behavior according to MIRI, because the “Stacked Borrows” model says that mutating e invalidates the pointer p. Now I’m not sure if there’s any supported way to transmute using raw pointers without casting.

(On nightly, you can implement transmute using the experimental untagged_unions feature, but this doesn’t require any raw pointers.)
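For reference, a union-based version might look roughly like the sketch below. It assumes a toolchain where unions with ManuallyDrop fields compile without a feature gate, which I believe is the case on current stable. The names Transmuter and transmute_via_union are made up for illustration, and the size assertion is my addition, not something a union enforces.

```rust
use std::mem::{self, ManuallyDrop};

// Hypothetical helper union: both fields share the same storage.
union Transmuter<T, U> {
    from: ManuallyDrop<T>,
    to: ManuallyDrop<U>,
}

// Safety: the caller must guarantee that the bytes of `t` form a valid `U`.
unsafe fn transmute_via_union<T, U>(t: T) -> U {
    // A union performs no size check, so add one by hand.
    assert_eq!(mem::size_of::<T>(), mem::size_of::<U>());
    let u = Transmuter::<T, U> { from: ManuallyDrop::new(t) };
    // Reading the other field reinterprets the stored bytes as `U`.
    ManuallyDrop::into_inner(unsafe { u.to })
}

fn main() {
    let bytes: [u8; 4] = unsafe { transmute_via_union(0xdead_beef_u32) };
    assert_eq!(bytes, 0xdead_beef_u32.to_ne_bytes());
}
```

No raw pointer appears anywhere; the only unsafe operation is the union field read.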

I guess what confuses/worries me is that the transmute documentation states it's "incredibly unsafe" and should be "the absolute last resort", whereas the same things are possible with ptr::read/write (potentially even introducing non-locality, where transmute would have local effects) and the documentation of these is fairly short and benign in its wording.

At a guess, this difference is more a reflection of how easy the technique is for beginners to find: most people looking at ptr methods are trying to do things like FFI or lifetime hacking rather than byte pattern reinterpretation.

This is a case where there's a disagreement between informal interpretation of unsafe {} as "tricky code, pay attention!" and Rust's formal definition of unsafety focused on UB and UB-causing loopholes.

So while pointer casts aren't unsafe by themselves to the compiler, they absolutely are a place in the code where mistakes can lead to safety issues.

Yeah, you'd need to use / exploit the uniqueness in order to derive a mutable pointer:

unsafe fn transmute<T, U>(#[unique_aware] t: T) -> U {
    let p = &uniq t as *mut T as *const U;
    mem::forget(t);
    ptr::read(p)
}

i.e.,

unsafe fn transmute<T, U>(mut t: T) -> U {
    let p = &mut t as *mut T as *const U;
    mem::forget(t);
    ptr::read(p)
}

Also, to nitpick w.r.t. soundness, mem::forget can lead to Stacked-Borrows-model UB, so it is better to try and always replace forget usage with ManuallyDrop:

unsafe fn transmute<T, U>(t: T) -> U {
    let mut t = ::core::mem::ManuallyDrop::new(t);
    let p: *const U = &mut *t as *mut T as _;
    p.read()
}

// or one-liner: <*const U>::read(&mut *ManuallyDrop::new(t) as *mut _ as _)

And back to the topic at hand with this manually hand-rolled transmute (or one using unions): indeed, casts and unions can be more error-prone than direct usage of transmute, since the latter requires that T and U be equal-sized, which neither "as" pointer casts nor unions require.
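The missing size check can be put back by hand. The following is only a sketch: transmute_checked is a made-up name, the check happens at run time rather than at compile time as with the real transmute, and I use read_unaligned on the assumption that U may have a stricter alignment than T.

```rust
use std::mem::{self, ManuallyDrop};

// Safety: the caller must guarantee that the bytes of `t` form a valid `U`.
unsafe fn transmute_checked<T, U>(t: T) -> U {
    // Run-time stand-in for the static size check that `mem::transmute` performs.
    assert_eq!(
        mem::size_of::<T>(),
        mem::size_of::<U>(),
        "cannot transmute between types of different sizes"
    );
    // `ManuallyDrop` prevents `t` from being dropped after its bytes are copied out.
    let t = ManuallyDrop::new(t);
    let p = &*t as *const T as *const U;
    // `T`'s alignment may be smaller than `U`'s, so do not assume `p` is aligned for `U`.
    unsafe { p.read_unaligned() }
}

fn main() {
    let n: u32 = unsafe { transmute_checked([1u8, 0, 0, 0]) };
    assert_eq!(n, u32::from_ne_bytes([1, 0, 0, 0]));
}
```

With the assertion in place, a call such as transmute_checked::<u8, u32> panics instead of reading past the end of the source value.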

But it's not the pointer cast that causes the UB here, either. It's the act of writing an invalid value. The pointer cast in itself is still memory-safe, but you are subsequently using the pointed value incorrectly – that's not the fault of the pointer cast.
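As a small illustration, the same cast followed by a write of non-zero bytes is fine, because the resulting bit pattern is still a valid NonZeroI32. The helper overwrite_with is hypothetical, and its assertion is only there to keep the sketch away from the invalid case.

```rust
use std::num::NonZeroI32;

// Hypothetical helper: writes `bytes` over a `NonZeroI32` through a cast pointer.
fn overwrite_with(bytes: [u8; 4]) -> i32 {
    // Refuse the one bit pattern that is invalid for `NonZeroI32`.
    assert_ne!(bytes, [0; 4], "all-zero bytes would be an invalid NonZeroI32");

    let mut x = NonZeroI32::new(5).unwrap();
    let p = &mut x as *mut NonZeroI32 as *mut [u8; 4];

    // Safety:
    // - `p` is valid for writes and properly aligned for `[u8; 4]`
    // - `bytes` is not all zero, so `x` still holds a valid value afterwards
    unsafe { std::ptr::write(p, bytes) };

    x.get()
}

fn main() {
    assert_eq!(overwrite_with([7, 0, 0, 0]), i32::from_ne_bytes([7, 0, 0, 0]));
}
```

The cast is identical to the one in the original program; only the written value decides whether the write is UB.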

I am well aware of that. Perhaps the word "supposedly" was not well chosen.

What I meant was that in pretty much any real world program where a pointer cast is performed, that is done with the intention of actually using the pointer.
And the pointer cast is what enables misuse of the pointer in most situations, arguably in potentially worse ways than a transmute, because of the non-locality that is introduced with pointer casts.

But yes, theoretically a pointer cast is completely sound and can never cause UB by itself.

Pointer casts don't exist at the binary level, because there are no types there. Type safety is a property of programming languages, and casting a value to another type is simply a way to make the compiler believe that the value it's looking at is something different. The only way to cause undefined behavior is by interacting with a value in a way that it doesn't support. Due to aggressive compiler optimization, the fact that you don't explicitly interact with a value doesn't mean that the compiler won't itself insert code that interacts with the value and causes UB. Having 2 or more mutable borrows (&mut) to the same value is one such example; it is instantly UB due to potential¹ compiler optimizations.

¹ AFAIK, LLVM doesn't perform any magic regarding &mut, yet, but it is supposed to be doing that in the future, i.e. at the moment you would most likely not notice your program behaving in a way you don't expect, but compiling the same program in the future with a newer underlying version of LLVM may suddenly cause a different behavior than it did previously.

I'd argue that what happens at the binary level is fairly irrelevant in this discussion, as "What The Hardware Does" is not What Your Program Does.

The compiler will never insert code that causes UB without some other UB existing somewhere else.
UB is what enables these aggressive optimizations in the first place.

So a pointer cast cannot cause UB by itself, because the compiler will never randomly insert uses of that pointer.

The trick with "with the intention" is that it can make almost everything dangerous. It would be possible to argue, for example, that it's the subtraction that introduces the problem in something like a.get_unchecked(i-1), but we also wouldn't use that to say that subtraction needs to be unsafe.
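A rough sketch of that comparison, with made-up helper names prev_unchecked and prev_checked: the subtraction is ordinary safe code in both versions, and only get_unchecked carries a safety contract.

```rust
// Safety: the caller must guarantee `1 <= i` and `i - 1 < a.len()`.
unsafe fn prev_unchecked(a: &[i32], i: usize) -> i32 {
    // `i - 1` is safe arithmetic; the unchecked index is the unsafe part.
    unsafe { *a.get_unchecked(i - 1) }
}

// Fully safe variant: both the subtraction and the index are checked.
fn prev_checked(a: &[i32], i: usize) -> Option<i32> {
    let j = i.checked_sub(1)?;
    a.get(j).copied()
}

fn main() {
    let a = [10, 20, 30];
    assert_eq!(prev_checked(&a, 2), Some(20));
    assert_eq!(prev_checked(&a, 0), None);
    // Safety: `i = 1`, so `i - 1 = 0` is in bounds for a slice of length 3.
    assert_eq!(unsafe { prev_unchecked(&a, 1) }, 10);
}
```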

Having 2+ &mut pointing to the same memory region causes UB, regardless of whether you dereference them, because the compiler operates on the assumption that no one else will be observing the value, i.e. for the duration of the lifetime of the mutable borrow, the compiler can do whatever it wants to do with the value. This is what I meant with "[…], just because you don't explicitly interact with a value […]".

Sorry for not having elaborated on this, which caused the misunderstanding. We don't disagree on UB. I'm just looking at it from a different perspective.
