Can someone explain this miri error: encountered a pointer but expected plain (non-pointer) bytes

I'm trying to understand some of the rules around transmute but I'm seeing stuff that seems contradictory. Here's an example program that I'd like to be able to write soundly, but currently fails Miri:

use std::mem::size_of;
use std::mem::transmute;

fn main() {
    let pointer: Box<i32> = Box::new(123);
    let p_bytes: [u8; size_of::<Box<i32>>()] = unsafe { transmute(pointer) };
    let pointer_again: Box<i32> = unsafe { transmute(p_bytes) };

    assert_eq!(*pointer_again, 123);
}

Running this with miri gives the following:

error: Undefined Behavior: type validation failed: encountered a dangling box (address 0x28ca8 is unallocated)
 --> src/main.rs:7:44
  |
7 |     let pointer_again: Box<i32> = unsafe { transmute(p_bytes) };
  |                                            ^^^^^^^^^^^^^^^^^^ type validation failed: encountered a dangling box (address 0x28ca8 is unallocated)
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information

Unfortunately the link doesn't help much with this question.

The general question I have is: "why is this UB?" Googling this error message hasn't been particularly successful. The results generally suggest that this might be an issue with pointer provenance, but I was under the impression that pointer provenance rules currently aren't enforced by Rust. Is that correct? Or is Miri being overly cautious here? Is there an actual risk of something going wrong?

This is confusing me because the docs of transmute seem to have a few places where pointer-like types are transmuted (e.g. Vec, &str, &[u8]).

Moreover, if I make p_bytes have type usize, I get a different error (for a dangling box when doing the second transmute), which to me implies that the first error was fixed. But this is also confusing. Does passing pointer into transmute run its Drop implementation when transmute ends (since it passes ownership)?

Thanks in advance :grin:

Transmuting pointers is a red flag. It's almost never sound. Transmutability does not carry over through indirection: if a transmute between T and U is valid, that doesn't imply that it's valid between &T and &U or Box<T> and Box<U> or anything like that. This is because transmutation deals with values, so when you transmute pointers, you are converting between the values of those pointers themselves, without regard to what they point to.

I'm curious as to what purpose this serves. I suspect what you are actually trying to do here is transmute the pointed value (i32) to another type ([u8; 4]), but do it indirectly instead of by-value. For that, you can just cast the pointers instead of transmuting, which will preserve provenance.
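To illustrate the pointer-cast approach, here is a minimal sketch that views an i32 as its four bytes without transmuting any pointer. The helper name `i32_as_bytes` is made up for this example and is not from the original code.

```rust
/// Hypothetical helper: borrow an `i32` as its raw bytes via a pointer cast.
fn i32_as_bytes(value: &i32) -> &[u8; 4] {
    // Cast the pointer, not the pointee. The cast keeps the provenance
    // of the original reference.
    let ptr = (value as *const i32).cast::<[u8; 4]>();
    // SAFETY: `[u8; 4]` has the same size as `i32`, an alignment of 1,
    // and no invalid bit patterns. The returned borrow is tied to `value`
    // by the function signature.
    unsafe { &*ptr }
}
```

Calling `i32_as_bytes(&boxed)` on a `Box<i32>` works through deref coercion, and the bytes come out in native endianness, the same as `i32::to_ne_bytes`.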

"enforcing" constraints doesn't mean that you get a runtime error. Miri is merely a tool that tries to make detection of UB more user-friendly. If you violate provenance rules, it's still UB even outside of Miri, it's just perhaps less apparent.

No. Drop doesn't run when you "pass ownership", it runs when the lifetime of a value ends. Transmute gives you back the same value that you passed, so it gives up ownership immediately after acquiring it.
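A small sketch that should make this observable: a type with a Drop impl that bumps a counter, transmuted by value into a plain integer. The names `Noisy`, `DROPS`, and `drops_after_transmute` are invented for this illustration.

```rust
use std::mem::transmute;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times `Noisy::drop` has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// `repr(transparent)` guarantees the same layout as `u32`,
// so the by-value transmute below is sound.
#[repr(transparent)]
struct Noisy(u32);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Transmutes a `Noisy` into a `u32` and reports how many drops happened.
fn drops_after_transmute() -> usize {
    let value = Noisy(7);
    // `value` is moved into `transmute`; its destructor is not run.
    let raw: u32 = unsafe { transmute::<Noisy, u32>(value) };
    assert_eq!(raw, 7);
    DROPS.load(Ordering::SeqCst)
}
```

The counter stays at zero because the source value is moved into transmute and reinterpreted, never dropped. In the Box case this means the allocation is not freed by the first transmute.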

The transmute documentation also gets pretty quickly to the several "don't do this" parts, with an illustration of the various creative ways of getting UB via pointer transmutation.

In particular, for Vec, it specifically says:

// Using transmute: this relies on the unspecified data layout of `Vec`, which is a
// bad idea and could cause Undefined Behavior.
// However, it is no-copy.
let v_transmuted = unsafe {
    std::mem::transmute::<Vec<&i32>, Vec<Option<&i32>>>(v_clone)
};

So don't just skim through the "interesting" parts of the code if you are trying to scrutinize the memory model. Read the prose too, because interpreting a "don't do this" example as a "you can do this" example is counter-productive.

I'm curious as to what purpose this serves

The actual code is serializing a struct to a byte slice, to conform an API that only accepts byte slices. This was seen as a potential performance optimization to only serialize the pointer, and reconstruct in the trait implementation. The actual struct has some small fields, then a Box<[u8]> that contains quite a lot of data, so the original thinking was to avoid serializing the contents of the box and instead just the pointer.

In particular, for Vec, it specifically says

Yep, the vec example was bad, but there are places where it seems to suggest transmuting pointers is OK, specifically converting function pointers to and from *const () (which miri doesn't complain about).
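For reference, the function-pointer case looks roughly like the sketch below, modeled on the pattern in the transmute docs. The function names `answer` and `round_trip_fn_pointer` are placeholders for this example.

```rust
use std::mem::transmute;

fn answer() -> i32 {
    42
}

/// Erases a function pointer to `*const ()` and restores it again.
fn round_trip_fn_pointer() -> i32 {
    // Going from a function to a raw pointer is a plain `as` cast.
    let erased = answer as *const ();
    // SAFETY: `erased` was produced from a function with exactly the
    // signature `fn() -> i32`, so turning it back is valid.
    let restored = unsafe { transmute::<*const (), fn() -> i32>(erased) };
    restored()
}
```

Note that this converts between two pointer-like values of the same size. It does not pass a pointer through plain integer bytes, which is what the Box example does.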

"enforcing" constraints doesn't mean that you get a runtime error

This was bad wording on my part. My understanding of pointer provenance was that violating the rules wasn't UB, but just made optimizations harder. Is that wrong?

Thanks for the detailed reply

I believe it is wrong indeed.

Typically, it is not the case that the compiler can actively check for these rules and then just not optimize if it believes they are being violated – exactly because these constraints are impossible to verify at compile time in the fully general case. (This is why we need Miri for detecting them at all.)

Consequently, all the compiler can do instead is assume you uphold these guarantees, and then optimize in good faith unconditionally. Therefore, if you violate such contracts, you will automatically get UB pretty much by definition, as you just violated a rule that was assumed to be upheld.

Right, so it is exactly what I suspected: you are not trying to transmute the pointer itself – what you need are the contents of the pointed value as a byte slice. Therefore, you should simply cast (using as or .cast::<_>()) the pointer. Better yet, use the bytemuck crate which does it in a more principled manner, encapsulating unsafety.

Got it, that makes sense.

Thanks for the explanation :grin:

I believe the pointer provenance restrictions are only weakened if you do the pointer<->usize casts explicitly with an as; transmutes don't count. I.e. if you start with Box::into_raw(pointer) as usize, then you may transmute that usize into a u8-array and back to usize, so you can end in Box::from_raw(… as *mut i32) and make both pointer<->usize casts explicit.
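A rough sketch of that round trip, with both pointer<->usize conversions written as explicit `as` casts. The helper names `box_to_bytes` and `box_from_bytes` are invented here, and `to_ne_bytes`/`from_ne_bytes` stand in for the usize<->array transmute. Whether integer-to-pointer casts like this are acceptable in a given codebase is an assumption to check separately; Miri warns about them by default and rejects them under its strict provenance mode.

```rust
use std::mem::size_of;

/// Hypothetical helper: turn a `Box<i32>` into the bytes of its address.
/// The allocation is leaked until `box_from_bytes` reclaims it.
fn box_to_bytes(b: Box<i32>) -> [u8; size_of::<usize>()] {
    // Explicit pointer-to-integer cast.
    let addr = Box::into_raw(b) as usize;
    addr.to_ne_bytes()
}

/// Hypothetical helper: rebuild the `Box<i32>` from the address bytes.
///
/// # Safety
/// `bytes` must come from `box_to_bytes`, and each result of
/// `box_to_bytes` may be passed here at most once.
unsafe fn box_from_bytes(bytes: [u8; size_of::<usize>()]) -> Box<i32> {
    let addr = usize::from_ne_bytes(bytes);
    // Explicit integer-to-pointer cast.
    unsafe { Box::from_raw(addr as *mut i32) }
}
```

If the bytes are lost or never converted back, the boxed value is leaked, so the two helpers are only meaningful as a pair within one process.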

No. It's not correct. That's what I got from one LLVM developer when we discussed the sad lack of a -fno-provenance flag in clang:

Rust compiler, in some cases, explicitly forces LLVM to not care about provenance rules, but since LLVM violently believes in it… rustc definitely couldn't turn it off.

And pointer-integer casts are precisely the ones that you shouldn't do. Well… more precisely: these are things which are still not fully specified by either C++ or Rust (in C++ it's slightly worse than in Rust: while the Rust Reference says that it's something not yet specified, the C++ standard actually does specify everything… only compilers don't follow the standard in that place).

It's… complicated. Let me summarize the situation as it is now.

  1. Pointer provenance is not a property defined by the C/C++ standards.
  2. Compilers are explicitly allowed to use pointer provenance for optimizations.
  3. LLVM (and GCC) violently believe in provenance.
  4. The fact that #3 breaks valid C/C++ is “not a problem”, since there is a proposal which would make these valid programs invalid if and when it is approved. The fact that the proposal is not yet approved (after two decades of development!) doesn't stop the compiler writers.
  5. The Rust compiler is forced to deal with all that mess… somehow, because we only have one production-quality compiler and it's LLVM-based (a GCC-based one wouldn't save anyone, since GCC also violently believes in provenance).

That's, basically, where we are today.
