Problematic integer-to-pointer transmute

Rust doc at https://doc.rust-lang.org/std/primitive.fn.html#casting-to-and-from-integers mentions:

Crucially, we as-cast to a raw pointer before transmuting to a function pointer. This avoids an integer-to-pointer transmute, which can be problematic. Transmuting between raw pointers and function pointers (i.e., two pointer types) is fine.
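For reference, the pattern the quoted passage describes might look roughly like the following. This is a minimal sketch rather than the exact code from the docs: `add_one` and `round_trip` are hypothetical names made up for illustration, and it assumes a platform where function pointers and data pointers have the same size.

```rust
use std::mem;

// Hypothetical example function; any `fn(i32) -> i32` would do.
fn add_one(x: i32) -> i32 {
    x + 1
}

/// Sends a function pointer through an integer address and back.
fn round_trip(f: fn(i32) -> i32) -> fn(i32) -> i32 {
    // fn pointer -> raw pointer -> integer, each step an `as` cast.
    let addr = f as *const () as usize;
    // integer -> raw pointer, again with `as` (not `transmute`).
    let raw = addr as *const ();
    // raw pointer -> fn pointer: both sides are pointer types, so this
    // is the transmute the docs describe as fine.
    unsafe { mem::transmute::<*const (), fn(i32) -> i32>(raw) }
}

fn main() {
    let f = round_trip(add_one);
    println!("{}", f(41)); // prints 42
}
```

The only `transmute` here is between two pointer types; the integer never gets transmuted directly.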

I would like to know what could possibly go wrong in an integer-to-pointer transmute, and why an intermediate as-cast solves the issue.

as casts are the compiler-supported way to do this conversion, which is why they work. For optimization purposes, the compiler wants to know every place that could possibly gain access to the target of your pointer. If you hide that information behind a transmute, this analysis can go wrong, which is why you aren't allowed to do it that way in the first place.

As for why pointers are more complicated to reason about (for programmers as well as for compilers) than simple integers, I think this article would be a good read, and this follow-up as well. And this third part, too, perhaps. They do cover pointer-to-integer casts, too, IIRC.

As for why it's possible to turn pointers into integers anyway via as casts (as long as you do it in this compiler-supported way): as far as I know, this can be thought of as releasing the pointer into some global pool of pointers that you are allowed to access through any integer cast back into a pointer (also via as). This comparison is not to be understood as an implementation strategy but as a metaphor for how the compiler can then analyze who can access this pointer. The answer, after you cast it into an integer, is basically anyone. As mentioned above, that is relevant information for optimization purposes: more code being known to potentially have access to the pointer target means less room for optimization based on local reasoning, at least as soon as any calls to unknown code are involved.
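To make the "pool" metaphor a bit more concrete, here is a small sketch of the as-only round trip. The function name `read_via_address` is made up for illustration; the comments describe the metaphor from the paragraph above, not what the compiler literally does.

```rust
/// Reads an `i32` through a pointer that was rebuilt from its integer address.
fn read_via_address(value: &i32) -> i32 {
    let ptr = value as *const i32;
    // Pointer -> integer via `as`: in the metaphor, this "releases"
    // the pointer into the global pool.
    let addr = ptr as usize;
    // Integer -> pointer via `as`: this picks a pointer back up from
    // that pool, so the compiler has to assume it may alias `value`.
    let rebuilt = addr as *const i32;
    // Sound here because `value` is still live and borrowed for reading.
    unsafe { *rebuilt }
}

fn main() {
    let x = 7;
    println!("{}", read_via_address(&x)); // prints 7
}
```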

The TL;DR without giving an explanation for "why" would simply be: because the language defines transmuting an integer into a pointer and then dereferencing the result as undefined behavior (with the possible exception of zero-sized types, which may be fine).

It should also be straightforward to observe this by running some test code examples through Miri.
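A test program along these lines might be a starting point. It is deliberately unsound and only meant to be run under Miri (for example with `cargo miri run`); `address_of` and `read_through_transmute` are hypothetical names, and I would expect Miri to flag the dereference, though I have not pinned down the exact diagnostic.

```rust
use std::mem;

/// Returns the address of `value` as an integer, using `as` casts.
fn address_of(value: &i32) -> usize {
    value as *const i32 as usize
}

/// Deliberately unsound: rebuilds the pointer with `transmute`
/// instead of an `as` cast.
unsafe fn read_through_transmute(addr: usize) -> i32 {
    // Integer-to-pointer transmute: the compiler is not told which
    // allocation this pointer is supposed to refer to.
    let ptr: *const i32 = unsafe { mem::transmute(addr) };
    // Dereferencing the result is the step described above as
    // undefined behavior.
    unsafe { *ptr }
}

fn main() {
    let x = 7;
    let addr = address_of(&x);
    // Expected to be rejected by Miri; do not rely on whatever a
    // native build happens to print.
    let y = unsafe { read_through_transmute(addr) };
    println!("{y}");
}
```

Replacing the `transmute` with `addr as *const i32` gives the compiler-supported version of the same round trip.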
