Best way to cast Option<&T> to *const T

I was looking for a convenient way to cast Option<&T> to *const T (same for &mut). I thought there would be a shortcut for this in std, but I didn't find anything.

The simplest thing I came up with is this, which has its drawbacks:

fn f(x: *const i32) { /***/ }
let x: Option<&i32> = Some(&12);
f(x.map_or(std::ptr::null(), |x| x));
//                   ^^^^    ^^^^^ raises questions "why?"
//                   requires import or long path

For clarity, this can be written like this:

f(x.map_or(std::ptr::null(), |x| x as *const i32));

But in the end, casting is quite tedious. Is there any easier way to do this?

1 Like

Well, std::ptr::from_ref would help, but it's not stable yet.

That'd make it x.map_or(ptr::null(), ptr::from_ref).
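A minimal sketch of that one-liner, covering both the shared and the mutable case. The helper names opt_ptr and opt_ptr_mut are made up for illustration, and the sketch assumes a toolchain where ptr::from_ref and ptr::from_mut are available (they were still unstable when this reply was written).

```rust
use std::ptr;

/// Hypothetical helper: `None` becomes a null pointer,
/// `Some(r)` becomes the address of `r`.
fn opt_ptr<T>(x: Option<&T>) -> *const T {
    x.map_or(ptr::null(), ptr::from_ref)
}

/// Hypothetical mutable counterpart.
fn opt_ptr_mut<T>(x: Option<&mut T>) -> *mut T {
    x.map_or(ptr::null_mut(), ptr::from_mut)
}

fn main() {
    let value = 12;
    let p = opt_ptr(Some(&value));
    assert!(ptr::eq(p, &value));
    assert!(opt_ptr::<i32>(None).is_null());

    let mut other = 7;
    let q = opt_ptr_mut(Some(&mut other));
    // SAFETY: `q` was just derived from a live `&mut i32` and nothing
    // else accesses `other` while we write through it.
    unsafe { *q += 1 };
    assert_eq!(other, 8);
    assert!(opt_ptr_mut::<i32>(None).is_null());
}
```

Both helpers keep the implicit T: Sized bound, because ptr::null and ptr::null_mut need a thin pointer type.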

1 Like

Side note, discoverability is pretty poor here. You can't find that in the pointer documentation, and the documentation for the freestanding ptr functions doesn't show the parameters or anything else except the function name. If you don't want to click on each one, but still want to do things like search for parameters or return values, your best bet is to click on one of them and then click Source (and hope they're all defined inline). (Clicking Source on the std::ptr page itself takes you someplace unhelpful too.)

1 Like

std has opposite functions as_ref and as_mut. But why is there no convenient function for such an operation? Might be worth adding something like this?
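For illustration only, here is a sketch of what such a convenience could look like as a user-side extension trait. The trait and method names (OptionRefExt, OptionMutExt, as_raw_ptr, as_raw_mut_ptr) are invented for this example; nothing with these names exists in std.

```rust
use std::ptr;

/// Hypothetical extension trait for `Option<&T>`.
trait OptionRefExt<T> {
    /// Returns the referent's address, or a null pointer for `None`.
    fn as_raw_ptr(self) -> *const T;
}

impl<T> OptionRefExt<T> for Option<&T> {
    fn as_raw_ptr(self) -> *const T {
        match self {
            Some(r) => r as *const T,
            None => ptr::null(),
        }
    }
}

/// Hypothetical counterpart for `Option<&mut T>`.
trait OptionMutExt<T> {
    /// Returns the referent's address, or a null pointer for `None`.
    fn as_raw_mut_ptr(self) -> *mut T;
}

impl<T> OptionMutExt<T> for Option<&mut T> {
    fn as_raw_mut_ptr(self) -> *mut T {
        match self {
            Some(r) => r as *mut T,
            None => ptr::null_mut(),
        }
    }
}

fn main() {
    let value = 12;
    let some: Option<&i32> = Some(&value);
    let none: Option<&i32> = None;
    assert!(ptr::eq(some.as_raw_ptr(), &value));
    assert!(none.as_raw_ptr().is_null());

    let mut other = 3;
    let some_mut: Option<&mut i32> = Some(&mut other);
    assert!(!some_mut.as_raw_mut_ptr().is_null());
    let none_mut: Option<&mut i32> = None;
    assert!(none_mut.as_raw_mut_ptr().is_null());
}
```

The impls are limited to T: Sized on purpose, since there is no obvious metadata to pick for a null pointer to an unsized type.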

I’m pretty sure the majority will blame me, but I’d just use core::mem::transmute, since None::<&i32> is a null pointer because of the null pointer optimization (NPO). Anyway, the compiler is smart enough to optimize out all the code in both cases (it does nothing if inlined, and just passes the argument through as the result if not).

I commented over here, but in short -- DSTs like dyn Trait and future custom DSTs present a challenge: they must have valid metadata.

Transmuting None is also highlighted as UB in the docs. As far as I'm aware, there's no guarantee that the None variant is Null for Option<&T>, and that's probably why. Most &T have larger niches than that (due to alignment and non-zero size). That note comes from Ralf.

Wide references/pointers additionally have no guaranteed layout match in terms of the order of the data pointer or metadata pointer (between &dyn and *const dyn), and metadata might be used for the niche instead of the data pointer.

4 Likes

Because it's not something that commonly needs to be done.

1 Like

Fortunately, you’re a little wrong. I tested this on slices too. The compiler uses the pointer as the discriminant without requiring the metadata to be null.

There's no way to give a technical reply without more details about what exactly you're calling wrong or what you tested, but speaking generally -- if by tested you mean "compiled and got the behavior you hypothesized", that's one possible outcome of undefined behavior (which is part of why it is so insidious), and doesn't mean anything as far as language guarantees go.

To be precise, the linked section only calls out transmuting None::<T> to T as UB in these cases. T would be &i32, so all that is claimed is that transmuting Option::<&i32>::None to &i32 is (very much unsurprisingly) UB.


That being said, transmuting Option<&T> to *const T when T is !Sized is most likely always undefined behavior for None values, as far as I’m aware, so I agree on that point. Even if the Option reproducibly used the pointer’s niche, and the order of pointer and metadata were guaranteed to be the same between &T and *const T, there would still be the problem that the metadata is left uninitialized, which is very much problematic both for the usize metadata of slices and for the vtable-pointer metadata of trait objects.
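The metadata problem is why a well-defined conversion for unsized targets has to choose the metadata explicitly. A minimal sketch for slices follows; the helper name opt_slice_ptr is hypothetical, and using length 0 for None is an arbitrary choice made for this example.

```rust
use std::ptr;

/// Hypothetical helper: converts an optional slice to a raw slice pointer.
/// For `None`, the data pointer is null and the length metadata is
/// explicitly set to 0, so no part of the result is uninitialized.
fn opt_slice_ptr<T>(x: Option<&[T]>) -> *const [T] {
    match x {
        Some(s) => s as *const [T],
        None => ptr::slice_from_raw_parts(ptr::null(), 0),
    }
}

fn main() {
    let data = [1u8, 2, 3];
    let p = opt_slice_ptr(Some(&data[..]));
    // SAFETY: `p` came from a live shared reference to `data`.
    let s = unsafe { &*p };
    assert_eq!(s.len(), 3);

    let q = opt_slice_ptr::<u8>(None);
    assert!(q.is_null());
}
```

No comparable choice exists for trait objects, because a *const dyn Trait needs a valid vtable pointer even when the data pointer is null.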


For T: Sized, I thought – as far as I remember[1] – that Option<&T> is considered FFI-safe and you’re supposed to be able to use it in a function signature where the C side uses an ordinary *const T (aka “T*” pointer), so that would indicate that the transmute Option<&T> to *const T might actually be intended to be guaranteed to be safe for T: Sized types, even for the None value.


  1. I tried to find documentation for this, but haven’t so far – I do remember that compiler messages will yell at you though for putting non-“FFI-safe” stuff into function signatures, so that would be an indicator
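A small sketch of the T: Sized case described above. The size and alignment checks should hold on any target; the transmute of None is exactly the part this post is unsure about, so treat it as an assumption rather than an established guarantee, and prefer map_or in real code. The helper name opt_ptr_transmute is hypothetical.

```rust
use std::mem::{align_of, size_of, transmute};

/// Hypothetical helper, restricted to `T: Sized` by the implicit bound.
fn opt_ptr_transmute<T>(x: Option<&T>) -> *const T {
    // SAFETY (assumption): relies on `Option<&T>` having the same layout
    // as `*const T` for `T: Sized`, with `None` represented as null.
    unsafe { transmute::<Option<&T>, *const T>(x) }
}

fn main() {
    // A thin reference wrapped in `Option` is still pointer-sized.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<*const i32>());
    assert_eq!(align_of::<Option<&i32>>(), align_of::<*const i32>());

    let v = 12;
    assert!(std::ptr::eq(opt_ptr_transmute(Some(&v)), &v));
    assert!(opt_ptr_transmute::<i32>(None).is_null());
}
```

None of this carries over to T: ?Sized, where the pointer also has metadata.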

6 Likes
fn foo<T: ?Sized>(src: Option<&T>) -> *const T {
    unsafe {core::mem::transmute(src)}
}

generates the asm I expected to see:

mov rax, rdi ; return the address
ret ; return to the caller

for Sized and

mov rdx, rsi ; return the metadata
mov rax, rdi ; return the address
ret ; return to the caller

for str and dyn core::any::Any.

In other words, it does nothing (if inlined, no additional asm is generated).

Even though you may consider this UB, IMHO it’s pretty straightforward and sound.

No, not sound at all. It is indeed UB.

fn foo<T: ?Sized>(src: Option<&T>) -> *const T {
    unsafe {core::mem::transmute(src)}
}

fn main() {
    foo::<str>(None);
}
error: Undefined Behavior: constructing invalid value: encountered uninitialized memory, but expected a raw pointer
 --> src/main.rs:2:13
  |
2 |     unsafe {core::mem::transmute(src)}
  |             ^^^^^^^^^^^^^^^^^^^^^^^^^ constructing invalid value: encountered uninitialized memory, but expected a raw pointer
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
  = note: BACKTRACE:
  = note: inside `foo::<str>` at src/main.rs:2:13: 2:38
note: inside `main`
 --> src/main.rs:6:5
  |
6 |     foo::<str>(None);
  |     ^^^^^^^^^^^^^^^^

(output from running miri in the rust playground)

Want something more observable/practical than a miri output? Run

#![feature(slice_ptr_len)]

fn foo<T: ?Sized>(src: Option<&T>) -> *const T {
    unsafe {core::mem::transmute(src)}
}

fn main() {
    let x = foo::<str>(None);
    let n = (x as *const [u8]).len();
    if n == 0 {
        println!("hello {n}");
    }
}

in release mode on nightly, and you’ll get output such as “hello 140737219510272”, printed from inside the if n == 0 branch. Not all that “straightforward and sound” if you ask me…


On a more positive note, miri does accept the T: Sized case, as far as I can tell 🙂

10 Likes

Generates what it’s supposed to generate - just true. What’s the problem?

mov al, 1 ; don't know why not “mov rax, 1”
ret

foo also generates what it is supposed to generate - nothing. Why should the most intuitive and correct (in terms of asm, at least) way be wiped out of the compiler? I, of course, cannot deny this, but I also cannot see any reason for it.

Pretty weird output 🧐

transmute::<_, (usize, usize)>(None::<&str>) gives me just (0, 0).

In the second link above, it generated ud2. And that's the valid output - since all code paths inside the function are UB, the whole function can be optimized away.

1 Like

That's just the Rust compiler's attempt to be nice. What C does is also a perfectly valid and acceptable result.

If you hit UB, then the compiler can do anything. Anything at all; there are quite literally no limitations.

1 Like

Now I see. On the playground everything behaved in another way though.

I understand it's undefined behavior, but returning false seems exactly the opposite of the valid result, true. IMHO the content of be isn't even related to the expected result, and be == 0 || be != 0 is neither false nor "core dumped", but true.

I tried to do some research, and it looks like the compiler really generates whatever its evolved AI wants. Neither foo nor len does anything wrong - the compiler just generates invalid code, as @VorfeedCanal said. I wash my hands.

P.S. I can't see any reason for flagging VorfeedCanal's post - even though I'm not satisfied, his notices are

.

In conclusion, I want to say that IMHO being scared of undefined behavior and just throwing slippers at it is a bad choice. Instead, we should expand our capabilities and find ways to generate valid code, at least when it's possible (not NULL deref (except for pointers to ZST)). Thanks to everyone for your attention and your time.

It's undefined. And undefined can be anything. true, false or core dump are “anything” thus all three are perfectly valid.

How is it invalid? It's UB! Any code is valid for UB. That's the definition of UB!

That's “reclassification of UB”. It may be argued that some behaviour shouldn't be UB but should instead be unspecified or implementation-defined. E.g. the definition of lock leaves lots of possibilities to the implementation, but there are still some limits. That makes it “unspecified but not UB”.

Again: it's not about “being scared”. It's just the definition of UB: something that shouldn't be in your code, ever.

One may legitimately argue whether some behavior or other should be declared UB or not, but once the UB moniker has been applied to it, the only legitimate choice is to remove it from your code.

4 Likes

I can see nothing wrong in a function that is a compile-time ZCA and doesn't affect the program; what I do see as wrong is that ZCA actually affecting the run-time behavior without even existing at run-time.

That's something you may discuss with the Rust language team on IRLO. They may reclassify something as non-UB (unspecified, implementation-defined, fully-defined). Such things have happened in the past.

But as long as something is UB it's “your code is wrong, full stop, go fix your program”.

It's simply how UB is defined. Discussing whether something must be declared UB or not is constructive and fruitful; discussing what to do with code that triggers UB is not.

You fix it and remove UB, no other options.

4 Likes