I was looking for a convenient way to cast Option<&T> to *const T (and likewise Option<&mut T> to *mut T). I thought there would be a shortcut for this in std, but I didn't find one.
The simplest thing I came up with is this, which has its drawbacks:
fn f(x: *const i32) { /***/ }
let x: Option<&i32> = Some(&12);
f(x.map_or(std::ptr::null(), |x| x));
// the closure |x| x raises the question "why?"
// std::ptr::null() requires an import or a long path
For clarity, this can be written like this:
f(x.map_or(std::ptr::null(), |x| x as *const i32));
But in the end, casting is quite tedious. Is there any easier way to do this?
std has opposite functions as_ref and as_mut. But why is there no convenient function for such an operation? Might be worth adding something like this?
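The map_or pattern above can be wrapped once and reused. What follows is only a sketch: the traits OptionPtrExt and OptionMutPtrExt and their methods are hypothetical names made up for illustration, and nothing like them exists in std as far as this thread establishes.

```rust
use std::ptr;

/// Hypothetical extension trait; not part of std.
trait OptionPtrExt<T> {
    /// Returns the address of the referent, or a null pointer for `None`.
    fn as_ptr_or_null(self) -> *const T;
}

impl<T> OptionPtrExt<T> for Option<&T> {
    fn as_ptr_or_null(self) -> *const T {
        // `&T` would coerce on its own; the explicit cast documents the intent.
        self.map_or(ptr::null(), |r| r as *const T)
    }
}

/// Hypothetical counterpart for mutable references.
trait OptionMutPtrExt<T> {
    /// Returns the address of the referent, or a null pointer for `None`.
    fn as_mut_ptr_or_null(self) -> *mut T;
}

impl<T> OptionMutPtrExt<T> for Option<&mut T> {
    fn as_mut_ptr_or_null(self) -> *mut T {
        self.map_or(ptr::null_mut(), |r| r as *mut T)
    }
}
```

With these in scope, the call from the question becomes f(x.as_ptr_or_null()). Both impls only cover T: Sized, which keeps the null case well defined.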
I’m pretty sure the majority will blame me, but I’d just use core::mem::transmute, since None::<&i32> is a null pointer because of the null pointer optimization (NPO). Anyway, the compiler is smart enough to optimize out all the code in both cases (it does nothing if inlined, and just passes the argument through as the result if not).
Wide references/pointers additionally have no guaranteed layout match in terms of the order of the data pointer and the metadata (between &dyn and *const dyn), and the metadata might be used for the niche instead of the data pointer.
Fortunately, you’re a little wrong. I tested this on slices too. Compiler uses the pointer as a discriminant without requiring the metadata to be null.
There's no way to give a technical reply without more details about what exactly you're calling wrong or what you tested, but speaking generally -- if by tested you mean "compiled and got the behavior you hypothesized", that's one possible outcome of undefined behavior (which is part of why it is so insidious), and doesn't mean anything as far as language guarantees go.
To be precise, the linked section only calls out transmuting None::<T> to T as UB in these cases. T would be &i32, so all that is claimed is that transmuting Option::<&i32>::None to &i32 is (very much unsurprisingly) UB.
That being said, transmuting Option<&T> to *const T when T is !Sized is most likely always undefined behavior for None values, as far as I’m aware, so I agree on that point. Even if the Option reproducibly used the pointer’s niche, and the order between pointer and metadata were guaranteed to be the same between &T and *const T, there would still be the problem that the metadata would be left uninitialized, which is very much problematic both for the usize metadata of slices and for the vtable-pointer metadata of trait objects.
For T: Sized, I thought – as far as I remember[1] – that Option<&T> is considered FFI-safe and you’re supposed to be able to use it in a function signature where the C side uses an ordinary *const T (aka “T*” pointer), so that would indicate that the transmute Option<&T> to *const T might actually be intended to be guaranteed to be safe for T: Sized types, even for the None value.
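The layout side of the T: Sized case can be checked directly. A minimal sketch, assuming the guarantees described in the "Representation" section of the std::option documentation as I understand them: Option<&T> has the same size and alignment as &T, and an all-zero bit pattern is None. The function names are made up for illustration.

```rust
use std::mem::{align_of, size_of, transmute};

/// True if `Option<&i32>` and `*const i32` agree in size and alignment.
fn layout_matches() -> bool {
    size_of::<Option<&i32>>() == size_of::<*const i32>()
        && align_of::<Option<&i32>>() == align_of::<*const i32>()
}

/// Builds an `Option<&i32>` from all-zero bytes and reports whether it is `None`.
fn none_from_zero_bytes() -> bool {
    let zeros = [0u8; size_of::<Option<&'static i32>>()];
    // Assumption: the std::option docs describe this direction (zeros to None)
    // as sound for references to sized types.
    let opt: Option<&'static i32> = unsafe { transmute(zeros) };
    opt.is_none()
}
```

This only exercises the thin-pointer case. It says nothing about wide pointers, where the metadata half has no such guarantee.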
error: Undefined Behavior: constructing invalid value: encountered uninitialized memory, but expected a raw pointer
--> src/main.rs:2:13
|
2 | unsafe {core::mem::transmute(src)}
| ^^^^^^^^^^^^^^^^^^^^^^^^^ constructing invalid value: encountered uninitialized memory, but expected a raw pointer
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: BACKTRACE:
= note: inside `foo::<str>` at src/main.rs:2:13: 2:38
note: inside `main`
--> src/main.rs:6:5
|
6 | foo::<str>(None);
| ^^^^^^^^^^^^^^^^
(output from running miri in the rust playground)
Want something more observable/practical than a miri output? Run
#![feature(slice_ptr_len)]
fn foo<T: ?Sized>(src: Option<&T>) -> *const T {
unsafe {core::mem::transmute(src)}
}
fn main() {
let x = foo::<str>(None);
let n = (x as *const [u8]).len();
if n == 0 {
println!("hello {n}");
}
}
in release mode on nightly, and you’ll get output such as “hello 140737219510272”, printed inside the if n == 0 branch. Not all that “straightforward and sound” if you ask me…
On a more positive note, miri does accept the T: Sized case, as far as I can tell
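For unsized targets, a conversion that avoids the transmute has to supply both the data pointer and the metadata explicitly. A minimal sketch for slices: opt_slice_as_ptr is a hypothetical helper, and picking length 0 for the null case is an assumption of this example rather than an established convention.

```rust
use std::ptr;

/// Hypothetical helper: `None` becomes a null slice pointer with length 0,
/// so both halves of the wide pointer are initialized.
fn opt_slice_as_ptr<T>(x: Option<&[T]>) -> *const [T] {
    match x {
        // A reference to a slice casts directly to a raw slice pointer.
        Some(s) => s as *const [T],
        // Build the wide pointer from an explicit null address and length.
        None => ptr::slice_from_raw_parts(ptr::null::<T>(), 0),
    }
}
```

Because the length is written explicitly, a later read of the pointer's metadata sees a defined value instead of whatever the None variant happened to leave in those bytes.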
Generates what it’s supposed to generate - just true. What’s the problem?
mov al, 1 ; dk why not “mov rax, 1”
ret
foo also generates what it is supposed to generate: nothing. Why should the most intuitive and correct (in terms of asm, at least) way be wiped out by the compiler? I, of course, cannot deny this, but I also cannot see any reason for it.
In the second link above, it generated ud2. And that's the valid output - since all code paths inside the function are UB, the whole function can be optimized away.
Now I see. On the playground everything behaved in another way though.
I understand it's undefined behavior, but returning false seems exactly the opposite of the valid result, true. IMHO the content of be isn't even related to the expected result, and be == 0 || be != 0 is neither false nor "core dumped", but true.
P.S. I can't see any reason for flagging VorfeedCanal's post - even though I'm not satisfied, his notices are
.
In conclusion I want to say that IMHO being scared of undefined behavior and just throwing slippers into it is a bad choice. Instead, we should expand our capabilities and find ways to generate valid code, at least when it's possible (not NULL deref (except for pointers to ZST)). Thanks to everyone for attention and your time.
It's undefined. And undefined can be anything. true, false or core dump are “anything” thus all three are perfectly valid.
How is it invalid? It's UB! Any code is valid for UB. That's the definition of UB!
That's “reclassification of UB”. It may be argued that some behaviour shouldn't be UB but have to be unspecified or implementation-defined. E.g. definition of lock leaves lots of possibilities to the implementation, but there are still some limits. That makes it “unspecified but not UB”.
Again: it's not about “being scared”. It's just the definition of UB: something that shouldn't be in your code, ever.
One may, legitimately, argue, whether some behavior or the other should be declared UB or not, but once we have applied that UB moniker to it the only legitimate choice is to remove it from your code.
I can see nothing wrong with a function that is a compile-time ZCA and doesn't affect the program. What I do see as wrong is that ZCA affecting the run-time behavior without even existing at run-time.
That's something you may discuss with rust language team on IRLO. They may reclassify something as non-UB (unspecified, implementation-defined, fully-defined). Such things happened in the past.
But as long as something is UB it's “your code is wrong, full stop, go fix your program”.
It's simply how UB is defined. Discussing whether something must be declared UB or not is constructive and fruitful, discussing what to do with code that triggers UB is not constructive.