Obviously, you should validate your values before you construct them. However, for the sake of this question, let's say you have the following code.
use std::num::NonZeroU32;

pub fn validate_nzu32(x: NonZeroU32) {
    if x.get() == 0 {
        // Use abort to not clutter the asm with panic code.
        ::std::process::abort();
    }
}

pub fn validate_nzu32_slice(xs: &[NonZeroU32]) {
    for x in xs {
        validate_nzu32(*x);
    }
}
example::validate_nzu32:
        push    rax
        test    edi, edi
        je      .LBB0_1
        pop     rax
        ret
.LBB0_1:
        call    std::process::abort@PLT
        ud2

example::validate_nzu32_slice:
        ret
Can anyone tell me why the slice case is optimized out (meaning it's not doing what you think!) but the single value version isn't? The fact that this can sometimes "work" is dangerous if someone experimentally establishes that you can do this and then continues to develop under this assumption.
Note: When doing the same thing for bools, the single-value function is also reduced to nothing, as you would (or wouldn't) expect.
Regarding dangerousness, this is well within the domain of undefined behavior, where if you ever assume the optimizer will "do what you think", you're probably going to have a bad time.
But if you're just curious…
It's because LLVM's range metadata, which allows the frontend to specify that an instruction's result must be in a certain range of integer values, is (for some reason) only supported on load and call instructions. The second function uses a load; the first does not. If you compile to LLVM IR rather than assembly, you can search for !range to see what gets emitted. And here is the code in rustc that actually emits that metadata.
If you look at --emit=llvm-ir without optimization, validate_nzu32 doesn't have range information on the parameter, nor does NonZero::get on its parameter or return value. But in validate_nzu32_slice, the load from the slice does have range !4 = !{i32 1, i32 0}.
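For contrast, here is a minimal sketch of doing the check before the NonZeroU32 exists, which sidesteps the question of what the optimizer may assume. The input is a slice of plain u32, so no nonzero range applies to the loads. The name validate_raw_slice is made up for illustration and is not from the code above; NonZeroU32::new is the standard checked constructor.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper: validates plain `u32` values and only then
/// produces `NonZeroU32`s. Returns `None` if any element is zero.
pub fn validate_raw_slice(xs: &[u32]) -> Option<Vec<NonZeroU32>> {
    // `NonZeroU32::new` returns `None` for 0; collecting into
    // `Option<Vec<_>>` stops at the first `None`.
    xs.iter().copied().map(NonZeroU32::new).collect()
}
```

Calling it with &[1, 0, 3] should give None, and with &[1, 2, 3] a Some holding three values.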
I really appreciate all the information here guys, thanks!
Even though it is obvious to me now, I did not consider that checks could be optimized out. A real-world case where you have to be careful is when you are using FFI to initialize or update values in place and want to validate them afterwards. For example:
use std::num::NonZeroU32;

extern "C" {
    pub fn maybe_write_u32(ptr: *mut u32);
}

pub struct Thing {
    x: NonZeroU32,
}

impl Thing {
    pub fn update(&mut self) {
        unsafe {
            // Let the foreign code write directly into the NonZeroU32 field.
            maybe_write_u32(&mut self.x as *mut _ as *mut u32);
            // Read the field back as a plain u32 rather than as a NonZeroU32.
            let rx = &*(&self.x as *const _ as *const u32);
            if rx == &0 {
                ::std::process::abort();
            }
        }
    }
}
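One way to avoid ever holding a zero inside a NonZeroU32 is to let the foreign code write into a plain u32 temporary and convert afterwards. The following is only a sketch under some assumptions: maybe_write_u32 is replaced by a local stand-in that always writes 0 so the example links on its own, and update returns a bool instead of aborting, which is a change from the code above.

```rust
use std::num::NonZeroU32;

// Stand-in for the foreign function (assumption: always writes 0 here).
// In real code this would be the `extern "C"` declaration instead.
unsafe extern "C" fn maybe_write_u32(ptr: *mut u32) {
    unsafe { ptr.write(0) };
}

pub struct Thing {
    x: NonZeroU32,
}

impl Thing {
    /// Returns `false` and leaves `self.x` unchanged if the foreign
    /// code produced 0; otherwise stores the new value.
    pub fn update(&mut self) -> bool {
        // The temporary is a plain u32, so a zero in it is a valid value.
        let mut raw: u32 = self.x.get();
        unsafe { maybe_write_u32(&mut raw) };
        match NonZeroU32::new(raw) {
            Some(v) => {
                self.x = v;
                true
            }
            None => false,
        }
    }
}
```

The caller can still abort on a false return if that is the desired behavior; the difference is that the zero check happens on a u32 before any NonZeroU32 is constructed.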