Differing valid value optimizations for single value and slice

mickvangelderen · September 21, 2018, 3:30pm

Obviously, you should validate your values before you construct them. However, for the sake of this question, let's say you have the following code.

use std::num::NonZeroU32;

pub fn validate_nzu32(x: NonZeroU32) {
    if x.get() == 0 {
        // Use abort to not clutter the asm with panic code.
        ::std::process::abort();
    }
}

pub fn validate_nzu32_slice(xs: &[NonZeroU32]) {
    for x in xs {
        validate_nzu32(*x);
    }
}

example::validate_nzu32:
        push    rax
        test    edi, edi
        je      .LBB0_1
        pop     rax
        ret
.LBB0_1:
        call    std::process::abort@PLT
        ud2

example::validate_nzu32_slice:
        ret

Code on rust.godbolt.org

Can anyone tell me why the slice case is optimized out (meaning it's not doing what you think!) but the single value version isn't? The fact that this can sometimes "work" is dangerous if someone experimentally establishes that you can do this and then continues to develop under this assumption.

Note: When doing the same thing for bools, the single-value function is reduced to nothing, as you would (or wouldn't) expect.
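For contrast, here is a minimal sketch of the "validate before you construct" approach mentioned at the top. The helper name checked_nzu32_slice is made up for illustration. The zero check runs on plain u32 values, for which every bit pattern is valid, so the compiler has no grounds to treat it as redundant.

```rust
use std::num::NonZeroU32;

// Hypothetical helper (the name is an assumption, not from the thread):
// check the plain integers first, then build the NonZeroU32 values.
pub fn checked_nzu32_slice(raw: &[u32]) -> Option<Vec<NonZeroU32>> {
    // `NonZeroU32::new` returns `None` for 0; collecting into
    // `Option<Vec<_>>` yields `None` as soon as one element is `None`.
    raw.iter().copied().map(NonZeroU32::new).collect()
}
```

A caller gets None back if any element is zero and can decide how to handle that, instead of relying on a check performed after the invalid value already exists.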

jonh · September 21, 2018, 3:43pm

If you change the function to take a reference, the check is optimized out as well. So the question becomes: why doesn't that happen when the value is passed by value?
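A sketch of what that by-reference variant might look like (the name validate_nzu32_ref is made up for illustration). Whether the check is actually removed depends on the compiler version and flags, so treat the comment as describing what was observed here, not a guarantee.

```rust
use std::num::NonZeroU32;

// Hypothetical by-reference variant of validate_nzu32.
// Reading through `x` is a load from memory, and the compiler may assume
// that a loaded NonZeroU32 is never 0, so it is free to drop this check.
pub fn validate_nzu32_ref(x: &NonZeroU32) {
    if x.get() == 0 {
        ::std::process::abort();
    }
}
```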

comex · September 21, 2018, 4:15pm

Regarding dangerousness, this is well within the domain of undefined behavior, where if you ever assume the optimizer will "do what you think", you're probably going to have a bad time.

But if you're just curious…

It's because LLVM's range metadata, which allows the frontend to specify that an instruction's result must be in a certain range of integer values, is (for some reason) only supported on load and call instructions. The second function uses a load; the first does not. If you compile to LLVM IR rather than assembly, you can search for !range to see what gets emitted. And here is the code in rustc that actually emits that metadata.

cuviper · September 21, 2018, 4:16pm

If you look at --emit=llvm-ir without optimization, validate_nzu32 doesn't have range information on the parameter, nor does NonZero::get on its parameter or return value. But in validate_nzu32_slice, the load from the slice does have range !4 = !{i32 1, i32 0}. That range wraps around, so it covers every value except 0.

(That is, ditto @comex)

scottmcm · September 21, 2018, 4:19pm

Some additional information in this bug about this situation:

https://github.com/rust-lang/rust/issues/49572

mickvangelderen · September 21, 2018, 5:08pm

I really appreciate all the information here guys, thanks!

Even though it is obvious to me now, I did not consider that checks could be optimized out. A real world case where you have to be careful is when you are using FFI to initialize or update values in place and want to validate them after. For example:

use std::num::NonZeroU32;

extern "C" {
    pub fn maybe_write_u32(ptr: *mut u32);
}

pub struct Thing {
    x: NonZeroU32,
}

impl Thing {
    pub fn update(&mut self) {
        unsafe {
            maybe_write_u32(&mut self.x as *mut _ as *mut u32);
            let rx = &*(&self.x as *const _ as *const u32);
            if rx == &0 {
                ::std::process::abort();
            }
        }
    }
}

Code on rust.godbolt.org

Pretty gnarly. Unsafe is hard, can't wait for guidelines.
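One way to sidestep the problem, sketched under a few assumptions: let the foreign function write into a plain u32 temporary, and only turn it into a NonZeroU32 after checking it. Below, maybe_write_u32 is a pure-Rust stand-in so the example runs on its own (a real program would keep the extern "C" declaration), and the new, get and bool-returning update methods are hypothetical additions for the demo.

```rust
use std::num::NonZeroU32;

// Stand-in for the C function so the sketch is self-contained.
// Assumption for the demo: the callee decrements the value it is given.
unsafe fn maybe_write_u32(ptr: *mut u32) {
    unsafe { *ptr = (*ptr).wrapping_sub(1) };
}

pub struct Thing {
    x: NonZeroU32,
}

impl Thing {
    pub fn new(x: NonZeroU32) -> Self {
        Thing { x }
    }

    pub fn get(&self) -> u32 {
        self.x.get()
    }

    // Returns false and leaves `self.x` untouched if the callee wrote 0.
    pub fn update(&mut self) -> bool {
        // The callee only ever sees a plain u32, so an invalid NonZeroU32
        // never exists, not even temporarily.
        let mut raw: u32 = self.x.get();
        unsafe { maybe_write_u32(&mut raw) };
        match NonZeroU32::new(raw) {
            Some(nz) => {
                self.x = nz;
                true
            }
            None => false,
        }
    }
}
```

The cost is a copy in and out of the temporary, which should be negligible for a u32. For larger structures that must be updated in place, this pattern does not carry over as directly.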
