Consider the following code:
/// minimum bytes to represent a signed i32,
/// not considering the sign bit
pub fn mbsi32(x: i32) -> u8 {
    // To fit in n bytes, we require that
    // everything but the leading sign bits fits in n*8
    // bits.
    let n_sign_bits = if x.is_negative() {
        x.leading_ones() as u8
    } else {
        x.leading_zeros() as u8
    };
    (32 - n_sign_bits + 7) / 8
}
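To make the behaviour concrete, here is a small sketch that evaluates the function on a few boundary inputs. The function body is repeated from above so the block stands alone; `sample_byte_counts` is a hypothetical helper added only for illustration.

```rust
/// Same computation as `mbsi32` above, repeated so this sketch is self-contained.
pub fn mbsi32(x: i32) -> u8 {
    let n_sign_bits = if x.is_negative() {
        x.leading_ones() as u8
    } else {
        x.leading_zeros() as u8
    };
    (32 - n_sign_bits + 7) / 8
}

/// Hypothetical helper: pairs each sample input with its computed byte count.
pub fn sample_byte_counts() -> Vec<(i32, u8)> {
    [0, -1, 1, 255, 256, -256, -257, i32::MAX, i32::MIN]
        .iter()
        .map(|&x| (x, mbsi32(x)))
        .collect()
}
```

The two inputs that consist entirely of sign bits, 0 and -1, both give 32 sign bits and therefore a result of 0; every other input gives a result between 1 and 4.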
In release mode with the default target, it compiles to the following:
playground::mbsi32:
    mov eax, edi
    sar eax, 31
    xor eax, edi
    je .LBB1_1
    bsr ecx, eax
    xor ecx, 31
    mov al, 39
    sub al, cl
    shr al, 3
    ret
.LBB1_1:
    mov ecx, 32
    mov al, 39
    sub al, cl
    shr al, 3
    ret
I'm wondering why .LBB1_1 can't be optimized to just ret. It can only be reached if eax is 0, in which case, shouldn't LLVM be able to do constant folding to figure out that it always returns 0?
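The reasoning in the question can be checked with a rough sketch that mirrors the listing in Rust. The names `folded`, `takes_lbb1_1`, and `lbb1_1_result` are hypothetical, and the mapping from instructions to expressions is my reading of the assembly rather than anything the compiler emits.

```rust
/// Mirrors `mov eax, edi; sar eax, 31; xor eax, edi`:
/// `x >> 31` on an i32 is an arithmetic shift, so it yields 0 or -1,
/// and the XOR flips all bits of negative inputs.
pub fn folded(x: i32) -> i32 {
    x ^ (x >> 31)
}

/// Mirrors `je .LBB1_1`: the jump is taken only when the XOR result is zero.
pub fn takes_lbb1_1(x: i32) -> bool {
    folded(x) == 0
}

/// Mirrors the body of `.LBB1_1`:
/// `mov ecx, 32; mov al, 39; sub al, cl; shr al, 3`.
pub fn lbb1_1_result() -> u8 {
    (39u8 - 32u8) >> 3
}
```

Under this reading, only 0 and -1 reach the label, the arithmetic there always produces 0, and eax already holds 0 at that point because the preceding xor produced it.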