How to make an integer with N bits set without overflow

I want a function f : usize -> u32 such that:

  • f(4) = 0b1111
  • f(8) = 0b11111111
  • f(22) = 0x3FFFFF
  • f(32) = 0xFFFFFFFF

The last one is the hardest case.

The canonical c/c++ implementation would be

fn setbits(x: usize) -> u32 {
    (1 << x) - 1
}

This will overflow for x=32, although unsigned overflow is defined such that this works, and signed integer is UB till C++20, but will most likely work correctly

In rust this doesn't work in Debug mode as it panics on the overflow. and AFAICT all the defined shl() methods (overflowing, checked, wrapping) do not give the correct semantics (1 << 32 = 1 not 0)

I could add a branch to check n <= 32, but this doesn't optimize away: Compiler Explorer

I could do the shift in one size bigger integer, but for example if I use usize, I don't know how to get the one size bigger integer automatically, and it doesn't optimize away either: Compiler Explorer

Is there an implementation for either the correct overflow semantics, or for achieving this directly?

1 Like

Your godbolt links do not match your function. They take two arguments.

pub fn setbits(x: u8) -> u32 {
    u32::MAX >> (32 - x)
}
2 Likes

Oh oops, although the assembly is still different if you remove the first parameter and replace it with 1

Ah, thanks I should have thought of that, thats perfect

That doesn't work for x==0 though, either panicking in debug or leaving it as MAX in release.

2 Likes

I believe even the unsigned case is undefined in all versions of C++. The following program prints either 0 or 4294967295 for me, depending on the optimization level, when compiled with g++ 11.1.0 using -std=c++2a:

#include <iostream>
#include <cstdint>

uint32_t setbits(uint32_t x) {
    return (1 << x) - 1;
}

int main() {
  std::cout << setbits(32) << std::endl;
}

The Signed Integers are Two’s Complement proposal states that:

  • Status-quo shift by larger-than or equal-to bit-width remains undefined.

The latest working draft standard says:

The behavior is undefined if the right operand is negative, or greater than or equal to the width of the promoted left operand.

2 Likes

This is harder than I thought

Considering that the x86 instruction, afaik, offer the semantics needed for this function (bits slide of the end and shifting by >= the number of bits in an integer always gives 0) is it worth adding a function that does this? I don't know what the name would be but perhaps modulo_shl or masked_out_shl/wrapping_out_shl, because it's shifting and the output is masked rather than the shift out. Maybe even wrapping_mul_by_power_of_2 because that's really the semantics we're getting

And/or maybe a function that sets the N lsbs?

Hard to say. Rust also doesn't include methods for more-common bitwise things like "test if bit is set" in its stdlib. (Or most of the BMI1/BMI2 stuff.)

So unless there's an LLVM intrinsic or a instruction for it on a common processor that's not otherwise pattern-matched at codegen by LLVM, my instinct is that it goes in a bit manipulation crate of some sort, not in core.

Also, consider whether support for zero is really important. One can always make life easier in the implementation with

pub fn setbits(x: std::num::NonZeroU32) -> u32 {
    u32::MAX >> (32_u32 - x.get())
}

Should every arcane aspect of every CPU architecture be reflected in special Rust functions? Or should Rust focus on supporting those architectural features that are commonly supported by other widely-used languages such as C and C++?

Actually x86 does not have this semantics. Shifting a 32-bit register by 32 bits on x86 is a no-op.

Returning 0 would be consistent semantics, and so I think it's a mistake that << doesn't implement it in Rust, and instead reflects CPU weirdness when it comes to shifting u32 by 32 bits.

1 Like

I'd dispute that is arcane, the main reason I want it is that it's a common function needed for making bitmasks.
Apparently i was wrong about how x86 works but this is still a thing I need. In fact, the subtlety of when these kinds of shifts are undefined or wrong make me think a function for doing this, and/or for directly making bitmasks is even more useful

Rust does return 0, you just need to disable overflow checks or be explicit and use wrapping_shl

I'll be less flippant. I've used this approach myself to create low-order bit masks, at least as far back as 1965 when shifts were implemented serially and thus the technique worked. However, a shift distance that shifts everything out of a register is often not supported by modern CPU ISAs. This probably is because many implementations of that ISA are expected to use a hardware barrel shifter, which only implements shift distances less than the register width.

For example, in RISC-V, shift instructions specify either an immediate shift distance of a register from which a computable shift distance is obtained. The RISC-V ISA with 32-bit registers, RV32I, extracts the lower 5 bits from the specified immediate or register operand, supporting shift distances of 0..31 (i.e., 25 - 1). Similary, the RISC-V ISA with 64-bit registers, RV64I, extracts the lower 6 bits from the specified immediate or register operand, supporting shift distances of 0..63 (i.e., 26 - 1) while the draft RISC-V ISA with 128-bit registers, RV128I, extracts the lower 7 bits from the specified immediate or register operand, supporting shift distances of 0..127 (i.e., 27 - 1) . In no case is it possible to specify a shift equal to or larger than the register width.

Of course Rust permits you to define your own function to accomplish this extended-distance shift. IMO that definition should be in a crate under your control, not something decided for everyone else. A different programmer might want to report an error when the shift distance is excessive, which your proposal would obscure.

3 Likes

Not true:

println!("{}", 1u32.wrapping_shl(32))

prints 1.

1 Like

Maybe I misunderstood, I was talking about the initial (1 << x) - 1

This also doesn't work with x = 32, even if you use wrapping_shl and wrapping_sub. Instead of 0xFFFFFFFF you'd get 0 for x = 32.

It would be very easy to support n=32 properly in hardware without significant cost. You just need to look at the top bit(s) and if they are not zero, then mask everything with 0. This doesn't add much to the shifter.

I think the full range 0..=32 is pretty useful in practice, as the example in the thread shows.

In Rust today, I would implement it either like this:

pub fn shift_left_1(x: u32, n: u32) -> u32 {
    if n < 32 { x << n } else { 0 }
}

The if gets compiled to a conditional move without jumps on x86.

Or, using explicitly jump-free code:

pub fn shift_left_2(x: u32, n: u32) -> u32 {
    x.wrapping_shl(n) & 0u32.wrapping_sub(u32::from(n < 32))
}
2 Likes

Are you sure? With checked_shl, you get to chose what value the edge case returns:

fn f(x: u32) -> u32 {
    u32::checked_shl(1, x).unwrap_or(0).wrapping_sub(1)
}

fn main() {
    assert_eq!(f(0), 0);
    assert_eq!(f(4), 0b1111);
    assert_eq!(f(8), 0b11111111);
    assert_eq!(f(22), 0x3FFFFF);
    assert_eq!(f(32), 0xFFFFFFFF);
}
4 Likes

Since masking is often used in cryptographic and other secret-sensitive code, this is the only general (i.e., implementation-independent) code that I would consider. Relying on code with conditional branches leaks information through a timing side-channel. Relying on architectural conditional instructions fails side-channel secrecy with no warning to the user on implementations where execution is not constant time. And of course they don't work at all on architectures where those conditional instructions don't exist.

2 Likes