How to optimize count_ones + compare

Hi all,

I'd like to optimize the following code, and am unsure if the compiler will do the best for me:

fn enough_ones(bits: u64, enough: u32) -> u64 {
  if bits.count_ones() >= enough { bits } else { 0 }
}

Can anyone tell me if this code is optimal? I was hoping it would compile to branch-free code, but when I put it into godbolt the output doesn't look encouraging. I'm unfamiliar with godbolt, though, and may have missed how to enable optimizations...

If rust can't do this without a branch, it would seem worth doing some bit-twiddling to avoid the branch.
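For reference, a hand-written branch-free version could turn the comparison into an all-ones or all-zeros mask. This is only a sketch: the name enough_ones_branchless is made up for illustration, and with optimizations enabled the compiler may well generate equivalent code from the original function.

```rust
/// Hypothetical branch-free variant of `enough_ones` (illustrative name).
fn enough_ones_branchless(bits: u64, enough: u32) -> u64 {
    // `true as u64` is 1 and `false as u64` is 0.
    let keep = (bits.count_ones() >= enough) as u64;
    // 0 - 1 wraps around to all ones; 0 - 0 stays all zeros.
    let mask = 0u64.wrapping_sub(keep);
    // Either keeps every bit or clears every bit, with no `if` in the source.
    bits & mask
}
```

With bits = 0b1011 (three bits set), asking for 3 returns 0b1011 unchanged and asking for 4 returns 0.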

You need to enter -O (or -C opt-level=3) in the “Compiler options” field on godbolt to enable optimizations.

Here's the optimized version.


Thanks! That looks more like what I was expecting, and confirms my suspicion that the compiler can do better than I can at optimizing this kind of code.


Also, don't forget that you'll get something different if you target a newer cpu than the default old i686: https://rust.godbolt.org/z/xYja7M
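If changing the target CPU for the whole build isn't an option, a related technique is to enable the feature for a single function and check for it at runtime. The sketch below assumes x86_64 and the popcnt feature; the names enough_ones_popcnt and enough_ones_dispatch are hypothetical, and whether the hardware instruction actually gets used depends on the optimizer inlining enough_ones into the feature-enabled wrapper.

```rust
fn enough_ones(bits: u64, enough: u32) -> u64 {
    if bits.count_ones() >= enough { bits } else { 0 }
}

// Hypothetical wrapper compiled with `popcnt` enabled, so `count_ones`
// may lower to the hardware instruction once `enough_ones` is inlined.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "popcnt")]
unsafe fn enough_ones_popcnt(bits: u64, enough: u32) -> u64 {
    enough_ones(bits, enough)
}

// Hypothetical dispatcher: picks the popcnt build when the CPU supports it
// and falls back to the portable version otherwise.
fn enough_ones_dispatch(bits: u64, enough: u32) -> u64 {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("popcnt") {
            // SAFETY: the feature was detected on the running CPU just above.
            return unsafe { enough_ones_popcnt(bits, enough) };
        }
    }
    enough_ones(bits, enough)
}
```

Both paths compute the same result; only the generated instructions may differ.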


Ah, that looks way more encouraging!

Worth noting that -O is equivalent to -C opt-level=3, just in case someone notices a difference.

You mean that -O is equivalent to -C opt-level=2, right?


Whoops :joy: The one thing I'm trying to clarify and I mistype it. Yes, it's equivalent to 2, not 3.
