How to deal with branch misprediction on random incoming value

boran · December 28, 2024, 11:42am

Hi,

I'd like to understand how I can design my code so it can avoid having branch misprediction on a 50:50 branch.
The side value below is randomly Side::B or Side::S without any pattern.

pub enum Side {
    B,
    S,
}
....
match side {
    Side::B => {}
    Side::S => {}
}

How can I possibly solve this problem in Rust? As I understand in C++ c++ - Template specialization for enum values - Stack Overflow, it's possible to do template specialization for enum values but I have never seen something like that in Rust.

increasing · December 28, 2024, 12:37pm

You can do const generics. They (for now) only work with integers and bools but you can assign a number to each variant and then use that.
I think that should be the same as the c++ example but i don't know enough about templates to be sure.

SebastianJL · December 28, 2024, 12:54pm

Just out of curiosity, what does template specialization have to do with branch missprediction?

boran · December 28, 2024, 1:02pm

boran · December 28, 2024, 1:05pm

If the branch can be predicted well, there isn't any issues but if it's a 50:50 random then it really impacts the performance.

stonerfish · December 28, 2024, 1:19pm

Please explain, maybe in sudo code, for someone like me, old brain, 7th grader, how that could work on a 50:50 ratio. Does predict a sequence of left:right's and write multiple blocks of asm, then branch when it is wrong maybe two or three in?

Hmm, maybe I will watch the videos and that will help.

boran · December 28, 2024, 1:55pm

Maybe my question was a little bit vague. I mean there isn't any pattern on Side value in any given time, just random Side::B or Side::S . If I can bring this down to compile time, then it's possible to execute my code branchless as in the first video.

stonerfish · December 28, 2024, 2:05pm

Maybe I asked my question vague too. I was thinking you understood a way to void having branch misprediction on a 50:50 branch using C++ and could explain.

I don't see how you could possibly optimize the testing of flipping a coin besides unwinding and compiling to a tree of branches to some depth that testing shows to be faster.

SebastianJL · December 28, 2024, 3:49pm

What do you mean by random? If it is random at runtime like user input or a call to rand::random() then I don't see how it could be compiled away.

ZiCog · December 28, 2024, 3:54pm

Am I right in understanding that you want some, potentially random, condition at compile time to select a particular function/operation which is then compiled permanently into the executable?

That is what I gather from reading the thumbnail of the video above: "If you know at compile time which function is to be executed...."

carey · December 28, 2024, 7:56pm

It looks to me like they’re looking for something like Clang’s __builtin_unpredictable to force branchless code in Rust, preferring cmove and so on, although it seems like that doesn’t even work reliably in Clang. Using branchless code would avoid the CPU ever deciding that Side::B should be predicted, making Side::S much slower, or vice versa.

As far as I know Rust doesn't support this at the moment; there’s #[cold] for functions that are unlikely to be called, or unreachable_unchecked for code that is never called, but nothing for LLVM’s unpredictable metadata, and no guarantees about what executable code will be generated by LLVM for any given Rust code.

Rather than branchless code, I see some suggestions to essentially use dynamic dispatch for this, which would use a Box<dyn Side> with SideB and SideS implementations. I guess the idea is that the CPU branch predictor won’t be clever enough to predict which implementation will be used, and will never get smart enough in the future. I wouldn’t be surprised if this ended up being slower than just mispredicting branches every now and then, though; the performance would need to be carefully measured to see if it was worth it.

ZiCog · December 28, 2024, 9:07pm

Just for fun here is a branchless selection of SideB or SideS:

//#![feature(rustc_private)]
extern crate rand;

struct SideB {}

impl SideB {
    fn mul(x: i32) -> i32 {
        x * 42
    }
}

struct SideS {}

impl SideS {
    fn mul(x: i32) -> i32 {
        x * 69
    }
}

#[no_mangle]
pub fn branchless_match(b: bool, x: i32) -> i32 {
    (b as i32) * SideB::mul(x) + ((!b) as i32) * SideS::mul(x)
}

fn main() {
    let b: bool = rand::random();

    let res = branchless_match(b, 100);
    println!("{res}");
}

The branchless_match() gets compiled to this arm64 code:

_branchless_match:
    mov  w8, #69         ; Load immediate value 69 into register w8
    mov  w9, #42         ; Load immediate value 42 into register w9
    cmp  w0, #0          ; Compare first parameter (bool) with 0
    csel w8, w9, w8, ne  ; Conditionally select between w9 and w8 based on the comparison
    mul  w0, w8, w1      ; Multiply the selected value with second parameter
    ret                  ; Return the result

No control flow branches there. Just a conditional selection.

Of course I have no idea what the OP actually wants to do so this is just for fun.

Topic		Replies	Views
Specializing over literals/enums at compile time help	6	783	January 7, 2021
Replicate this c++ code to rust	3	525	September 3, 2023
Could the logic of determining exhaustiveness of matches for enum variants be improved? help	22	1220	July 25, 2020
How to branch on generic at compile time? help	12	3558	July 9, 2020
Generic, based on constant value help	4	409	May 16, 2021

How to deal with branch misprediction on random incoming value

Related topics