Pushing array allocation down into branches makes code significantly slower

I have some Rust code like the following (a simplified illustration).

fn g() {
    let arr: ArrayVec<usize, 3> = gen_arr(id);
    match v {
        c1 => f1(&arr),
        c2 => f2(&arr),
        _ => f3(&arr),
    }
}

fn f1(arr: &[usize]) {
    match v {
        c1 => f11(&arr),
        c2 => f12(&arr),
        c3 => f13(&arr),
    }
}

fn f2(arr: &[usize]) {
    match v {
        c1 => f21(&arr),
        c2 => f22(&arr),
        c3 => f23(&arr),
    }
}

fn f3(arr: &[usize]) {
    match v {
        c1 => f31(&arr),
        c2 => f32(&arr),
        c3 => f33(&arr),
    }
}

I then converted it to:

fn g() {
    match v {
        c1 => f1(),
        c2 => f2(),
        _ => f3(),
    }
}

fn f1() {
    match v {
        c1 => f11(),
        c2 => f12(),
        c3 => f13(),
    }
}

fn f2() {
    match v {
        c1 => f21(),
        c2 => f22(),
        c3 => f23(),
    }
}

fn f3() {
    match v {
        c1 => f31(),
        c2 => f32(),
        c3 => f33(),
    }
}

where f11, f12, f13, f21, f22, f23, f31, f32 and f33 each calculate arr exactly once internally.
The ArrayVec arr contains exactly 3 usize integers,
and the function gen_arr takes about 30 CPU cycles to generate it.
The function g takes about 500k CPU cycles to run.

Benchmarks show that the second version is significantly (about 7.5%) slower.
What could be causing this?

Speculation: in the first case, the lack of dependency between computing arr and branching on v is obvious and exploited by the optimizer and CPU. In the second case, it is not.

But to get non-speculative answers, you will need to examine the generated assembly for the real program.
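
As a starting point for that kind of inspection, here is a minimal sketch of the two shapes reduced to something that compiles on its own. Everything in it is an assumption made for illustration: a plain `[usize; 3]` stands in for the ArrayVec, `gen_arr` and `leaf` are hypothetical stand-ins for the real functions, and `v` is passed as a `u8`. `#[inline(never)]` keeps each function visible as its own symbol in the assembly output, and `std::hint::black_box` discourages the compiler from constant-folding the inputs.

```rust
use std::hint::black_box;

// Hypothetical stand-in for the original `gen_arr`; a plain array replaces
// ArrayVec so the sketch needs no external crates.
#[inline(never)]
fn gen_arr(id: usize) -> [usize; 3] {
    [id, id.wrapping_mul(3), id.wrapping_add(7)]
}

// Hypothetical leaf work: sums the slice, starting from `k`.
#[inline(never)]
fn leaf(arr: &[usize], k: usize) -> usize {
    arr.iter().fold(k, |acc, &x| acc.wrapping_add(x))
}

// Version 1: the array is built once, before any branching.
#[inline(never)]
pub fn g_hoisted(id: usize, v: u8) -> usize {
    let arr = gen_arr(id);
    match v {
        0 => leaf(&arr, 1),
        1 => leaf(&arr, 2),
        _ => leaf(&arr, 3),
    }
}

// Version 2: each branch builds the array itself.
#[inline(never)]
pub fn g_pushed_down(id: usize, v: u8) -> usize {
    match v {
        0 => leaf(&gen_arr(id), 1),
        1 => leaf(&gen_arr(id), 2),
        _ => leaf(&gen_arr(id), 3),
    }
}

fn main() {
    for v in 0..3u8 {
        let a = g_hoisted(black_box(42), black_box(v));
        let b = g_pushed_down(black_box(42), black_box(v));
        println!("v={v}: hoisted={a}, pushed_down={b}");
    }
}
```

Compiling this with `rustc -O --emit asm` (or `cargo rustc --release -- --emit asm` inside a Cargo project) and comparing the bodies of `g_hoisted` and `g_pushed_down` shows what the optimizer actually did with each shape. In a toy this small the two may well compile to nearly the same code, so any real conclusion has to come from the assembly of the actual program.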

I also had a speculative idea related to branch misprediction: the second version has (slightly) larger branches, so a misprediction is more expensive. What this idea cannot explain, however, is that the cost of calculating arr is very small compared to the existing branches.
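
To put rough numbers on that mismatch, here is a small back-of-the-envelope sketch that uses only the figures quoted in this thread (about 30 cycles for gen_arr, about 500k cycles for g, about 7.5% slowdown). The helper names are hypothetical, and integer arithmetic is used so the results are exact.

```rust
/// Extra cycles implied by a slowdown given in tenths of a percent (75 = 7.5%).
fn extra_cycles(total_cycles: u64, slowdown_permille: u64) -> u64 {
    total_cycles * slowdown_permille / 1000
}

/// Share of `part` in `total`, in parts per million.
fn share_ppm(part: u64, total: u64) -> u64 {
    part * 1_000_000 / total
}

fn main() {
    let gen_arr_cycles = 30; // approximate cost of gen_arr, from the thread
    let g_cycles = 500_000; // approximate cost of g, from the thread

    let extra = extra_cycles(g_cycles, 75);
    println!("extra cycles per call of g: {extra}");
    println!("gen_arr share of g: {} ppm", share_ppm(gen_arr_cycles, g_cycles));
    println!("slowdown / gen_arr cost: {}x", extra / gen_arr_cycles);
}
```

With these inputs the slowdown works out to 37,500 extra cycles per call of g, roughly 1,250 times the cost of a single gen_arr call, while gen_arr itself accounts for about 0.006% of g. Since arr is still computed exactly once per call in both versions, the extra time has to come from a side effect of the restructuring (code layout, inlining decisions, register pressure and the like) rather than from computing arr itself.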