Is the function splitting optimization effective?

For example, I'm considering rewriting

fn eq<T: Eq>(a: &[T], b: &[T]) -> bool {
    if a.len() != b.len() {
        return false;
    }

    // element-wise comparison
}

to

fn eq<T: Eq>(a: &[T], b: &[T]) -> bool {
    if a.len() != b.len() {
        return false;
    }

    elem_wise_cmp(a, b)
}

fn elem_wise_cmp<T: Eq>(a: &[T], b: &[T]) -> bool {
    // element-wise comparison
}

I expect that eq is inlined and elem_wise_cmp isn't, so we avoid the function call cost when the lengths differ and limit the code size growth that inlining the whole comparison would cause.

If the early-return condition is cheap enough to compute, the early return is likely enough to happen, and the split-off part is big enough, is this optimization effective?

If so, is the above example effective?

You can add #[inline] to functions you'd prefer inlined and #[inline(never)] to functions you don't want inlined. Otherwise it's up to the optimizer to guess what you mean.
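As a minimal sketch of that suggestion applied to the example from the question: the element-wise body below (iter/zip/all) is an assumption, since the original only has a placeholder comment there. Both attributes express a preference to the optimizer rather than a strict guarantee in every build configuration.

```rust
/// Cheap outer function: hinted for inlining so that the length check
/// can end up directly at the call site.
#[inline]
pub fn eq<T: Eq>(a: &[T], b: &[T]) -> bool {
    if a.len() != b.len() {
        return false;
    }

    elem_wise_cmp(a, b)
}

/// Larger inner function: asked to stay out of line so its body is not
/// duplicated at every call site of `eq`.
/// The body is an assumed implementation of "element-wise comparison".
#[inline(never)]
fn elem_wise_cmp<T: Eq>(a: &[T], b: &[T]) -> bool {
    a.iter().zip(b.iter()).all(|(x, y)| x == y)
}
```

With this split, a call such as eq(&x, &y) on slices of different lengths returns false at the length check without ever reaching elem_wise_cmp.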

@kornel
Does the optimizer inline functions partially?

Did you profile your code? I read recently (can't remember where, might have been on this forum) that it is better to first get stuff working and then, if it is too slow for your purposes, profile it to see where the slow spots are. Otherwise you waste a lot of time trying to optimize stuff which is fast enough already, and you introduce bugs with your fancy algorithms.

With

#[inline]
fn eq<T: Eq>(a: &[T], b: &[T]) -> bool {
    if a.len() != b.len() {
        return false;
    }

    elem_wise_cmp(a, b)
}

fn elem_wise_cmp<T: Eq>(a: &[T], b: &[T]) -> bool {
    // element-wise comparison
}

the compiler is given a hint to inline the body of eq. This means that a call eq(a, b) is very likely to be replaced at the call site with

if a.len() != b.len() { // the slice len() getter is inlined too
    false
} else {
    elem_wise_cmp(a, b) // <- this could be inlined too
}

Now, elem_wise_cmp() may also be inlined if the compiler deems it worthwhile.

So, yes, by splitting the function you give the compiler more room to manoeuvre. Adding inlining hints, however, should always be a carefully considered decision. I, for instance, only mark zero-cost abstraction functions such as struct constructors or getters / setters with #[inline].

Yes, it can be effective, especially if the split is between a generic outer function that calls a non-generic inner function.

For example, from std::fs:

https://github.com/rust-lang/rust/blob/597f432489f12a3f3341/src/libstd/fs.rs#L260-L269
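A rough sketch of that generic-outer / non-generic-inner shape follows. The function name count_components and its body are hypothetical and chosen only for illustration; this is not the code behind the link. The idea is that the generic wrapper only performs the cheap conversion, so the bulk of the work is compiled once rather than once per caller type.

```rust
use std::path::Path;

/// Hypothetical example: the generic outer function is monomorphized for
/// every `P`, but it only converts its argument and forwards it.
pub fn count_components<P: AsRef<Path>>(path: P) -> usize {
    /// Non-generic inner function: compiled once, regardless of how many
    /// different `P` types the outer function is called with.
    fn inner(path: &Path) -> usize {
        path.components().count()
    }

    inner(path.as_ref())
}
```

Callers can pass a &str, a String, or a PathBuf to count_components, and each instantiation should amount to little more than a call into the single shared inner.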

Thanks a lot!
I will use what you told me!
