Unsafe code vs. dancing with the optimizer

I peeked at the recently-const-stabilized slice reverse source code. Here's what it looks like:

#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_const_stable(feature = "const_slice_reverse", since = "1.90.0")]
#[inline]
pub const fn reverse(&mut self) {
    let half_len = self.len() / 2;
    let Range { start, end } = self.as_mut_ptr_range();

    // These slices will skip the middle item for an odd length,
    // since that one doesn't need to move.
    let (front_half, back_half) =
        // SAFETY: Both are subparts of the original slice, so the memory
        // range is valid, and they don't overlap because they're each only
        // half (or less) of the original slice.
        unsafe {
            (
                slice::from_raw_parts_mut(start, half_len),
                slice::from_raw_parts_mut(end.sub(half_len), half_len),
            )
        };

    // Introducing a function boundary here means that the two halves
    // get `noalias` markers, allowing better optimization as LLVM
    // knows that they're disjoint, unlike in the original slice.
    revswap(front_half, back_half, half_len);

    #[inline]
    const fn revswap<T>(a: &mut [T], b: &mut [T], n: usize) {
        debug_assert!(a.len() == n);
        debug_assert!(b.len() == n);

        // Because this function is first compiled in isolation,
        // this check tells LLVM that the indexing below is
        // in-bounds. Then after inlining -- once the actual
        // lengths of the slices are known -- it's removed.
        // FIXME(const_trait_impl) replace with let (a, b) = (&mut a[..n], &mut b[..n]);
        let (a, _) = a.split_at_mut(n);
        let (b, _) = b.split_at_mut(n);

        let mut i = 0;
        while i < n {
            mem::swap(&mut a[i], &mut b[n - 1 - i]);
            i += 1;
        }
    }
}

I noticed that quite a dance goes on between the author and the optimizer to avoid both calling unsafe functions and paying the penalty of bounds checks. See, in particular, the creation of the a and b slices.

I wonder if implementing such a function using lower-level unsafe pointer arithmetic would be an improvement. There's more unsafe code, but it more directly conveys the programmer's intent and avoids dependency on specific optimization passes, which I assume aren't guaranteed.
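For concreteness, here's roughly what I mean by a pointer-based version (a sketch of the alternative I'm imagining, not the standard-library code):

```rust
use std::ptr;

// Hypothetical pointer-arithmetic reverse: the invariants are stated
// directly in the SAFETY comment instead of being encoded for the
// optimizer via slices and split_at_mut.
fn reverse_ptr<T>(s: &mut [T]) {
    let len = s.len();
    let start = s.as_mut_ptr();
    // SAFETY: for every i < len / 2, `start + i` and
    // `start + (len - 1 - i)` are in-bounds and always refer to two
    // distinct elements, so each swap is valid and non-overlapping.
    unsafe {
        let end = start.add(len);
        for i in 0..len / 2 {
            ptr::swap(start.add(i), end.sub(i + 1));
        }
    }
}

fn main() {
    let mut v = [1, 2, 3, 4, 5];
    reverse_ptr(&mut v);
    assert_eq!(v, [5, 4, 3, 2, 1]);
}
```

This trades the safe-but-optimizer-dependent split_at_mut dance for an unsafe block whose correctness argument is entirely local.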

I'm new to Rust, so I would be very interested in hearing experienced folks' opinions here.

2 Likes

It sounds like you understand the trade-offs of the two solutions. I can't say more than the author did not think additional unsafe was worth it.

6 Likes

Git blame is useful for narrowing down the answer to your question.

1 Like

Note that almost nothing is "guaranteed" when you're talking about performance. Even #[inline(always)] is technically just a hint. And even if you manually remove the bounds checks with unsafe code, it still depends on the optimizer doing a bunch of clever vectorization to actually perform well -- but that's an advantage, because it lets the compiler adapt the code to the target CPU in a way that's not feasible (sometimes not even possible) in the library itself.

So while nothing is guaranteed, there are a few patterns that make it easy on the optimizer, and thus reliable in practice.

And to me, the magic of Rust is all about letting you write safe code that optimizes well instead of needing to do the unsafe yourself.

4 Likes

And to me, the magic of Rust is all about letting you write safe code that optimizes well instead of needing to do the unsafe yourself.

I wish that actually were the case, but you can never be too sure what an optimizing compiler will do.
This "optimizer nudging", to me, feels eerily similar to how some programmers justify invoking UB in unsafe languages with "well, I've checked the assembly and it's legal on my target platform".

1 Like

Where do you stop with that thinking, though? Switch to writing everything in assembly because you can "never be too sure"?

It's completely different, because safe code -- like asserts -- isn't a ticking time bomb. Typically, the people abusing UB never actually check it on every toolchain update. (And really, they'd need to check it on every build, even with the same compiler.)

Using asserts to allow optimizing out range checks by ensuring they're against the same local variable is just taking advantage of optimization techniques from the 1980s -- as with lots of things in Rust, really, as you can see in the presentation linked from https://graydon2.dreamwidth.org/247406.html
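A minimal illustration of that pattern (a hypothetical example, not code from this thread):

```rust
// Asserting up front that the lengths match lets the optimizer prove
// that every `b[i]` below is in bounds: the same local `a.len()`
// governs both the assert and the loop, so the per-iteration bounds
// checks become redundant and can be removed.
fn add_assign(a: &mut [u32], b: &[u32]) {
    assert_eq!(a.len(), b.len());
    for i in 0..a.len() {
        a[i] += b[i];
    }
}

fn main() {
    let mut a = [1, 2, 3];
    add_assign(&mut a, &[10, 20, 30]);
    assert_eq!(a, [11, 22, 33]);
}
```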

4 Likes

Sure, they can't cause UB, but they sure do specialize for the LLVM backend.

Have you tried them with the GCC backend? I would expect it to take advantage of them similarly -- just like it doesn't take rocket science for an optimizing backend to remove the divide-by-zero panic when you write if d > 0 { Some(n / d) } else { None }.

Conditions tell useful things to the optimizer, of which it can then take advantage.
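Spelled out as a complete function (a hypothetical illustration of the same point):

```rust
// The branch on `d > 0` tells the backend that the divisor is nonzero
// on the division path, so the implicit divide-by-zero panic check in
// `n / d` is redundant and can be optimized away.
fn checked_div(n: u32, d: u32) -> Option<u32> {
    if d > 0 { Some(n / d) } else { None }
}

fn main() {
    assert_eq!(checked_div(10, 3), Some(3));
    assert_eq!(checked_div(10, 0), None);
}
```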

Note that it would depend very much on your definition of “improvement”.

And then you would have slower code in the end. This may sound counter-intuitive, but… do you know why so many Fortran packages are used in HPC? Because they are often faster than their C analogues. And why are they faster? Because Fortran restricts you more. C can match Fortran with the use of restrict, but it's very hard to use correctly.

And the “tricks” that you want to exclude do the same thing: they give the compiler extra information that enables better code generation. Adding unsafe wouldn't make that work easier; in most cases it would actually make it harder.

You may add some unreachable_unchecked calls to educate the compiler, but at that point you have code that's full of unsafe, with yet more unsafe layered on top. That's rarely an improvement. Yes, sometimes it's worth doing… but not often.
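For example, a sketch of what that looks like (a hypothetical function, not from the standard library):

```rust
use std::hint::unreachable_unchecked;

/// # Safety
/// The caller must guarantee `d != 0`.
unsafe fn div_nonzero(n: u32, d: u32) -> u32 {
    if d == 0 {
        // SAFETY: the caller promised d != 0, so this branch is
        // unreachable; telling the compiler lets it drop the
        // divide-by-zero panic check entirely.
        unsafe { unreachable_unchecked() }
    }
    n / d
}

fn main() {
    // SAFETY: 4 is nonzero.
    let q = unsafe { div_nonzero(12, 4) };
    assert_eq!(q, 3);
}
```

Note that the function itself must now be unsafe: the nonzero-divisor contract has been pushed onto every caller, which is exactly the extra burden being described above.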

I'm not really talking about the asserts; aborting early on a failed precondition is just good coding style.

I'm more displeased by the inner-function-boundary trick, which relies on inlining to not incur a cost.
