How bad is panic (performancewise)?

From one of the release notes:

I started looking at the performance of my Rust based implementation against the original C++ code and made some progress, but I think the series of v0.8.x releases to follow should address this until rs-pbrt kind of matches the speed of the C++ counterpart. So far the highest priority was to be feature complete and a nearly perfect match of the resulting (rendered) images (pixel by pixel).

One aspect I was looking into is panic!(...). In the beginning I didn't care about performance at all. I just wanted to learn the language and I wanted to panic if something goes wrong, e.g.:

impl Index<usize> for RGBSpectrum {
    type Output = Float;
    fn index(&self, index: usize) -> &Float {
        match index {
            0 => &self.c[0],
            1 => &self.c[1],
            2 => &self.c[2],
            _ => panic!("Check failed: i >= 0 && i <= 2"),
        }
    }

An index into a color (or RGBSpectrum) can be expected to be called very often within a renderer. Maybe assert!() would have been a better choice, but my question remains independent of this example:

How bad is panic!(...)?

Can some of the Rust language gurus provide an answer to this, please? I'm looking into many potential bottlenecks and I found some, but with this question I try to separate my concerns into simpler questions. And the resulting code changes should improve the speed of the renderer in a measurable way (via perf and heaptrack).

Thanks in advance,

Jan

1 Like

Technically panic is almost identical to C++ throw (or abort() if you set so).

The cost of panic! is mostly indirect — its existence is seen as a side effect to the compiler, so it will prevent auto-vectorization and some other optimizations that require changing order of the code.

The solution is to help optimizer eliminate branches that contain panic. For example, you could mark the index as #[inline] or #[inline(always)] and when it's called with a constant, the optimizer may remove the panicking branch.

BTW, instead of self.c[0] you could use *self.c.get_unchecked(0), because you've already checked the index and don't want another check with another panic in that [0] impl.

You can also do:

impl Index<RGBEnum> for RGBSpectrum

and index with spectrum[RGBEnum::Red]. Then the invalid case won't exist at all, and there will be nothing to panic about.

5 Likes

Thanks a lot, @kornel. I will try out your suggestions and do some measurements. Makes absolutely sense (your answers) and it is good to know about :wink:

impl Index for RGBSpectrum

Not sure if I should open a separate question about this or not, but if I would use an enum,
I might want to iterate through it's values like here:

        let sigma_t: Spectrum = *sigma_a + *sigma_s;
        let mut rho: Spectrum = Spectrum::new(0.0 as Float);
        for c in 0..3 {
            if sigma_t[c] != 0.0 as Float {
                rho.c[c] = sigma_s[c] / sigma_t[c];
            } else {
                rho.c[c] = 0.0 as Float;
            }
        }

I found a question on Stack Overflow which suggests in one of the answers the crate strum. Would you agree with that answer (in this case) or is there a better way (I could just cut and paste the code three times, adjust it, and be happy with that)?

It's #[inline(always)]

I would inspect the assembly with godbolt or cargo asm before using the unchecked versions. Then measure before and after you apply optimizations to see if they were necessary andd if they worked!

Yep, strum is a nice crate for iterating over enums.

1 Like

Don't use for _ in 0..len loops, they're usually the slowest form that is hardest to optimize.

For example, you could use:

for (rho, sigma) in rho.iter_mut().zip(sigma.iter()) {
   *rho = *sigma;
}

This would work if you implemented Deref<Target=[Float; 3]> on Spectrum (or AsRef<[Float; 3]> or your own iterator if you have to).

1 Like

Thanks, @RustyYato, for your answers.

Good to know, I will try that in some other places.

I used an enum for the RGB indices and the strum crate. Seems to work fine. I haven't measured the performance against the previous version though, but I think I will do some other changes and investigate if the performance got better tomorrow ...

Thanks @kornel.

The original C++ code is more complicated:

template <int nSpectrumSamples>
class CoefficientSpectrum {
  public:
...
  protected:
    // CoefficientSpectrum Protected Data                                                                         
    Float c[nSpectrumSamples];
};
...
class RGBSpectrum : public CoefficientSpectrum<3> {
    using CoefficientSpectrum<3>::c;

  public:
...
};

But I did change a lot of the for _ in 0..len loops in other places. I just have to go through the source code more carefully (again). I think clippy suggested a couple of places and back then I changed a lot of them, but not all. It was just too much to digest.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.