No clue how to do reference arithmetic

I'm trying to implement arithmetic operations on references, but to no avail. How would I get this lerp function to work?

pub fn lerp_unclamped<'a, T>(a: &'a T, b: &'a T, t: f32) -> T
where &'a T : Add<&'a T, Output = T> + Sub<&'a T, Output = T> + Mul<f32, Output = T> {
    a + (b - a) * t
}

Assuming this code is used with a type like:

pub struct Vec4 {
    x : f32,
    y : f32,
    z : f32,
    w : f32,
}

If we line up the expressions in the order of operations...[1]

        a + (b - a) * t
    //       ^   ^       &'a T - &'a T = T  (&'a T: Sub<&'a T, Output = T>)
    //      ^^^^^^^   ^  T * f32 = T        (T: Mul<f32, Output = T>)
    //  ^   ^^^^^^^^^^^  &'a T + T = T      (&'a T: Add<T, Output = T>)

...we can determine the required bounds.

use std::ops::{Add, Mul, Sub};
pub fn lerp_unclamped<'a, T>(a: &'a T, b: &'a T, t: f32) -> T
where
    &'a T : Sub<&'a T, Output = T> + Add<T, Output = T>,
    T: Mul<f32, Output = T>,
{
    a + (b - a) * t
}
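
To try these bounds out, here is a minimal sketch. The operator impls for Vec4 are my own assumption about how the type might be set up (the thread doesn't show them), and the fields are made pub purely for illustration.

```rust
use std::ops::{Add, Mul, Sub};

#[derive(Debug, Clone, PartialEq)]
pub struct Vec4 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub w: f32,
}

// Hypothetical impl: &Vec4 - &Vec4 = Vec4
impl<'a, 'b> Sub<&'b Vec4> for &'a Vec4 {
    type Output = Vec4;
    fn sub(self, rhs: &'b Vec4) -> Vec4 {
        Vec4 { x: self.x - rhs.x, y: self.y - rhs.y, z: self.z - rhs.z, w: self.w - rhs.w }
    }
}

// Hypothetical impl: &Vec4 + Vec4 = Vec4 (the right-hand side is the owned temporary)
impl<'a> Add<Vec4> for &'a Vec4 {
    type Output = Vec4;
    fn add(self, rhs: Vec4) -> Vec4 {
        Vec4 { x: self.x + rhs.x, y: self.y + rhs.y, z: self.z + rhs.z, w: self.w + rhs.w }
    }
}

// Hypothetical impl: Vec4 * f32 = Vec4
impl Mul<f32> for Vec4 {
    type Output = Vec4;
    fn mul(self, t: f32) -> Vec4 {
        Vec4 { x: self.x * t, y: self.y * t, z: self.z * t, w: self.w * t }
    }
}

pub fn lerp_unclamped<'a, T>(a: &'a T, b: &'a T, t: f32) -> T
where
    &'a T: Sub<&'a T, Output = T> + Add<T, Output = T>,
    T: Mul<f32, Output = T>,
{
    // (b - a) yields an owned T, which is then scaled and added to the reference `a`.
    a + (b - a) * t
}
```

With these impls, lerp_unclamped(&a, &b, 0.5) type-checks, because each of the three operator uses lines up with one of the bounds in the where clause.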

  1. I'm assuming T is always the output

IMHO you should just require Copy and do T: Copy + Add<Output = T> + Sub<Output = T> and such.
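
Roughly what I have in mind, as a sketch rather than a drop-in: the by-value operator impls on Vec4 below are assumptions for illustration, and deriving Copy on it is exactly the change I'm suggesting.

```rust
use std::ops::{Add, Mul, Sub};

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Vec4 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub w: f32,
}

// Hypothetical by-value impls; no lifetimes needed anywhere.
impl Add for Vec4 {
    type Output = Vec4;
    fn add(self, r: Vec4) -> Vec4 {
        Vec4 { x: self.x + r.x, y: self.y + r.y, z: self.z + r.z, w: self.w + r.w }
    }
}

impl Sub for Vec4 {
    type Output = Vec4;
    fn sub(self, r: Vec4) -> Vec4 {
        Vec4 { x: self.x - r.x, y: self.y - r.y, z: self.z - r.z, w: self.w - r.w }
    }
}

impl Mul<f32> for Vec4 {
    type Output = Vec4;
    fn mul(self, t: f32) -> Vec4 {
        Vec4 { x: self.x * t, y: self.y * t, z: self.z * t, w: self.w * t }
    }
}

pub fn lerp_unclamped<T>(a: T, b: T, t: f32) -> T
where
    T: Copy + Add<Output = T> + Sub<Output = T> + Mul<f32, Output = T>,
{
    // `a` is used twice, which is why the Copy bound is needed.
    a + (b - a) * t
}
```

A side benefit is that the same function then works on plain f32 as well, since f32 already satisfies all of these bounds.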

Are you really using it on any types that shouldn't just be Copy anyway?

Wouldn't that result in at least 4 to 5 unintentional copy operations on the values though? I find that really inefficient.

It would be a very unusual numeric type which both implements Copy and which is large enough in memory that copying it is more expensive than accessing memory through a pointer.

(But in most cases, a function like this is a candidate for inlining, and once inlined, the reference vs. copy distinction is more or less entirely gone — the optimizer will transform the code to whatever it thinks is the most efficient form.)

Considering that I must create a temporary value anyway due to the requirement of the arithmetic operators, I guess there's no avoiding the copies. Going forward, should I just implement Copy on these kinds of types then?

Don't guess; look at godbolt and see. Do you have a runnable example of the code?

And remember that, on a 64-bit machine, an f32 is half the size of a reference, so copying it is no more expensive than passing a pointer. Anything the size of two pointers or smaller is almost always better passed by value than indirected through memory.
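
If you want to check the sizes yourself, something like this would do. It's only a sketch: report_sizes is a made-up helper name, and I've put #[repr(C)] on Vec4 (an assumption, the original struct doesn't have it) so the layout is guaranteed rather than merely what the compiler happens to pick.

```rust
use std::mem::size_of;

#[repr(C)]
pub struct Vec4 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub w: f32,
}

/// Returns the sizes in bytes of (f32, &Vec4, Vec4).
pub fn report_sizes() -> (usize, usize, usize) {
    (size_of::<f32>(), size_of::<&Vec4>(), size_of::<Vec4>())
}
```

On a 64-bit target the reference is 8 bytes, so a Vec4 of four f32s comes to 16 bytes, which is exactly the two-pointer size mentioned above.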

The optimizer is great. Even something like calling array::from_fn to make a new array, going through a closure and such, ends up optimizing down to just a SIMD operation; see https://rust.godbolt.org/z/Y1hbWPTMq, for example. That also shows that passing it owned or by-reference doesn't matter in that case -- LLVM ends up turning them into the exact same code on the default x64 target.
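
For anyone who wants to paste something into godbolt themselves, here is a pair of functions in the same spirit. This is my own sketch, not necessarily the code behind the link, and the names add_owned and add_by_ref are made up for the comparison.

```rust
use std::array;

/// Element-wise add, taking both arrays by value.
pub fn add_owned(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    array::from_fn(|i| a[i] + b[i])
}

/// The same operation, taking both arrays by reference.
pub fn add_by_ref(a: &[f32; 4], b: &[f32; 4]) -> [f32; 4] {
    array::from_fn(|i| a[i] + b[i])
}
```

Compile both with optimizations on (for example -C opt-level=3) and compare the assembly for the two functions side by side.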

Yes, you might be correct. I may have fallen prey to premature optimization in this regard.