How is the std operator overload impl not recursion?

sydan957 · December 14, 2023, 9:13am

I am looking into Rust and came across the following code in arith.rs, and my stupid brain cannot see how this is not recursion:

macro_rules! add_impl {
    ($($t:ty)*) => ($(
        #[stable(feature = "rust1", since = "1.0.0")]
        impl Add for $t {
            type Output = $t;

            #[inline]
            #[track_caller]
            #[rustc_inherit_overflow_checks]
            fn add(self, other: $t) -> $t { self + other }
        }

        forward_ref_binop! { impl Add, add for $t, $t }
    )*)
}

Are + and fn add the same? I guess there must be some compiler magic that detects this + and maps it to an actual binary add instruction?

vague · December 14, 2023, 9:23am

The Add trait is a lang item, and that's the magic, I think.

#[lang = "add"]
pub trait Add<Rhs = Self> { }

And it's possible to replace it with your own counterpart.

When you put #[lang = "add"] on a trait, the compiler knows to call YourTrait::add(x, y) when it encounters the addition operator. Of course, usually the compiler will already have been told about such a trait since libcore is usually the first library in the pipeline.
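A minimal sketch of the user-facing side of this, on stable Rust and without touching #[lang] itself. The Meters type and the two helper functions are hypothetical names made up for illustration. For a non-primitive type, a + b and Add::add(a, b) end up in the same impl:

```rust
use std::ops::Add;

// Hypothetical wrapper type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(i32);

impl Add for Meters {
    type Output = Meters;

    // `self.0 + other.0` is i32 + i32, so it does not call back into
    // this impl; there is no recursion here.
    fn add(self, other: Meters) -> Meters {
        Meters(self.0 + other.0)
    }
}

// Uses the operator; for `Meters` this resolves to the `Add` impl above.
fn sum_with_operator(a: Meters, b: Meters) -> Meters {
    a + b
}

// Calls the trait method explicitly.
fn sum_with_method(a: Meters, b: Meters) -> Meters {
    Add::add(a, b)
}

fn main() {
    let (a, b) = (Meters(2), Meters(3));
    assert_eq!(sum_with_operator(a, b), sum_with_method(a, b));
    println!("{:?}", sum_with_operator(a, b));
}
```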

And the official introduction to lang items: Lang Items - Rust Compiler Development Guide

DanielKeep · December 14, 2023, 9:30am

Yes, except...

...addition for built-in types like i32, as you suspected, doesn't go through Add::add and instead maps directly to a built-in operation. The Add implementations exist for consistency; without them, you couldn't use i32 in a generic function that wants to add values together, for example.
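A small sketch of the generic case, assuming a made-up helper called add_twice. The T: Add bound is only satisfied for i32 and f64 because the impl Add blocks from the macro exist:

```rust
use std::ops::Add;

// Hypothetical helper: works for any `Copy` type whose `Add` impl
// returns the same type, including primitives such as i32 and f64.
fn add_twice<T>(x: T, y: T) -> T
where
    T: Add<Output = T> + Copy,
{
    // Inside the generic body, `+` goes through the `Add` trait bound.
    x + y + y
}

fn main() {
    println!("{}", add_twice(1_i32, 2_i32));
    println!("{}", add_twice(1.5_f64, 0.25_f64));
}
```

Without impl Add for i32, the call add_twice(1_i32, 2_i32) would be rejected because the trait bound would not hold.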

SebastianJL · December 14, 2023, 10:07am

I wonder why it doesn't though. Couldn't the trait impl say builtin_add(self, other) and then + could unconditionally map to Add::add? I don't know if it would be better or worse, I just wonder why it isn't that way.

DanielKeep · December 14, 2023, 10:35am

My guess: because then the compiler would have to chew through an extra layer of indirection for every primitive type addition, subtraction, multiplication, etc. in all code everywhere. This way, it can pre-emptively shortcut that and just directly insert the operation it was going to use anyway.

One of the long-standing issues with compile speed has been that rustc generates too damn much LLVM bitcode. I suspect your suggestion would make that worse. So probably not an improvement.

khimru · December 14, 2023, 10:53am

So you are replacing one magical construct with another.

Less code generated for LLVM, faster compile time, the same amount of magic involved.

SebastianJL · December 14, 2023, 11:26am

Yes indeed. But it would be a little more obvious what's happening, and at least consistent that + is always Add. Also, there would be no Add implementations that don't actually get called. But from what you guys write, the current approach seems to be the much better solution from a performance standpoint.

farnz · December 14, 2023, 4:48pm

Effectively Rust does have two implementations of + (and other operators):

For certain types such as i32, + is a built-in, and is implemented directly by the compiler. It's equivalent to builtin_add(self, other), just spelt self + other.
For other types, and pre-monomorphization for generics, it's syntax sugar for <T as Add>::add(self, other). If T happens to be a type where + is a built-in, then the implementation of <T as Add>::add uses the built-in (and will always be able to do this, because the implementation is monomorphized by this point).

Doing it this way has the advantage that where Rust knows that you've got to be using the built-in, it can generate code directly, without having to go through the abstraction and optimize it out. The disadvantage is that it's hard for people to see what's going on, since the same spelling is used for two different operations.

quinedot · December 14, 2023, 5:15pm

I prefer to say "notionally syntax sugar", as there are subtle differences between many operator expressions and the corresponding trait method invocations (which can also be surprising when you're deep diving on an operator, especially since the documentation likes to throw around the term "equivalent" too).

tczajka · December 14, 2023, 5:37pm

Is this significant in terms of compile time? I doubt it. It's inlining one very simple function.

If it is difficult for the compiler to optimize it for some reason, the optimization could be special-cased in the compiler to work around it without affecting the semantics of the language, so it seems like a dubious rationale.

farnz · December 14, 2023, 5:53pm

It is special-cased in the compiler already: given a, b: i32, a + b is compiled immediately to the right thing, without indirection via impl Add<i32> for i32. This works out exactly the same as having a builtin_add and special-casing in the compiler.

SkiFire13 · December 14, 2023, 6:35pm

FYI this is also how Deref for Box and references, and Box::new are implemented.

tczajka · December 14, 2023, 6:35pm

Not sure about Add, but for AddAssign there is definitely a difference in semantics, so it's not exactly the same:

let mut x = 5;
x += { x = 1; 2 };  // works
x.add_assign({x = 1; 2});  // doesn't work

SebastianJL · December 14, 2023, 7:59pm

Maybe paste the error when you post something like this.

Compiling playground v0.0.1 (/playground)
error[E0689]: can't call method `add_assign` on ambiguous numeric type `{integer}`
 --> src/main.rs:7:7
  |
7 |     x.add_assign({
  |       ^^^^^^^^^^
  |
help: you must specify a type for this binding, like `i32`
  |
2 |     let mut x: i32 = 5;
  |              +++++

For more information about this error, try `rustc --explain E0689`.
error: could not compile `playground` (bin "playground") due to previous error

tczajka · December 14, 2023, 8:18pm

This is the error (playground):

error[E0506]: cannot assign to `x` because it is borrowed
 --> src/main.rs:6:20
  |
6 |     x.add_assign({ x = 1; 2 }); // doesn't work
  |     - ----------   ^^^^^ `x` is assigned to here but it was already borrowed
  |     | |
  |     | borrow later used by call
  |     `x` is borrowed here

The borrows happen in a different order for the operator with built-in types vs the method call. This demonstrates that the operator for built-in types behaves differently from desugaring to the method call.
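A sketch of the two forms that do compile, with x annotated as i32 so the ambiguous-type error from earlier does not get in the way. The function names are made up for illustration. The operator form relies on the rule in the Rust Reference that, for primitive operands of a compound assignment, the right-hand side is evaluated before the left-hand place. The method form only compiles here because the right-hand side is computed before x is mutably borrowed:

```rust
use std::ops::AddAssign;

// Operator form on a primitive: the block on the right runs first
// (setting x to 1), then 2 is added to x.
#[allow(unused_assignments)]
fn operator_form() -> i32 {
    let mut x: i32 = 5;
    x += { x = 1; 2 };
    x
}

// Method form: `x.add_assign(..)` borrows x mutably before evaluating
// its argument, so the assignment has to be hoisted out of the call.
#[allow(unused_assignments)]
fn method_form() -> i32 {
    let mut x: i32 = 5;
    let rhs = { x = 1; 2 };
    x.add_assign(rhs);
    x
}

fn main() {
    println!("{} {}", operator_form(), method_form());
}
```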

scottmcm · December 14, 2023, 11:42pm

It's not particularly difficult, but it's pervasive, so it might matter. Notably, at the MIR level, a function call terminates a basic block, which a binary operator on a primitive does not.

So it's not just the function calls, but also whether return a + b + c is one basic block or four.

EDIT: Brainfart; ignore this

Also, Add::add takes references, so it's the difference between

c = a + b

and

bb1:
let temp1 = &a;
let temp2 = &b;
let temp3 = Add::add(temp1, temp2) goto bb2 cleanup bb3;

bb2:
c = temp3;

bb3:
resume;

And the references are much harder to remove, practically, than the inlining. (As always, indirection is harder than by-value.)

tczajka · December 15, 2023, 12:02am

It doesn't, it takes operands by value.

scottmcm · December 15, 2023, 1:04am

Doh, you're of course correct there.

What I wrote is why it's important that < be a primitive, not why + should be.
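To make the by-value versus by-reference distinction concrete, here is a small sketch using the explicit trait method calls. The wrapper function names are hypothetical. Add::add takes self and the right-hand side by value, while PartialOrd::lt takes both operands by reference:

```rust
use std::ops::Add;

// `Add::add(self, rhs)` takes both operands by value.
fn add_by_value(a: i32, b: i32) -> i32 {
    Add::add(a, b)
}

// `PartialOrd::lt(&self, &other)` takes both operands by reference,
// so the explicit method call needs `&a` and `&b`.
fn less_than_by_ref(a: i32, b: i32) -> bool {
    PartialOrd::lt(&a, &b)
}

fn main() {
    println!("{}", add_by_value(2, 3));
    println!("{}", less_than_by_ref(2, 3));
}
```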

