How is the std operator overload impl not recursive?

I am looking into Rust and found the following code in arith.rs, and I cannot see how this is not recursion:

macro_rules! add_impl {
    ($($t:ty)*) => ($(
        #[stable(feature = "rust1", since = "1.0.0")]
        impl Add for $t {
            type Output = $t;

            #[inline]
            #[track_caller]
            #[rustc_inherit_overflow_checks]
            fn add(self, other: $t) -> $t { self + other }
        }

        forward_ref_binop! { impl Add, add for $t, $t }
    )*)
}

Are + and fn add the same? I guess there must be some compiler magic that detects this + and maps it to the actual binary add instruction?

The Add trait is a lang item, and I think that's the magic.

#[lang = "add"]
pub trait Add<Rhs = Self> { }

And it's possible to replace it with your own counterpart.

When you put #[lang = "add"] on a trait, the compiler knows to call YourTrait::add(x, y) when it encounters the addition operator. Of course, usually the compiler will already have been told about such a trait since libcore is usually the first library in the pipeline.

And the official introduction about lang items: Lang Items - Rust Compiler Development Guide

Yes, except...

...addition for built-in types like i32, as you suspected, doesn't go through Add::add and instead maps directly to a built-in operation. The Add implementations exist for consistency; without them, you couldn't use i32 in a generic function that wants to add values together, for example.
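
To illustrate the generic case, here is a minimal sketch. The helper name sum_pair is made up for this example; it only compiles for i32 and f64 because the Add impls for those types exist:

```rust
use std::ops::Add;

// Hypothetical helper (not from std): accepts any type that implements Add,
// which includes primitives such as i32 and f64.
fn sum_pair<T: Add<Output = T>>(a: T, b: T) -> T {
    // Inside a generic function the compiler does not yet know whether T is
    // a primitive, so this `+` is resolved through the Add trait.
    a + b
}
```

Calling sum_pair(2i32, 3i32) or sum_pair(1.5f64, 2.0f64) goes through the trait bound, which is the consistency argument made above.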

I wonder why it doesn't though. Couldn't the trait impl say builtin_add(self, other) and then + could unconditionally map to Add::add? I don't know if it would be better or worse, I just wonder why it isn't that way.

My guess: because then the compiler would have to chew through an extra layer of indirection for every primitive type addition, subtraction, multiplication, etc. in all code everywhere. This way, it can pre-emptively shortcut that and just directly insert the operation it was going to use anyway.

One of the long-standing issues with compile speed has been that rustc generates too damn much LLVM bitcode. I suspect your suggestion would make that worse. So probably not an improvement.

So you are replacing one magical construct with another.

Less code generated for LLVM, faster compile time, the same amount of magic involved.

Yes indeed. But it would be a little more obvious what's happening, and at least consistent that + is always Add. There would also be no Add implementations that don't actually get called. But from what you all write, the current approach seems to be the far better solution from a performance standpoint.

Effectively Rust does have two implementations of + (and other operators):

  1. For certain types such as i32, + is a built-in, and is implemented directly by the compiler. It's equivalent to builtin_add(self, other), just spelt self + other.
  2. For other types, and pre-monomorphization for generics, it's syntax sugar for <T as Add>::add(self, other). If T happens to be a type where + is a built-in, then the implementation of <T as Add>::add uses the built-in (and will always be able to do this, because the implementation is monomorphized by this point).

Doing it this way has the advantage that where Rust knows that you've got to be using the built-in, it can generate code directly, without having to go through the abstraction and optimize it out. The disadvantage is that it's hard for people to see what's going on, since the same spelling is used for two different operations.
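
A small sketch of the two cases, using a made-up newtype Meters (an assumption for illustration, not something from std):

```rust
use std::ops::Add;

// Hypothetical user-defined type; `+` on it goes through the Add trait.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(i32);

impl Add for Meters {
    type Output = Meters;

    fn add(self, other: Meters) -> Meters {
        // `self.0 + other.0` is i32 + i32, i.e. the built-in operation,
        // so this body does not call Meters::add again.
        Meters(self.0 + other.0)
    }
}

// Case 2 from the list: the operator on a non-built-in type...
fn via_operator(a: Meters, b: Meters) -> Meters {
    a + b
}

// ...and the trait method call it is (notionally) sugar for.
fn via_trait(a: Meters, b: Meters) -> Meters {
    <Meters as Add>::add(a, b)
}
```

Both functions return the same value for the same inputs. The impl in arith.rs has the same shape as Meters::add, except that self and other are already the primitive.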

I prefer to say "notionally syntax sugar", as there are subtle differences between many operator expressions and the corresponding trait method invocations (which can also be surprising when you're deep diving on an operator, especially since the documentation likes to throw around the term "equivalent" too).

Is this significant in terms of compile time? I doubt it. It's inlining one very simple function.

If it is difficult for the compiler to optimize it for some reason, the optimization could be special-cased in the compiler to work around it without affecting the semantics of the language, so it seems like a dubious rationale.

It is special-cased in the compiler already: given a, b: i32, a + b is compiled immediately to the right thing, without indirection via impl Add<i32> for i32. This works out exactly the same as having a builtin_add and special-casing in the compiler.

FYI this is also how Deref for Box and references, and Box::new are implemented.
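
One visible consequence for Box, as a rough sketch (the function names are invented for this example): the built-in `*` on a Box can move the value out, which the Deref trait method cannot do because it only hands back a reference.

```rust
use std::ops::Deref;

// Built-in dereference of a Box: moves the String out of the heap allocation.
fn move_out_of_box(b: Box<String>) -> String {
    *b
}

// The trait method only yields a shared reference to the contents.
fn borrow_via_trait(b: &Box<String>) -> &String {
    Deref::deref(b)
}
```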

Not sure about Add, but for AddAssign there is definitely a difference in semantics, so it's not exactly the same:

let mut x = 5;
x += { x = 1; 2 };  // works
x.add_assign({x = 1; 2});  // doesn't work

Maybe paste the error when you post something like this.

Compiling playground v0.0.1 (/playground)
error[E0689]: can't call method `add_assign` on ambiguous numeric type `{integer}`
 --> src/main.rs:7:7
  |
7 |     x.add_assign({
  |       ^^^^^^^^^^
  |
help: you must specify a type for this binding, like `i32`
  |
2 |     let mut x: i32 = 5;
  |              +++++

For more information about this error, try `rustc --explain E0689`.
error: could not compile `playground` (bin "playground") due to previous error

This is the error (playground):

error[E0506]: cannot assign to `x` because it is borrowed
 --> src/main.rs:6:20
  |
6 |     x.add_assign({ x = 1; 2 }); // doesn't work
  |     - ----------   ^^^^^ `x` is assigned to here but it was already borrowed
  |     | |
  |     | borrow later used by call
  |     `x` is borrowed here

The borrows happen in a different order for the operator with built-in types vs the method call. This demonstrates that the operator for built-in types behaves differently from desugaring to the method call.
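
Here is a sketch of both forms in a shape that compiles, assuming x is annotated as i32 to avoid the E0689 ambiguity shown earlier. The function names are made up. As far as I understand the Rust Reference, for primitive operands the right-hand side of a compound assignment is evaluated first, so the operator form ends with x == 3:

```rust
use std::ops::AddAssign;

#[allow(unused_assignments)]
fn builtin_operator() -> i32 {
    let mut x: i32 = 5;
    // Built-in `+=`: the block on the right runs first (x becomes 1),
    // then 2 is added to the current value of x.
    x += { x = 1; 2 };
    x
}

#[allow(unused_assignments)]
fn method_call() -> i32 {
    let mut x: i32 = 5;
    // `x.add_assign({ x = 1; 2 })` is rejected with E0506 because `&mut x`
    // is taken before the argument is evaluated. Hoisting the argument into
    // its own statement sidesteps the conflicting borrow.
    let rhs = { x = 1; 2 };
    x.add_assign(rhs);
    x
}
```

With the argument hoisted, the method call gives the same result as the operator; only the single-expression form differs.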

It's not particularly difficult, but it's pervasive, so it might matter. Notably, at the MIR level, a function call terminates a basic block, which a binary operator on a primitive does not.

So it's not just the function calls, but also whether return a + b + c is one basic block or four.

EDIT: Brainfart; ignore this

Also, Add::add takes references, so it's the difference between

c = a + b

and

bb1:
let temp1 = &a;
let temp2 = &b;
let temp3 = Add::add(temp1, temp2) goto bb2 cleanup bb3;

bb2:
c = temp3;

bb3:
resume;

And the references are much harder to remove, practically, than the inlining. (As always, indirection is harder than by-value.)

It doesn't, it takes operands by value.

Doh, you're of course correct there.

What I wrote is why it's important that < be a primitive, not why + should be.
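
For reference, a short sketch of the signature difference (function names are invented here): Add::add takes both operands by value, while PartialOrd::lt takes them by reference. Adding two references to integers also works, through the reference impls that std provides (as far as I can tell, that is what forward_ref_binop! in the original snippet is for).

```rust
use std::ops::Add;

// Add::add takes self and the right-hand side by value.
fn add_by_value(a: i32, b: i32) -> i32 {
    Add::add(a, b)
}

// std also has `impl Add<&i32> for &i32`, so references can be added too.
fn add_by_reference(a: &i32, b: &i32) -> i32 {
    a + b
}

// PartialOrd::lt takes &self and &Rhs, so the trait form needs references.
fn less_than(a: i32, b: i32) -> bool {
    PartialOrd::lt(&a, &b)
}
```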
