Surprising: &1 + &1 == 2

jbe · August 20, 2022, 11:06pm

This is somewhat confusing:

fn main() {
    assert!(&1 + &1 == 2); // works
    //assert!(&1 == 1); // fails
}

(Playground)

I can add two references to integers and get an integer. But I cannot compare a reference to an integer with an integer.

I assume this behavior is intended? But what's the rationale behind it?

Found the responsible macro in the source: forward_ref_binop! in library/core/src/internal_macros.rs and here its usage for addition, but it doesn't say anything about the rationale in either place.

quinedot · August 21, 2022, 1:15am

It's just a weakness in how their implementations were chosen. The Add implementations accommodate one layer of & in either position, whereas the PartialEq lifting over references went for arbitrary depth but ended up with equal-number-of-references-on-both-sides.
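
A minimal sketch of that asymmetry, restricted to i32 (the helper function names are made up for illustration):

```rust
// Add for i32 is implemented for every mix of "value" and "one layer of &".
fn add_combinations() -> [i32; 4] {
    let (a, b) = (1i32, 1i32);
    [a + b, &a + b, a + &b, &a + &b]
}

// PartialEq lifts through any number of `&`,
// but only with the same depth on both sides.
fn eq_combinations() -> [bool; 3] {
    let (a, b) = (1i32, 1i32);
    [a == b, &a == &b, &&&a == &&&b]
}

fn main() {
    assert_eq!(add_combinations(), [2, 2, 2, 2]);
    assert_eq!(eq_combinations(), [true, true, true]);
    // These do not compile:
    // let _ = &&1 + &&1; // no Add impl for two layers of &
    // let _ = &1 == 1;   // no PartialEq<i32> for &i32
}
```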

Basically, it's hard or impossible to do all of

obey coherence (no overlapping trait implementations)
hit every level of & nesting
not wreck the compiler's ability to infer what implementation to use

If we could have them all (so you never had to think about how many layers of & you had when adding or comparing), we probably would.

The rationale, if it was explicit, is probably based on practicality (but I didn't go looking).

quinedot · August 21, 2022, 1:55am

I mulled this over a little more and realized that at least part of the difference (why Add doesn't lift over references generally) is that Add::add takes its parameters by value and not by reference, so you can't lift the general implementation. They could have lifted it for T: Copy I suppose. You don't really want it lifted in the T: Clone (+ !Copy) case, as you could end up with unnecessary cloning. I.e. libraries may sensibly choose to have implementations for references that do different things compared to the implementation for non-references.

(Why does Add::add take by value though? So you can have things like Add<&str> for String that also don't needlessly reallocate. (Some would say that's a poor overload anyway, but also consider something like a BigInt.))
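
A small sketch of the String case (the function name append is made up): because the left operand is taken by value, the existing buffer can be reused when it has enough capacity.

```rust
// `impl Add<&str> for String` takes the left operand by value,
// so it can append in place and hand the same buffer back.
fn append(owned: String, suffix: &str) -> String {
    owned + suffix
}

fn main() {
    let mut s = String::with_capacity(16);
    s.push_str("foo");
    let ptr_before = s.as_ptr();

    let joined = append(s, "bar");
    assert_eq!(joined, "foobar");
    // The capacity was sufficient, so no reallocation was needed
    // and the heap buffer is still the same one.
    assert_eq!(joined.as_ptr(), ptr_before);
}
```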

jbe · August 21, 2022, 8:58am

Interesting!

fn main() {
    // assert!(&&1 + &&1 == 2); // fails
    assert!(&&1 == &&1); // works
}

(Playground)

This reminds me a bit of the AsRef problem where two different implementations of AsRef for pointer-like types also were conflicting and choices had to be made.

But I wonder if this is really a conflict in the case of (Partial)Eq. The "one or both sides by-reference" Add implementations are on concrete types through macros, and not using a blanket implementation. The same could (theoretically) be done for (Partial)Eq for certain concrete types like i32: Playground.
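
One way this could look, as a sketch only: MyEq below is a hypothetical stand-in for PartialEq (it is not a std trait, and this is not necessarily the code from the linked playground). The blanket impl lifts over references with equal depth, and two extra per-type impls for i32 cover the "offset by one" cases without overlapping it.

```rust
// Hypothetical stand-in for PartialEq; not part of std.
trait MyEq<Rhs = Self> {
    fn my_eq(&self, other: &Rhs) -> bool;
}

// Base implementation on the concrete type.
impl MyEq for i32 {
    fn my_eq(&self, other: &i32) -> bool {
        *self == *other
    }
}

// Blanket lifting over references, equal depth on both sides.
impl<'a, 'b, A: MyEq<B>, B> MyEq<&'b B> for &'a A {
    fn my_eq(&self, other: &&'b B) -> bool {
        <A as MyEq<B>>::my_eq(*self, *other)
    }
}

// Per-type impls for one extra `&` on either side.
// Neither overlaps with the blanket impl above.
impl<'b> MyEq<&'b i32> for i32 {
    fn my_eq(&self, other: &&'b i32) -> bool {
        *self == **other
    }
}

impl<'a> MyEq<i32> for &'a i32 {
    fn my_eq(&self, other: &i32) -> bool {
        **self == *other
    }
}

fn demo() -> [bool; 4] {
    let one = 1i32;
    let r: &i32 = &one;
    [
        <i32 as MyEq<i32>>::my_eq(&one, &one), // 1 vs 1
        <&i32 as MyEq<&i32>>::my_eq(&r, &r),   // &1 vs &1 (blanket impl)
        <i32 as MyEq<&i32>>::my_eq(&one, &r),  // 1 vs &1
        <&i32 as MyEq<i32>>::my_eq(&r, &one),  // &1 vs 1
    ]
}

fn main() {
    assert_eq!(demo(), [true; 4]);
}
```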

jbe · August 21, 2022, 9:05am

I feel like it's handy that PartialEq, Hash, etc. lift over references (with equal levels on both sides). But I personally don't think that 1 + &1 should compile (especially not if it will fail if you later replace the i32 with some other type where these implementations are missing). It confused me more than it helped me (because I have meanwhile learned that you sometimes need to (de)reference manually).

forward_ref_binop! and its usage for Add were introduced with PR #21227. Also see RFC 0439 (cmp-ops-reform). But I don't see a rationale or discussion in either of these on why 1 + &1 should compile.

simonbuchan · August 21, 2022, 11:15am

This whole discussion made me curious as to why Rust even has arbitrary levels of reference? I can make some guesses, but at least immutable refs don't seem more powerful with multiple levels.

Is there a good source for that sort of question? Some equivalent to The Design and Evolution of C++, where the whole language design history is explained feature by feature?

steffahn · August 21, 2022, 11:33am

&&T is not special. It’s just &S where S happens to be &T. Honestly, I wouldn’t know how Rust could not allow multiple levels of references, unless reference types would not be ordinary types, but then they couldn’t be used in (ordinary) generics either. E.g. putting &str in a Vec<T> (where T becomes &str) is super useful, why would you want to forbid it?

To illustrate: if you don’t forbid Vec<&str>, then something as straightforward as let x = vec!["hello"]; print!("{x:?}"); already works with &&str internally, since it works with Debug for Vec<T> where T: Debug, so Vec<&str>: Debug requires &str: Debug, and &str: Debug involves fn fmt(self: &&str, …).
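
A short sketch of the same effect with a user-written generic function (first and debug_string are names invented for this example): nothing in first mentions a double reference, yet instantiating T with &str makes its return type &&str.

```rust
// Generic over T; nothing here knows whether T is itself a reference.
fn first<T>(items: &[T]) -> &T {
    &items[0]
}

fn debug_string() -> String {
    let x = vec!["hello"];
    // Uses Debug for Vec<T> with T = &str.
    format!("{x:?}")
}

fn main() {
    let x: Vec<&str> = vec!["hello"];
    // With T = &str, the return type &T is &&str.
    let r: &&str = first(&x);
    assert_eq!(*r, "hello");
    assert_eq!(debug_string(), "[\"hello\"]");
}
```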

Admittedly, it’s a somewhat common mis-intuition amongst beginners that T1 vs &mut T2 vs &T3 are three mutually exclusive kinds of things, but in reality the first one can be any of them, with T1 == &mut T2 or T1 == &T3.

TL;DR, multiple levels of reference are most useful in generic contexts when a generic parameter happens to be instantiated by a reference type itself. You wouldn’t commonly work with &&T directly and deliberately ^[1], but when connecting different generic abstractions, such types can commonly appear.

unless perhaps on rare occasions, in order to get a smaller reference type when T: !Sized, or in order to allow another unsizing step, e.g. &&str as &dyn Foo ↩︎

simonbuchan · August 21, 2022, 12:04pm

There's no particular reason (I can see) that Rust couldn't have chosen to make expanding &S where S is &T result in just &T again, the same as &x where x has type &T would have the type just &T again.

I'm not saying it would be better necessarily, but it would put decently less of a burden on libraries for a relatively simple language rule. There's also the precedent of the (fairly different) C++ reference, and it would result in less theoretical dereferencing.

On the other hand, rewording you a little, it means we can't just look at &'a T as sugar for (a theoretical) Ref<'a, T> type, which is a nice property that might avoid complications, like needing specialization.

It's why I was interested to see if there's any definitive history for these early decisions - though it's only an idle interest.

VorfeedCanal · August 21, 2022, 2:38pm

I strongly suspect that experience with C++ references has shown why adding such crazy rules is not a good thing. You invariably hit lots of very strange corner cases and eventually end up with huge complications instead of simplification.

But it wouldn't be “a relatively simple language rule”. Because of things like foo<T>(_: &T) -> T, which suddenly couldn't return a reference anymore, you would need to add a kinda-sorta-reference-but-not-reference (C++ added std::reference_wrapper in C++11 because it couldn't have a reference to a reference), then you would add special rules to the other traits (like C++ added std::unwrap_reference in C++20), and so on.
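
To make the foo example concrete, here is a sketch under the assumption that foo simply clones its argument (the Clone bound and the body are added for illustration). With today's rules, T = &i32 is an ordinary instantiation; if &&i32 collapsed to &i32, the second call could not be expressed.

```rust
// Hypothetical generic function of the shape foo<T>(_: &T) -> T.
fn foo<T: Clone>(x: &T) -> T {
    x.clone()
}

fn main() {
    let n = 5i32;
    // T = i32: the argument is &i32, the result is an i32.
    let owned: i32 = foo(&n);
    // T = &i32: the argument is &&i32, the result is a &i32.
    let borrowed: &i32 = foo(&&n);
    assert_eq!(owned, 5);
    assert_eq!(*borrowed, 5);
}
```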

Thus you would simplify libraries which are already simple and complicate ones which are already complex. Not a great tradeoff long-term.

You can read some more about these things in this blog post, or look at just about any RFC discussion.

If you don't think through what your “relatively simple language rules” imply, you end up with fractals of bad design very quickly.

What I find fascinating in Rust is precisely the fact that its developers somehow managed to avoid this trap: many good decisions in Rust were made not because they foresaw the long-term consequences but because they had a gut feeling that this or that “simplification” was just not worth it.

Sometimes they made bad decisions: arguably, the fact that Add accepts one level of references but not two, while PartialEq accepts any number of them but only if they are balanced, is surprising.

Either one of these two decisions would have been better IMO. But even what Rust has today is clearly better than the proposal to ensure you cannot create a reference to a reference.

quinedot · August 21, 2022, 8:30pm

Yeah that's true, you can always go for concrete types assuming the orphan rules don't kick in or whatever.

Now of course,

    assert!(MyEq::my_eq(&1, &&&1));

still doesn't work... and it doesn't work for other types (e.g. foreign ones) more generally ^[1].

From reading those links, Add used to take references and the compiler introduced the borrows à la method resolution. So while still a breaking change, it made it more compatible with the old design.

Another thing to consider is that Rust, in my experience, tries to hit some "most people's code just works" butter-zone with its language design. This is basically the same sort of thing @VorfeedCanal talks about. It's also what I was referring to with "practicality based rationale".

In this case, it turns some patterns into "didn't have to think about it." On the other hand, they definitely didn't hit every common case, as I'm pretty used to throwing in some & or * when needed.

One difficulty more generally is that different people have different sensitivities to what's intuitive or not, or even just different coding styles making it more or less likely to run into the corner cases. The AsRef etc trait situation appears to really bother you for example, but I only see a couple oddities; however I also have my own collection of "man they really missed the mark" peeves. And when the mark has been missed (by any given perspective), the attempt at simplification has actually resulted in something more complex.

per-type implementations aren't necessarily a net improvement, as you sacrifice consistency (and it's a breaking change to add a blanket implementation) ↩︎

jdahlstrom · August 21, 2022, 8:53pm

It feels like the most unfortunate thing about Rust references is that they share the name with C++ references, even though they're much more regular types (if not Regular in the Stepanov sense). C++ references are truly strange beasts – they are not objects and thus don't have addresses or occupy memory in the abstract machine, for one. Reference type constructors don't compose like all other type constructors but rather have their own arcane rules, and so on.

VorfeedCanal · August 21, 2022, 10:42pm

My own pet peeve is that references use the mut keyword and are called mutable and immutable.

While in reality the much more appropriate names would be shared and unique (with an appropriate keyword).

Because the fact that a mutable reference allows you to change an object is almost a side effect of the fact that it is unique: it's like the tree in a forest. If you changed the state but there is no one around who can notice the change, has the state actually been changed?
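
A small sketch of why "shared" and "unique" describe the two kinds of reference more accurately than "immutable" and "mutable" (the function names are invented for this example): std's Cell permits mutation through a shared reference, while &mut is the one that guarantees exclusivity.

```rust
use std::cell::Cell;

// Mutation through a shared reference: Cell allows it because it
// never hands out references to its contents.
fn bump_shared(counter: &Cell<i32>) {
    counter.set(counter.get() + 1);
}

// Mutation through a unique (`&mut`) reference.
fn bump_unique(counter: &mut i32) {
    *counter += 1;
}

fn shared_then_unique() -> (i32, i32) {
    let shared = Cell::new(0);
    let (a, b) = (&shared, &shared); // two shared references at once
    bump_shared(a);
    bump_shared(b);

    let mut exclusive = 0;
    bump_unique(&mut exclusive); // only one &mut may be live at a time
    (shared.get(), exclusive)
}

fn main() {
    assert_eq!(shared_then_unique(), (2, 1));
}
```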

In some sense Rust has come to the same point where modern Haskell resides, only Rust looks like a normal imperative language, whereas Haskell introduces state via such a crazy pile of abstractions that most developers just can't penetrate it.

jbe · August 21, 2022, 10:46pm

But isn't &1 + &1 == 2 also working due to a "per-type implementation"? I.e. it will fail for many other types than i32. (Playground)

Yes, I merely wanted to mimic Add's behavior (and show that it doesn't conflict with making PartialEq lift over references).

Neither does &x + &x == x in general.

I just noticed you demonstrated the same here:

Hmmmm… I see.

I would like to cite PR #28811 here, which introduced those implementations of AsRef that worry me so much:

These trait implementations are "the right impls to add" to these smart pointers and would enable various generalizations such as those in #27197.

This is where I think more caution might be advised. Implementations that feel "right" or seem handy can backfire badly later; especially because Rust's type system comes with some limitations and because of strict stability guarantees (which require extra caution when adding such things). ^[2]

That AsRef thing still bugs me a lot (slightly off-topic for this thread).

Some more updates on the topic (in case you haven't already seen them):

Some arguments by me why I believe PR #28811 has been a mistake (apart from conflicting with this old FIXME)
PR #99460 by me, trying to at least improve documentation on the AsRef issue. (But no updates for a while? Not sure if anyone can help push this forward or reject it with a good reason?)

Getting back to &1 + &1 == 2, this is something that doesn't bother me nearly as much as the AsRef issue does, so I guess I can relate (I still find it a bit odd though). But I wouldn't be surprised if it causes trouble that isn't apparent to me yet.

At the least, it really confused me that day when I didn't understand why my code was working. I'm really used to the fact that references to i32 generally don't act the same as i32s, because I often run into this problem:

fn main() {
    let mut v = vec![4, 11, 3, 10];
    v.retain(|x| x >= 10);
    // needs one of:
    // v.retain(|x| x >= &10);
    // v.retain(|x| *x >= 10);
    // v.retain(|&x| x >= 10);
    println!("{v:?}");
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
 --> src/main.rs:3:23
  |
3 |     v.retain(|x| x >= 10);
  |                       ^^
  |                       |
  |                       expected reference, found integer
  |                       help: consider borrowing here: `&10`
  |
  = note: expected reference `&_`
                  found type `{integer}`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `playground` due to previous error

As a beginner, I was annoyed by this error, but now I'm confused by &1 + &1 == 2 not causing an error.
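
For reference, a small sketch confirming that each of the three fixes listed in the comments above compiles and keeps the same elements (the function name and the variant selector are made up for this illustration):

```rust
// Applies one of the three fixes from the comments in the failing example.
fn keep_at_least_10(variant: u8) -> Vec<i32> {
    let mut v = vec![4, 11, 3, 10];
    match variant {
        0 => v.retain(|x| x >= &10),  // compare &i32 with &i32
        1 => v.retain(|x| *x >= 10),  // dereference, compare i32 with i32
        _ => v.retain(|&x| x >= 10),  // destructure the reference in the pattern
    }
    v
}

fn main() {
    for variant in 0..3 {
        assert_eq!(keep_at_least_10(variant), vec![11, 10]);
    }
}
```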

One more remark: What I find particularly counterintuitive is that the type changes. &123 + &0 has a different type than &123. (Playground)

And yet another remark: Thinking of various physics and math teachers in my life repeating over and over "Achtet auf die Einheiten!" ("Pay attention to the units!"), I would conclude that you shouldn't (be able to) compare an integer with a reference to an integer (but comparing two references of equal level seems okay, to me at least).

per-type implementations aren't necessarily a net improvement, as you sacrifice consistency (and it's a breaking change to add a blanket implementation) ↩︎
I explained in the update to this other post why the referenced use-case of AsRef<[u8]> in #27197 isn't a good argument in my opinion. ↩︎

VorfeedCanal · August 21, 2022, 11:26pm

Why is this counterintuitive? Addition creates a new object, so of course the references have to go and we get an owned object. How can the sum of two things be a reference?

I'm not sure I like the fact that I can add two references (it looks a bit strange to me, actually), but if it produces something at all, then it shouldn't be a reference.

That would be truly strange and bizarre.

quinedot · August 21, 2022, 11:30pm

Yes, and it's only consistent with other numerical types in core due to always supplying the four implementations per pair of operands. And it will be inconsistent with your own custom implementations unless you supply four implementations. The concerns around consistency are more severe with PartialEq and Eq in my opinion, as it's common to #[derive] them... which would not give you the two extra "offset by one level of reference" implementations.

I'm really used to the fact that references to i32 don't act the same as i32s, because I often run into this problem:

Same, though I hit that so early in my Rust learning I'm just used to it as I alluded to. I'm not saying it's a great situation. Again the ideal would be for comparisons to see through all layers of references, and I think that's what we would have if there was a coherence-friendly way to get it. It would be a breaking change at this point, though I could imagine a world where it's done anyway if few people are doing some non-expected thing in the ecosystem.

One more remark: What I find particularly counterintuitive is that the type changes. &123 + &0 has a different type than &123.

Well, how could it not? I guess this makes sense if you're not expecting to be able to add references at all, but given that you can, you would need interior mutability to return a modified &i32.

More generally, the trait isn't any of

trait Add { fn add(self, _: Self) -> Self }
trait Add<Rhs> { fn add(self, _: Rhs) -> Self }
trait Add<Rhs> { fn add(self, _: Rhs) -> Rhs }

If it were, you would lose the ability to add different types at all (with the first variation), or the ability to make such pairings symmetric (with the other two).
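
As a sketch of what the separate Output associated type on std::ops::Add allows (Minutes and Seconds are hypothetical types invented for this example): both operand orders can produce the same result type, which would be impossible if the result had to be Self or Rhs.

```rust
use std::ops::Add;

// Hypothetical unit types, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Minutes(u32);
#[derive(Debug, Clone, Copy, PartialEq)]
struct Seconds(u32);

// Both operand orders produce Seconds, so the result type can be
// neither "always Self" nor "always Rhs".
impl Add<Seconds> for Minutes {
    type Output = Seconds;
    fn add(self, rhs: Seconds) -> Seconds {
        Seconds(self.0 * 60 + rhs.0)
    }
}

impl Add<Minutes> for Seconds {
    type Output = Seconds;
    fn add(self, rhs: Minutes) -> Seconds {
        rhs + self
    }
}

// For references to i32, std chooses Output = i32.
fn sum_of_refs(a: &i32, b: &i32) -> i32 {
    a + b
}

fn main() {
    assert_eq!(Minutes(2) + Seconds(30), Seconds(150));
    assert_eq!(Seconds(30) + Minutes(2), Seconds(150));
    assert_eq!(sum_of_refs(&123, &0), 123);
}
```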

simonbuchan · August 22, 2022, 3:56am

To be abundantly clear, I totally agree that it works a lot better the way it currently does: I just found it interesting to ask why the decision was made the way it was, for the reasons I mentioned.

The RFC list "only" goes back to 2014, at which point Rust was already very much the language we know today.

I did find this fascinating documentation for 2012's Rust 0.1, though, which includes this section on "pointer types" - even that very different looking version of Rust seems to be functionally the same in its reference behavior.

jbe · August 22, 2022, 6:14am

Agreed, returning a reference makes no sense.

I understand that + can be used as an operator that takes two things of any type and returns something of a different type (depending on the left and right input type). So it shouldn't be seen as strictly numerical and basically could describe any binary operation.

However, being able to write a reference where instead a value is expected is usually not allowed in Rust. There are some exceptions:

If methods are declared with &self, then it's possible to use the method on the value or on a reference to Self; but this isn't the case when the method is declared to take self. It also doesn't work on arguments, so it's pretty specific to the receiver of a method. (Playground)
For Copyable types, methods that take self can be called on a reference (or reference to a reference, etc). But this doesn't hold for arguments (Playground) or operators in general.
My love/hate case of AsRef: Due to this blanket implementation it's possible to pass &x (or &&&&&&x) instead of x when x is expected to be of a type implementing AsRef<T>. (Playground) (Side note: I'm happy with that, but unfortunately this behavior hasn't been implemented for smart pointers, which I'm unhappy about, and which causes some missing implementations today. For example Cow<'_, str> can't implement AsRef<Path>; see PR #73390, which had to be rejected. See also issue #45742, which attempts to fix this issue and goes back to a very old TODO in std.)
(Partial)Eq, (Partial)Ord, and Hash are automatically implemented on &T where implemented on T. This is very useful if we want to build a HashSet or HashMap using references! (Playground)
… and maybe more cases that I missed?
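
The exceptions listed above can be sketched in one place. This is not the code from the linked playgrounds; Counter, str_len and examples are names invented for this illustration.

```rust
use std::collections::HashSet;

// Hypothetical type with a `&self` method.
struct Counter(i32);

impl Counter {
    fn get(&self) -> i32 {
        self.0
    }
}

// Hypothetical function that accepts anything implementing AsRef<str>.
fn str_len<S: AsRef<str>>(s: S) -> usize {
    s.as_ref().len()
}

fn examples() -> (i32, i32, usize, bool) {
    // 1. A `&self` method works on a value and on (nested) references.
    let c = Counter(3);
    let via_refs = c.get() + (&c).get() + (&&c).get();

    // 2. `i32::pow` takes `self` by value; i32 is Copy,
    //    so the receiver can be copied out from behind the references.
    let n = 4i32;
    let squared = (&&n).pow(2);

    // 3. The blanket `impl AsRef<U> for &T where T: AsRef<U>`
    //    accepts extra layers of `&`.
    let len = str_len("abc") + str_len(&"abc") + str_len(&&&"abc");

    // 4. Eq and Hash lift over references, so HashSet<&String> works.
    let owned = String::from("key");
    let mut set: HashSet<&String> = HashSet::new();
    set.insert(&owned);
    let found = set.contains(&String::from("key"));

    (via_refs, squared, len, found)
}

fn main() {
    assert_eq!(examples(), (9, 16, 9, true));
}
```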

I don't see how allowing &1 and &1 to be added is comparable to any of these cases. Comparing it with method resolution, forward_ref_binop! is non-generic and only applies to particular numeric types from std. It also doesn't involve method receivers (at least not on the right-hand side), so it would be more consistent to disallow it there. It also doesn't enable you to do anything like the Eq/Ord/Hash implementations on references, which serve a true purpose (I think). And the AsRef case is currently messed up anyway.

Or to phrase it differently: I don't see why I should be able to specifically add 1 and &1 and get back a plain integer, or why &1 + &1 should result in 2, if I can't add one level of references on either side of + in the generic case. I don't see how the concept of "addition" should be inherent/specific to "references to numeric types".

Besides not being able to add references in the general case (which I'm okay with), I think the counterintuitive part is:

Add takes two operands by value and returns a result. You implement Add to indicate that two values (of a certain type) can be added. We all know what adding two numbers means. But what are the semantics of adding a number to a reference? In C that's pointer arithmetic, for example. In Rust, I would claim it doesn't make much sense. Also from a mathematical perspective it doesn't make much sense (to me). And there's no general agreement in Rust that you can always pass a &x where an x is expected (not even for Copyable types).

That said, it's not really bothering me much that these implementations exist, but I still find them confusing. The problems with AsRef are much bigger (in my opinion) and cause trait implementations to be missing where they should exist, as in the case of the previously mentioned PR #73390 (even if that can arguably often be worked around by adding some explicit dereferencing).

jbe · August 22, 2022, 6:35am

I think the only "real" issue with allowing to add references to integers is that it can cause confusion when teaching Rust or learning Rust. This is hard to explain to a beginner:

fn main() {
    let v = vec![2, 4, 6, 8];
    let mapped = v.iter().map(|x| x + 5); // works
    let filtered = v.iter().filter(|x| x > 5); // fails: x is &&i32 here, needs e.g. **x > 5
}

(Playground)

The fact that v.iter().map(|x| x + 5) works leaves the learner with the impression that references, values, etc. don't really matter. Then, when you get to .filter, the issue hits you even harder, because you're suddenly dealing with a double layer of referencing, and here the compiler (and std lib) is very strict. There is no "easy" explanation why it's lax in the first but strict in the second case.
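
A sketch of the two cases side by side (the function names are made up): map receives &i32 and is covered by the extra Add impls, while filter receives &&i32 and needs one of the explicit forms below.

```rust
// `map` gets `&i32`; `&i32 + i32` is one of the extra Add impls.
fn mapped(v: &[i32]) -> Vec<i32> {
    v.iter().map(|x| x + 5).collect()
}

// `filter` gets `&Self::Item`, which is `&&i32` for `iter()`.
fn filtered(v: &[i32]) -> [Vec<i32>; 3] {
    // Dereference twice.
    let a: Vec<i32> = v.iter().filter(|x| **x > 5).copied().collect();
    // Destructure both layers in the closure pattern.
    let b: Vec<i32> = v.iter().filter(|&&x| x > 5).copied().collect();
    // Copy the items first, so the closure only sees one layer.
    let c: Vec<i32> = v.iter().copied().filter(|x| *x > 5).collect();
    [a, b, c]
}

fn main() {
    let v = vec![2, 4, 6, 8];
    assert_eq!(mapped(&v), vec![7, 9, 11, 13]);
    for result in filtered(&v) {
        assert_eq!(result, vec![6, 8]);
    }
}
```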

Well, and the second issue is that things fail to work when using different types, which can also be confusing:

#[derive(Copy, Clone)]
struct S;

impl std::ops::Add<i32> for S {
    type Output = Self;
    fn add(self, _other: i32) -> Self {
        S
    }
}

fn main() {
    let _ = vec![2, 4, 6, 8].iter().map(|x| x + 5); // works
    let _ = vec![S].iter().map(|x| x + 5); // requires explicit dereference
}

(Playground)
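
For completeness, a sketch of the two ways to make the S case compile: dereference explicitly, or add a forwarding impl for &S by hand, similar in spirit to what std provides for the primitive numeric types. The extra derives and the function names are additions for this illustration.

```rust
use std::ops::Add;

#[derive(Copy, Clone, Debug, PartialEq)]
struct S;

impl Add<i32> for S {
    type Output = S;
    fn add(self, _other: i32) -> S {
        S
    }
}

// Hand-written forwarding impl (hypothetical addition):
// lets `&S + i32` compile by copying the S out of the reference.
impl<'a> Add<i32> for &'a S {
    type Output = S;
    fn add(self, other: i32) -> S {
        *self + other
    }
}

// Works with only the by-value impl.
fn explicit_deref(items: &[S]) -> Vec<S> {
    items.iter().map(|x| *x + 5).collect()
}

// Works only because of the forwarding impl for &S.
fn via_ref_impl(items: &[S]) -> Vec<S> {
    items.iter().map(|x| x + 5).collect()
}

fn main() {
    assert_eq!(explicit_deref(&[S]), vec![S]);
    assert_eq!(via_ref_impl(&[S]), vec![S]);
}
```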

tczajka · August 22, 2022, 6:42am

For what it's worth, I also find it confusing that this impl exists.

When I was just starting using Rust, it took me a long while before I understood that it's because of the extra impls rather than some magical property of references. It made references more difficult to understand for me.

This makes the code work if you forget to dereference something, but to me that's more of a problem than a feature. The issue comes back if you try to pass the reference to some other function or if you have two levels of reference, which only makes it more confusing when it had compiled before.

I much prefer to have clarity on the types from the start.

jbe · August 22, 2022, 6:44am

I totally agree.

I understand these implementations can't be removed, but maybe they could be deprecated by adding lints that warn you if you forget to (de)reference, like in this case:

    let v = vec![2, 4, 6, 8];
    let mapped = v.iter().map(|x| x + 5); // works

Maybe that's a discussion for IRLO at some point?
