What caught my attention is the reference to an i32 literal, &1 or &2, over here. I understand references as pointers to a physical location in memory; in fact, Chapter 4.2 of the Brown Book defines references as:
References are non-owning pointers, because they do not own the data they point to.
1 or 2 is data, not a variable; here we are not even checking assert_eq!(v1_iter.next(), Some(&v1[0]));. I can understand this working for string literals, because string literals (based on my memory layout knowledge from C/C++) are stored in the data section of memory, but I haven't heard of this happening for numeric literals; in fact, it doesn't even make sense to do so. So, what does this reference even mean theoretically?
Bonus question: One thing which is implicit here is that if assert_eq is passed references, it dereferences them automatically, i.e.
let num1 = 42;
let num2 = 42;
let p1 = &num1;
let p2 = &num2;
assert_eq!(p1, p2); // This passes
If references truly are pointers, then this should fail, because p1 stores the address of num1 and p2 stores the address of num2, but it passes, explaining my automatic dereference claim. Then how do I assert the equality of two references which point to the same address and not the value at the respective addresses, i.e.
let num1 = 42;
let num2 = 42;
let p1 = &num1;
let p2 = &num2;
let p3 = &num1;
assert_eq!(p1, p2); // This should fail
assert_eq!(p1, p3); // This should pass
It's not automatic per se, it's due to how the equality traits are implemented -- to be about the equality of the referents (which one usually cares about) and not of the addresses (which one usually does not).
addr_eq is the answer mostly, but note that it ignores wide pointer metadata like the length of a slice, and may also be misleading when zero-sized types are involved. Also it's not uncommon to return an arbitrary address for an empty slice, empty Vecs don't allocate by default, etc.
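To make the two kinds of comparison concrete, here is a minimal sketch based on the example from the question. The helper names same_value and same_place are made up for illustration; std::ptr::eq and std::ptr::addr_eq are the real standard library functions (the latter needs Rust 1.76 or newer).

```rust
use std::ptr;

// Hypothetical helper: compares the referents, like `==` on references does.
fn same_value(a: &i32, b: &i32) -> bool {
    a == b
}

// Hypothetical helper: compares the addresses instead.
fn same_place(a: &i32, b: &i32) -> bool {
    ptr::eq(a, b)
}

fn main() {
    let num1 = 42;
    let num2 = 42;
    let p1 = &num1;
    let p2 = &num2;
    let p3 = &num1;

    assert!(same_value(p1, p2)); // equal values
    assert!(!same_place(p1, p2)); // but two distinct locals
    assert!(same_place(p1, p3)); // same local, same address

    // Wide pointers: both slices start at the same address but differ in length.
    let a = [1, 2, 3];
    assert!(ptr::addr_eq(&a[..2], &a[..3])); // address only, metadata ignored
    assert!(!ptr::eq(&a[..2], &a[..3])); // address and length
}
```

So the "should fail / should pass" pair from the question would be written as assert!(!ptr::eq(p1, p2)) and assert!(ptr::eq(p1, p3)).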
Also, in Rust, one mostly does not compare references (or any other pointer type) by address. Nearly every trait that is implemented for references (and Box, Rc, Arc, etc.) looks at the referents, not the addresses. This is closely related to how the semantics of movability and &mut mean that Rust really does not have a notion of “object identity”. You can, under some circumstances, build it if you have a use for it, but it’s not natural or inherent.
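The same split shows up for smart pointers. A small sketch, assuming nothing beyond the standard library (the function name rc_equalities is just for illustration): == on Rc compares the referents, while Rc::ptr_eq is the explicit opt-in for comparing allocations.

```rust
use std::rc::Rc;

// Returns (values equal, a and b share an allocation, a and a2 share an allocation).
fn rc_equalities() -> (bool, bool, bool) {
    let a = Rc::new(String::from("hi"));
    let b = Rc::new(String::from("hi"));
    let a2 = Rc::clone(&a);

    (a == b, Rc::ptr_eq(&a, &b), Rc::ptr_eq(&a, &a2))
}

fn main() {
    let (values_equal, a_is_b, a_is_a2) = rc_equalities();
    assert!(values_equal); // `==` looks at the two strings
    assert!(!a_is_b); // separate allocations
    assert!(a_is_a2); // a clone of an Rc points to the same allocation
}
```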
But that's exactly what it does. A reference like &1 puts a literal 1 in the .rodata section and then takes the address of it. Why would you think that it works differently from strings or really any other type of constant data?
Perhaps because pure numeric literals (that is, without a &) or numeric expressions like (1+2) are typically not stored explicitly in a data segment. Numeric literals are sometimes referred to as rvalues, while variables, to which a value can be assigned, are referred to as lvalues. See
In a situation where you have a value but treat it in a way that expects a place, a temporary variable will be created.
This should actually work the same in C/C++. The linked page calls this a “temporary object”, and the C++ lingo for what Rust calls a “value expression” is generally some variation of “rvalue”, whereas Rust’s “place expression” corresponds to “lvalue”. Don’t ask me about the more detailed distinctions in C++ for different flavors of lvalue/rvalue, like “prvalue” or “glvalue” on that cppreference page; I don’t have any of this memorized either and it’s probably not too important.
(The “r” and “l” in these names correspond to the syntax of assignment. A basic assignment operation foo = bar has an lvalue on the left-hand side, and an rvalue on the right-hand side.)
When using a value expression in most place expression contexts, a temporary unnamed memory location is created and initialized to that value. The expression evaluates to that location instead, except if promoted to a static. The drop scope of the temporary is usually the end of the enclosing statement.
Now, for your concrete example at hand: &1 uses a literal 1, which is a value expression, and takes a reference to it, which is a “place expression context” (i.e. a context that expects a place expression[1]), so the rule could apply; and the behavior you can see matches that.
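As a rough illustration of the temporary rule, the two functions below should behave the same; the function names are invented for this sketch. It uses &mut 5 rather than &1 because a mutable borrow makes the temporary easy to observe.

```rust
// `5` is a value expression, but `&mut` expects a place,
// so a temporary memory location is created and initialized to 5.
// In a `let` like this, the temporary lives until the end of the block.
fn with_implicit_temporary() -> i32 {
    let x = &mut 5;
    *x += 1;
    *x
}

// Roughly what the above means, with the temporary written out by hand.
fn with_explicit_temporary() -> i32 {
    let mut tmp = 5;
    let x = &mut tmp;
    *x += 1;
    *x
}

fn main() {
    assert_eq!(with_implicit_temporary(), 6);
    assert_eq!(with_explicit_temporary(), 6);
}
```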
However Rust also has a mechanism[2] of “promoting” a value instead. Promotion to a static can happen in some very specific circumstances, and it’s a feature that was added to Rust a bit later (i.e. not present in Rust 1.0) and designed in a way that it should generally not have very noticeable effects on code that doesn’t depend on it.[3]
So for learning purposes, you’re completely fine ignoring all the details about what and when gets promoted to live in the “data section” of your binary etc… and just handle things like &1 as if they really just use a temporary variable, and place the value 1 right into that temporary. Rust is deliberately designed so that you’re allowed to ignore this promotion. Anyway, if you are curious, let’s continue:
Code can actually depend on the static/constant promotion, because promoted references will gain a 'static lifetime which you can observe through the borrow checker (thus you can write code that would fail to compile if the promotion didn’t happen), e.g.
const fn add_one(n: i32) -> i32 {
    n + 1
}
// this fails to compile (as a safety measure; a custom function like `add_one` may fail
// to evaluate [panicking; or non-termination] and we must not implicitly turn a compile-time
// error into a run-time error)
fn reference_the_plus_one() -> &'static i32 {
    &add_one(123) // error[E0515]: cannot return reference to temporary value
}
// this works, but doesn't use promotion
fn use_reference_to_the_plus_one_locally() {
    println!("{}", &add_one(123)); // prints 124
}
const fn add_one(n: i32) -> i32 {
    n + 1
}
// this way you opt into evaluation of `add_one` at
// compile-time explicitly and we allow it to work again!
fn reference_the_plus_one() -> &'static i32 {
    &const { add_one(123) }
}
fn also_works() -> &'static i32 {
    const { &add_one(123) }
}
fn also_also_works() -> &'static i32 {
    &(100 + const { add_one(23) })
}
// how to do things without (or before we had) `const {}` blocks
fn also_works_with_const_item() -> &'static i32 {
    const N: i32 = add_one(123);
    &N
}
yea, i suppose normally they'd be directly encoded as an immediate in the instruction if possible, but anything that has static lifetime and needs an address taken will be in .data, .rodata, or .bss.
I don’t see the number in the compiled assembly, where exactly is it stored? I changed it to 1981945 to make it easier to spot, but I still can’t find any instruction showing it as a static. Can you point it out? Thanks!
Because numeric literals can just be part of our code, most ISAs support rich instructions having literals as one of the operands, so it makes more sense to just have them as part of the code, rather than having them in the memory and loading them with expensive load instructions at runtime. However, string literals are much longer and I have never seen Assembly which operates directly on string literals, they are always operated on character-by-character after loading them from the data section of the memory.
It's not automatic per se, it's due to how the equality traits are implemented -- to be about the equality of the referents (which one usually cares about) and not of the addresses (which one usually does not).
Understood! So the 'custom' equality trait implemented for reference types is implemented such that it compares the dereferenced values.
addr_eq is the answer mostly, but note that it ignores wide pointer metadata like the length of a slice, and may also be misleading when zero-sized types are involved. Also it's not uncommon to return an arbitrary address for an empty slice, empty Vecs don't allocate by default, etc.
Thank you very much, I'll go through the documentation of std::ptr and its methods like some of the participants in this thread have suggested!
Yes, thanks for sharing. What I meant to ask about was 'overriding' assert_eq! to handle references in the way I want, i.e. comparing the addresses, but I understand that it is not possible as it would involve modifying library code.
I had to google a bit to find out that \371 and \036 are octal numbers. I wrote a little program to confirm that .asciz "\371=\036" indeed has the same binary representation as the little-endian, two's complement representation of 1981945.
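Such a check can be quite short. This is only a sketch of one way to do it, not the program mentioned above; the function name asciz_bytes is invented here. The four bytes are the two octal escapes, the ASCII character '=', and the NUL terminator that .asciz appends.

```rust
// The bytes emitted by `.asciz "\371=\036"`, including the trailing NUL.
fn asciz_bytes() -> [u8; 4] {
    [0o371, b'=', 0o036, 0]
}

fn main() {
    // 1981945 as a little-endian i32 is the same four bytes.
    assert_eq!(1981945i32.to_le_bytes(), asciz_bytes());
    // And reading the bytes back gives the original number.
    assert_eq!(i32::from_le_bytes(asciz_bytes()), 1981945);
}
```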
Often, you can get somewhat close to this effect by instead modifying the types involved (with light-weight wrapper types). E.g. the following can be used in assert_eq to mark references that should be compared by address:
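use std::{fmt, ptr};
// (definition reconstructed from the impl below) a wrapper marking a
// reference that should be compared by address rather than by value
struct ByAddr<'a, T>(&'a T);
impl<T> PartialEq for ByAddr<'_, T> {
    fn eq(&self, other: &Self) -> bool {
        ptr::eq(self.0, other.0)
    }
}
// let's add a custom `Debug` impl, so that a failing `assert_eq`
// gives us a better clue where the difference lies
impl<T: fmt::Debug> fmt::Debug for ByAddr<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ByAddr")
            // note: `fmt::from_fn` requires a recent toolchain
            .field("at_addr", &fmt::from_fn(|f| write!(f, "{:p}", self.0)))
            .field("value", self.0)
            .finish()
    }
}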
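Here is a self-contained sketch of how such a wrapper could be used with the example from the question. The struct definition and the PartialEq impl are assumptions about what the wrapper looks like, and the Debug impl uses format_args! instead of fmt::from_fn so that it also builds on older toolchains.

```rust
use std::{fmt, ptr};

// Assumed definition: a thin wrapper around a reference.
struct ByAddr<'a, T>(&'a T);

// Equality on the wrapper compares addresses, not referents.
impl<T> PartialEq for ByAddr<'_, T> {
    fn eq(&self, other: &Self) -> bool {
        ptr::eq(self.0, other.0)
    }
}

// Show both the address and the value when an assertion fails.
impl<T: fmt::Debug> fmt::Debug for ByAddr<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ByAddr")
            .field("at_addr", &format_args!("{:p}", self.0))
            .field("value", self.0)
            .finish()
    }
}

fn main() {
    let num1 = 42;
    let num2 = 42;
    let p1 = &num1;
    let p2 = &num2;
    let p3 = &num1;

    assert_eq!(p1, p2); // plain references: compared by value
    assert_ne!(ByAddr(p1), ByAddr(p2)); // wrapped: different addresses
    assert_eq!(ByAddr(p1), ByAddr(p3)); // wrapped: same address
}
```

The wrapper only changes the comparison at the places where you write ByAddr(...), so everything else keeps the usual by-value semantics.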