How do I think about references to numerical literals?

I am reading Chapter 13.2 of the Brown book and I came across Listing 13-12:

    #[test]
    fn iterator_demonstration() {
        let v1 = vec![1, 2, 3];

        let mut v1_iter = v1.iter();

        assert_eq!(v1_iter.next(), Some(&1));
        assert_eq!(v1_iter.next(), Some(&2));
        assert_eq!(v1_iter.next(), Some(&3));
        assert_eq!(v1_iter.next(), None);
    }

What caught my attention is the reference to an i32 literal &1 or &2 over here. I understand references as pointers to a physical location in memory; in fact, Chapter 4.2 of the Brown Book defines references as:

References are non-owning pointers, because they do not own the data they point to.

1 or 2 is data, not a variable; here we are not even checking assert_eq!(v1_iter.next(), Some(&v1[0]));. I can understand this working for string literals because string literals (based on my memory layout knowledge from C/C++) are stored in the data section of the memory, but I haven't heard of this happening for numerical literals; in fact, it doesn't even make sense to do so. So, what does this reference even mean theoretically?

Bonus question: One thing which is implicit here is that if assert_eq is passed references, it dereferences them automatically, i.e.

let num1 = 42;
let num2 = 42;
let p1 = &num1;
let p2 = &num2;
assert_eq!(p1, p2); // This passes

If references truly are pointers, then this should fail, because p1 stores the address of num1 and p2 stores the address of num2, but it passes, explaining my automatic dereference claim. Then how do I assert the equality of two references which point to the same address and not the value at the respective addresses, i.e.

let num1 = 42;
let num2 = 42;
let p1 = &num1;
let p2 = &num2;
let p3 = &num1;
assert_eq!(p1, p2); // This should fail
assert_eq!(p1, p3); // This should pass

So, what does this reference even mean theoretically?

I am unsure if this falls under constant promotion or temporary lifetime extension or a combination of both.

Bonus question

This is due to this PartialEq impl

impl<A, B> PartialEq<&B> for &A
where
    A: PartialEq<B> + ?Sized,
    B: ?Sized,

You can get hold of the memory address, but this is not normally a useful thing to do.

See std::ptr - Rust

Godbolt says that the numerical literal can be stored as a static in the assembly.

specifically for address equality there is the function addr_eq in std::ptr - Rust

Why not?

It's not automatic per se, it's due to how the equality traits are implemented -- to be about the equality of the referents (which one usually cares about) and not of the addresses (which one usually does not).
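A minimal sketch of that distinction, using the same num1/num2 setup as the question. The helper names same_value and same_address are made up for illustration; only `==` and std::ptr::eq come from the standard library.

```rust
use std::ptr;

/// Compares two references the way `==` does: by the values they point to.
/// (Hypothetical helper, named for this example only.)
fn same_value(a: &i32, b: &i32) -> bool {
    // Goes through the `PartialEq` impl for references, i.e. compares `*a` with `*b`.
    a == b
}

/// Compares two references by address only.
/// (Hypothetical helper, named for this example only.)
fn same_address(a: &i32, b: &i32) -> bool {
    ptr::eq(a, b)
}

fn main() {
    let num1 = 42;
    let num2 = 42;
    let (p1, p2, p3) = (&num1, &num2, &num1);

    assert!(same_value(p1, p2)); // equal referents
    assert!(!same_address(p1, p2)); // two distinct local variables
    assert!(same_address(p1, p3)); // both borrow `num1`
}
```

So the comparison the question asks for exists, it just is not what `==` (and therefore assert_eq!) does on references.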

addr_eq is the answer mostly, but note that it ignores wide pointer metadata like the length of a slice, and may also be misleading when zero-sized types are involved. Also it's not uncommon to return an arbitrary address for an empty slice, empty Vecs don't allocate by default, etc.
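To illustrate the wide-pointer caveat, here is a small sketch comparing a slice with a prefix of itself. The helper compare_slices is hypothetical; std::ptr::addr_eq and std::ptr::eq are the real functions (addr_eq needs a reasonably recent toolchain).

```rust
use std::ptr;

/// Returns `(addr_eq, eq)` for two slices.
/// (Hypothetical helper, named for this example only.)
fn compare_slices(a: &[i32], b: &[i32]) -> (bool, bool) {
    (ptr::addr_eq(a, b), ptr::eq(a, b))
}

fn main() {
    let array = [1, 2, 3];
    let whole: &[i32] = &array;
    let prefix: &[i32] = &array[..2];

    // Both slices start at the same address, so `addr_eq` says "equal"
    // even though they cover a different number of elements.
    // `ptr::eq` also compares the length metadata and says "not equal".
    assert_eq!(compare_slices(whole, prefix), (true, false));

    // An empty `Vec` has not allocated anything yet, so its address
    // does not identify a real allocation.
    let empty: Vec<i32> = Vec::new();
    assert_eq!(empty.capacity(), 0);
}
```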

Also, in Rust, one mostly does not compare references (or any other pointer type) by address. Nearly every trait that is implemented for references (and Box, Rc, Arc, etc.) looks at the referents, not the addresses. This is closely related to how the semantics of movability and &mut mean that Rust really does not have a notion of “object identity”. You can, under some circumstances, build it if you have a use for it, but it’s not natural or inherent.
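The same split shows up for smart pointers. A short sketch with Rc, where `==` looks at the contents and identity has to be requested explicitly through Rc::ptr_eq (compare_rcs is a hypothetical helper for this example):

```rust
use std::rc::Rc;

/// Returns `(equal contents, same allocation)` for two `Rc`s.
/// (Hypothetical helper, named for this example only.)
fn compare_rcs(a: &Rc<String>, b: &Rc<String>) -> (bool, bool) {
    (a == b, Rc::ptr_eq(a, b))
}

fn main() {
    let a = Rc::new(String::from("hello"));
    let b = Rc::new(String::from("hello"));
    let a2 = Rc::clone(&a);

    // Equal contents, separate allocations.
    assert_eq!(compare_rcs(&a, &b), (true, false));
    // Equal contents, same allocation.
    assert_eq!(compare_rcs(&a, &a2), (true, true));

    // `Box` behaves the same way under `==`: the boxed values are compared.
    assert_eq!(Box::new(1), Box::new(1));
}
```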

But that's exactly what it does. A reference like &1 puts a literal 1 in the .rodata section and then takes the address of it. Why would you think that it works differently from strings or really any other type of constant data?

Perhaps, because pure numeric literals, that is without a &, or numeric expressions, like (1+2), are typically not stored explicitly in a data segment. Numeric literals are sometimes referred to as rvalues, while variables, to which a value can be assigned, are referred to as lvalues.

So the basic principle is usually the following:

In a situation where you have a value and treating it in such a way that you expect a place, a temporary variable will be created.

This should actually work the same in C/C++, the linked page calls this a “temporary object”, and C++ lingo for what Rust calls a “value expression” is generally some variation of “rvalue” in C/C++, whereas Rust’s “place expression” corresponds to “lvalue”. Don’t ask me about the more detailed distinctions in C++ for different flavors of lvalue/rvalue, like “prvalue” or “glvalue” on that cppreference page, I don’t have any of this memorized either and it’s probably not too important.

(The “r” and “l” in these names correspond to the syntax of assignment. A basic assignment operation foo = bar has an lvalue on the left-hand side, and an rvalue on the right-hand side.)

The corresponding definition in the Rust reference states

Temporaries

When using a value expression in most place expression contexts, a temporary unnamed memory location is created and initialized to that value. The expression evaluates to that location instead, except if promoted to a static. The drop scope of the temporary is usually the end of the enclosing statement.


Now, for your concrete example at hand, &1 uses a literal 1, which is a value expression, and takes a reference to it, which is a “place expression context” (i.e. a context that expects a place expression[1]), so the rule could apply; and the behavior you can see matches that.
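As a rough mental model of the temporaries rule, the sketch below spells out the hidden temporary by hand. The functions double, runtime_value and first_is_one are made up for illustration, and the explicit tmp variable is only an approximation of what the compiler does, not its actual lowering.

```rust
fn double(n: &i32) -> i32 {
    *n * 2
}

fn runtime_value() -> i32 {
    21
}

/// Checks the first element of `v` against 1 in two equivalent ways.
/// (Hypothetical helper, named for this example only.)
fn first_is_one(v: &[i32]) -> bool {
    // Hand-written version of the temporary: a named place holding 1.
    let tmp = 1;
    let with_named_place = v.iter().next() == Some(&tmp);

    // The short form, where the place for 1 is created implicitly.
    let with_literal = v.iter().next() == Some(&1);

    with_named_place && with_literal
}

fn main() {
    // `runtime_value()` is a value expression; `&` needs a place, so an
    // unnamed temporary holds 21 until the end of this statement.
    assert_eq!(double(&runtime_value()), 42);

    assert!(first_is_one(&[1, 2, 3]));
    assert!(!first_is_one(&[2, 3]));
}
```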

However Rust also has a mechanism[2] of “promoting” a value instead. Promotion to a static can happen in some very specific circumstances, and it’s a feature that was added to Rust a bit later (i.e. not present in Rust 1.0) and designed in a way that it should generally not have very noticeable effects on code that doesn’t depend on it.[3]

So for learning purposes, you’re completely fine ignoring all the details about what and when gets promoted to live in the “data section” of your binary etc… and just handle things like &1 as if they really just use a temporary variable, and place the value 1 right into that temporary. Rust is deliberately designed so that you’re allowed to ignore this promotion. Anyway, if you are curious, let’s continue:

Code can actually depend on the static/constant promotion, because promoted references will gain a 'static lifetime which you can observe through the borrow checker (thus you can write code that would fail to compile if the promotion didn’t happen), e.g.

fn reference_that_literal() -> &'static i32 {
    &42
}

fn reference_that_sum() -> &'static i32 {
    &(40 + 2) // this also works!
}

(playground)

const fn add_one(n: i32) -> i32 {
    n + 1
}

// this fails to compile (as a safety measure; a custom function like `add_one` may fail
// to evaluate [panicking; or non-termination] and we must not implicitly turn a compile-time
// error into a run-time error)
fn reference_the_plus_one() -> &'static i32 {
    &add_one(123) // error[E0515]: cannot return reference to temporary value
}

// this works, but doesn't use promotion
fn use_reference_to_the_plus_one_locally() {
    println!("{}", &add_one(123)); // prints 124
}

(playground)

const fn add_one(n: i32) -> i32 {
    n + 1
}

// this way you opt into evaluation of `add_one` at
// compile-time explicitly and we allow it to work again!
fn reference_the_plus_one() -> &'static i32 {
    &const { add_one(123) }
}

fn also_works() -> &'static i32 {
    const { &add_one(123) }
}

fn also_also_works() -> &'static i32 {
    &(100 + const { add_one(23) })
}

// how to do things without (or before we had) `const {}` blocks
fn also_works_with_const_item() -> &'static i32 {
    const N: i32 = add_one(123);
    
    &N
}

(playground)


  1. which makes sense, since to create a pointer, you need a place to point to :wink:

  2. actually two mechanisms, @Bruecki mentioned them already; the other mechanism, “temporary lifetime extension”, I will not cover in this reply

  3. This design goal makes it a relatively complex feature altogether; I think this document is a good place to get a really in-depth overview.

yea, i suppose normally they'd be directly encoded as an immediate in the instruction if possible, but anything that has static lifetime and needs an address taken will be in .data, .rodata, or .bss.

I don’t see the number in the compiled assembly, where exactly is it stored? I changed it to 1981945 to make it easier to spot, but I still can’t find any instruction showing it as a static. Can you point it out? Thanks!

This works for the special std::ptr type, but what about the reference type?

Why not?

Because numeric literals can just be part of our code, most ISAs support rich instructions having literals as one of the operands, so it makes more sense to just have them as part of the code, rather than having them in the memory and loading them with expensive load instructions at runtime. However, string literals are much longer and I have never seen Assembly which operates directly on string literals, they are always operated on character-by-character after loading them from the data section of the memory.

It's not automatic per se, it's due to how the equality traits are implemented -- to be about the equality of the referents (which one usually cares about) and not of the addresses (which one usually does not).

Understood! So the 'custom' equality trait implemented for reference types is implemented such that it checks the dereferenced values.

addr_eq is the answer mostly, but note that it ignores wide pointer metadata like the length of a slice, and may also be misleading when zero-sized types are involved. Also it's not uncommon to return an arbitrary address for an empty slice, empty Vecs don't allocate by default, etc.

Thank you very much, I'll go through the documentation of std::ptr and its functions like some of the participants in this thread have suggested!

Could you please quote a source for this? Also, it feels like an inefficient way to handle things (I've explained why in my reply to @quinedot's post)

You have to coerce the reference to a pointer type first. Section 1 here.
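A small sketch of that coercion: once the references are turned into raw pointers, `==`, assert_eq! and assert_ne! compare addresses, because PartialEq on raw pointers is address-based. The helper addr_of is hypothetical; the cast itself is ordinary Rust and needs no unsafe.

```rust
/// Turns a reference into a raw pointer carrying the same address.
/// (Hypothetical helper, named for this example only.)
fn addr_of(r: &i32) -> *const i32 {
    r as *const i32
}

fn main() {
    let num1 = 42;
    let num2 = 42;
    let (p1, p2, p3) = (&num1, &num2, &num1);

    assert_eq!(p1, p2); // references: compared by value, passes

    assert_ne!(addr_of(p1), addr_of(p2)); // raw pointers: different addresses
    assert_eq!(addr_of(p1), addr_of(p3)); // raw pointers: same address
}
```

If one of these assertions fails, the message shows the two addresses rather than the values, since the Debug output of a raw pointer is its address.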

Have you not looked at the documented example? This absolutely works for references.

Examples

use std::ptr;

let whole: &[i32; 3] = &[1, 2, 3];
let first: &i32 = &whole[0];

assert!(ptr::addr_eq(whole, first));
assert!(!ptr::eq::<dyn std::fmt::Debug>(whole, first));

you can also use eq in std::ptr - Rust if you want only the same type, like the example shows

Yes, thanks for sharing, what I meant to ask was, 'overriding' assert_eq! to handle references in the way I want, i.e. comparing the addresses, but I understand that it is not possible as it would involve modifying library code.

Sure, it is stored here:

.Lanon.55c7f93690ae5f30547443bb3bd8fa7b.1:
        .zero   4

The first line being a label referring to 4 zeroed bytes created with .zero 4.

If we change 0 to 1981945, the value of the label changes to

.Lanon.55c7f93690ae5f30547443bb3bd8fa7b.1:
        .asciz  "\371=\036"

I had to google a bit to find out that \371 and \036 are octal escapes. I wrote a little program to confirm that .asciz "\371=\036" indeed has the same binary representation as the little-endian, two's complement representation of 1981945.
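That check can be reproduced in a few lines. This is my own sketch, not the program mentioned above; asciz_bytes is a hypothetical helper that writes out the bytes of the .asciz directive by hand, including the NUL terminator that .asciz appends.

```rust
/// The bytes emitted by `.asciz "\371=\036"`:
/// octal 371, ASCII '=', octal 036, plus the implicit NUL terminator.
/// (Hypothetical helper, named for this example only.)
fn asciz_bytes() -> [u8; 4] {
    [0o371, b'=', 0o036, 0]
}

fn main() {
    // Little-endian byte layout of the i32 matches the directive's bytes.
    assert_eq!(1981945_i32.to_le_bytes(), asciz_bytes());

    // And reading those bytes back as a little-endian i32 gives the number.
    assert_eq!(i32::from_le_bytes(asciz_bytes()), 1981945);
}
```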

Often, you can get somewhat close to this effect by instead modifying the types involved (with light-weight wrapper types). E.g. the following can be used in assert_eq to mark references that should be compared by address:

pub struct ByAddr<'a, T>(pub &'a T);

impl<T> PartialEq for ByAddr<'_, T> {
    fn eq(&self, other: &Self) -> bool {
        std::ptr::eq(self.0, other.0)
    }
}
impl<T> Eq for ByAddr<'_, T> {}

#[test]
fn iterator_demonstration() {
    let v1 = vec![1, 2, 3];

    let mut v1_iter = v1.iter();

    // success
    assert_eq!(v1_iter.next(), Some(&1));

    // success
    assert_eq!(v1_iter.next().map(ByAddr), Some(ByAddr(&v1[1])));

    // fails
    assert_eq!(v1_iter.next().map(ByAddr), Some(ByAddr(&3)));
}

use std::fmt;

// let's add a custom `Debug` impl, so that a failing `assert_eq`
// gives us a better clue where the difference lies
impl<T: fmt::Debug> fmt::Debug for ByAddr<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ByAddr")
            .field("at_addr", &fmt::from_fn(|f| write!(f, "{:p}", self.0)))
            .field("value", self.0)
            .finish()
    }
}

test iterator_demonstration ... FAILED

failures:

---- iterator_demonstration stdout ----

thread 'iterator_demonstration' (225) panicked at src/lib.rs:36:5:
assertion `left == right` failed
  left: Some(ByAddr { at_addr: 0x5ad6dd0264d8, value: 3 })
 right: Some(ByAddr { at_addr: 0x5ad6dc80c87c, value: 3 })

(playground)