A bit confused what a variable is

Just following this example from the interactive book:

fn main() {
    let mut x: Box<i32> = Box::new(1);
    let a: i32 = *x;         // *x reads the heap value, so a = 1
    *x += 1;                 // *x on the left side modifies the heap value,
                             //     so x points to the value 2

    let r1: &Box<i32> = &x;  // r1 points to x on the stack
    let b: i32 = **r1;       // two dereferences get us to the heap value

    let r2: &i32 = &*x;      // r2 points to the heap value directly
    let c: i32 = *r2;        // so only one dereference is needed to read it
}

* is like "clicking" on a reference: it follows a heap pointer to the value, and in this case it also allows updating that value. From that, one would say x is simply the reference.

x is the variable bound to the instance of Box<i32>, which is located on the stack; it has some properties etc., and a heap pointer to the numeric value (the i32).

  • So is x just a reference that leads to its value when de-referenced, or what would be a good way to think about it (for a person newly introduced to Rust)?

  • Or should I think of x as a pointer to the start of the item in the stack? Or is x a Stack address?

  • Or what?

Ah yes, the syntax is the cause of your confusion and that is understandable.

  • x is the name of a "thing" on the stack. The "thing" here is a Box which is a struct with a pointer to a heap "object". In this case, it is a Box<i32>, so the Box contains a pointer to an i32 on the heap, initialized to 1.
  • *x works because of two traits called Deref and DerefMut. Implementing these traits for your struct makes any "object" of that struct dereferenceable with the * operator, like a reference.
  • In Rust, the * operator also allows you to de-reference a reference.

You can read the docs for Deref and DerefMut.
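
To make the Deref/DerefMut point concrete, here is a minimal sketch. `Wrapper` is a made-up type for illustration only; it is not how Box itself is defined.

```rust
use std::ops::{Deref, DerefMut};

// `Wrapper` is a hypothetical type, used only to illustrate the traits.
struct Wrapper {
    value: i32,
}

impl Deref for Wrapper {
    type Target = i32;

    // Called when `*w` is used to read.
    fn deref(&self) -> &i32 {
        &self.value
    }
}

impl DerefMut for Wrapper {
    // Called when `*w` is used to write.
    fn deref_mut(&mut self) -> &mut i32 {
        &mut self.value
    }
}

fn demo() -> (i32, i32) {
    let mut w = Wrapper { value: 1 };
    let before: i32 = *w; // goes through Deref::deref
    *w += 1;              // goes through DerefMut::deref_mut
    (before, *w)
}

fn main() {
    let (before, after) = demo();
    println!("{before} {after}");
}
```

The same `*` syntax works on a plain `&i32` or `&mut i32` without any trait impl being involved.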

In Rust, the word "reference" usually refers[1] to a shared reference (&_) or exclusive reference (&mut _), so you might want to use a broader term like "pointer". I think of Box<_> as an owning pointer.

Rust doesn't distinguish pointers by where they point at the language level, but at the library level a Box guarantees that it points at the heap (or nowhere in particular, for the case of boxing a zero-sized type).

You can conceptually think of x as being the Box pointer itself, which is stored on the stack (and that pointer points at the heap). It's somewhat like the String examples in these diagrams.
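
A small sketch of that mental model, assuming nothing beyond the standard library. The helper name `box_is_pointer_sized` is just for illustration. It checks that a Box<i32> is the size of a single pointer, and prints the two different addresses involved.

```rust
use std::mem::size_of;

// Hypothetical helper: a `Box<i32>` takes up exactly one pointer's worth of space.
fn box_is_pointer_sized() -> bool {
    size_of::<Box<i32>>() == size_of::<*const i32>()
}

fn main() {
    let x: Box<i32> = Box::new(1);

    // Where the variable `x` itself (the Box pointer) is stored.
    let addr_of_x: *const Box<i32> = &x;
    // Where the Box points, i.e. where the i32 is stored.
    let addr_in_x: *const i32 = &*x;

    // The exact addresses vary from run to run.
    println!("x is stored at {addr_of_x:p} and points at {addr_in_x:p}");
    assert!(box_is_pointer_sized());
}
```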

r1                 x, *r1             *x, **r1, *r2      r2 (&*x, &**r1)
+-----------+      +-----------+      +-----------+      +-----------+
| &Box<i32> | ---> | Box<i32>  | ---> | i32       | <--- | &i32      |
+-----------+      +-----------+      +-----------+      +-----------+
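
If you'd like to check the diagram yourself, something like the following should work. `diagram_holds` is a made-up name, and `std::ptr::eq` compares addresses rather than values.

```rust
// Hypothetical helper that checks the labels on the diagram's boxes.
fn diagram_holds() -> bool {
    let x: Box<i32> = Box::new(2);
    let r1: &Box<i32> = &x;
    let r2: &i32 = &*x;

    // `*r1` is the same place as `x`.
    let same_box = std::ptr::eq(&*r1, &x);
    // `*x`, `**r1` and `*r2` are all the same place on the heap.
    let same_int = std::ptr::eq(&*x, &**r1) && std::ptr::eq(&**r1, r2);

    same_box && same_int
}

fn main() {
    assert!(diagram_holds());
    println!("diagram holds");
}
```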

Dereferencing (like *x) creates a place expression representing the memory position pointed to by x.[2] Depending on the surrounding code, you can either do something with the value at that memory position while leaving it in place, or you can copy or move out of that place.

// Copies out of `*x`.  It would be a move if `i32` did not implement `Copy`
let a = *x;
// Modifies the value in place (without copying/moving)
*x += 1;
// Create a reference to the place without copying/moving the value
let r2 = &*x;

Variables are actually also place expressions.

// Create a reference to the place without copying/moving the value
let r1 = &x;
// This would move the `Box<_>`.
let y = x;
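
Here is a sketch that puts the copy, move and in-place cases side by side, using a Box<String> to show what happens when the pointee is not `Copy`. The function name `demo` is just for illustration.

```rust
fn demo() -> (i32, i32, usize, String) {
    let x: Box<i32> = Box::new(1);
    let a: i32 = *x;        // copies out of the place `*x`; `x` is still usable
    let y: Box<i32> = x;    // moves the Box out of the place `x`; `x` is unusable after this

    let s: Box<String> = Box::new(String::from("hi"));
    let len: usize = s.len(); // uses the String in place, nothing is moved
    let owned: String = *s;   // `String` is not `Copy`, so this moves out of `*s`

    (a, *y, len, owned)
}

fn main() {
    println!("{:?}", demo());
}
```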

You can read more about place expressions here.


  1. heh ↩︎

  2. It's a built-in operation for references and Box. For other Deref implementors, like Vec<_> or String or Arc<_>, it represents *<_ as Deref>::deref(&variable). deref returns a reference, so this is still representing the memory position pointed to by some pointer. You can generally suppose the Deref implementation is "reasonable" and thus still think of it as what the variable points to (even though implementations can technically return a reference to static memory or such instead). ↩︎
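
As a rough illustration of footnote 2 for String, the sketch below spells out both forms and checks that they land on the same str. `star_matches_trait_call` is a made-up name.

```rust
use std::ops::Deref;

// Hypothetical helper: compares `*s` with the explicit trait call it stands for.
fn star_matches_trait_call() -> bool {
    let s = String::from("hello");

    let via_star: &str = &*s;
    let via_trait: &str = &*<String as Deref>::deref(&s);

    // Same address and length, not just equal contents.
    std::ptr::eq(via_star, via_trait)
}

fn main() {
    assert!(star_matches_trait_call());
    println!("both forms refer to the same str");
}
```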

Your aside mentioned that deref is built-in for Box, but there's a perfectly reasonable-looking Deref impl for Box here? boxed.rs - source

That's to satisfy trait bounds and direct calls. It's not used for * or that definition would recurse. &_ and &mut _ have implementations too, and they'd also be recursive if * wasn't built-in.

The built-in dereference for Box is what allows moving things out of a Box, tracking uninit state so you can reinitialize a Box, and splitting borrows through a Box. These are things you can't do with *Deref::deref(..) or *DerefMut::deref_mut(..).
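
A sketch of those three abilities, as I understand them. `Pair` and the function names are hypothetical, made up for this example.

```rust
// `Pair` is a hypothetical struct used to show split borrows.
struct Pair {
    left: String,
    right: String,
}

fn move_out_and_reinit() -> (String, String) {
    let mut b: Box<String> = Box::new(String::from("first"));
    let moved: String = *b;      // move the String out; the Box is now "empty"
    *b = String::from("second"); // reinitialize the place `*b`
    (moved, *b)                  // the Box is fully usable again
}

fn split_borrows() -> (String, String) {
    let mut p: Box<Pair> = Box::new(Pair {
        left: String::from("l"),
        right: String::from("r"),
    });

    // Two live `&mut` borrows of different fields, both through the Box.
    let left: &mut String = &mut p.left;
    let right: &mut String = &mut p.right;
    left.push('!');
    right.push('?');

    (p.left.clone(), p.right.clone())
}

fn main() {
    println!("{:?}", move_out_and_reinit());
    println!("{:?}", split_borrows());
}
```

With a user-defined type that only has DerefMut, the two `&mut` field borrows would each call `deref_mut` on the whole value, so the borrow checker would reject the second one.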

Ah, I mentally inserted a .0 in there! I'm on very shaky ground about the box magic, especially with the reverted box syntax.

That's very helpful (also the other answer, but this clicked for me). Thank you.