Cost of move vs. borrow?

I feel like this might have been brought up before, but I couldn't find anything definitive about it... and just wanted to make sure my mental model is right.

What I'm thinking is, a move (of a smart pointer like Box) is slightly more expensive than a reference since it is copying the pointer, but since the pointer is basically just a number on the stack, it's very fast - so ultimately the cost of a move vs borrow is negligible?

A borrow is also moving a pointer, because references are just pointers in memory. Think of Box as an owned pointer.

So in other words, is it the difference between these?

  1. A reference to a Box: moving a pointer (which happens to point to something which is also itself a pointer, but that's just circumstantial - a reference doesn't care what it points to, it's always just a pointer)
  2. An owned Box: moving a pointer (because a Box is a pointer so moving it is, well, moving a pointer)

I get it's slightly more complicated than that because we're not distinguishing between raw pointers vs. smart pointers - but is that roughly right? Therefore there's no performance impact between passing a reference to a Box vs. passing ownership of a Box, because in both cases we're just moving a pointer?

Sorry for being a bit overly pedantic - just want to get this right since it seems like a core thing to understand and I'm not coming to Rust from C++

Yes that's right

Note that you can get a &T out of a &Box<T> thanks to deref coercion, so you'll rarely see a reference to a Box in the wild.
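A minimal sketch of that coercion, using made-up names (`double`, `boxed`, `on_stack`) purely for illustration: a function that takes `&u32` accepts a `&Box<u32>` directly, which is why APIs seldom ask for `&Box<T>`.

```rust
// Takes a plain reference; it does not care where the u32 lives.
fn double(x: &u32) -> u32 {
    *x * 2
}

fn main() {
    let boxed: Box<u32> = Box::new(21);
    // `&boxed` has type `&Box<u32>`; deref coercion turns it into `&u32`.
    let a = double(&boxed);

    // The same function works for a value that lives on the stack.
    let on_stack: u32 = 21;
    let b = double(&on_stack);

    assert_eq!(a, b);
    println!("{} {}", a, b);
}
```

Either way the callee receives one pointer-sized argument.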

A Box is a real pointer to a real heap allocation. A borrow is more of a virtual pointer: in debug builds it compiles to a real pointer, but in release builds it is often optimized away by the compiler.

I think Rust (in its LLVM fork) also teaches LLVM about the existence of Rust's allocator, so in some cases a Box may be optimized out too.

Never mind, I don't actually see it happening.

Box is not a smart pointer (in the C++ sense). For plain data of known size it's a plain "dumb" pointer. Like the one you get from malloc(), except it can come from Rust's own allocator.

For trait objects and unsized types, Box and references (e.g. Box<str> and &str, or Box<dyn Trait> and &dyn Trait) are not single pointers but two-usize-wide "fat" pointers, holding either a data pointer and a length, or a data pointer and a vtable pointer.

In general:

  • Moving a type smaller than a pointer (e.g. u8) is cheaper than passing a reference to it.
  • Moving a Box copies the same amount of information as passing an equivalent &/&mut.
  • For Vec and String the move is slightly more expensive, since there's an extra word to copy (pointer, length and capacity, versus pointer and length for a slice), and &Vec/&String cost a double indirection to access, so in these cases &[]/&str win.
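The sizes behind these points can be checked with `std::mem::size_of`. This is only a sketch: the three-word size of Vec and String is what current standard-library implementations use, not a documented guarantee, so treat those two assertions as an assumption.

```rust
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // A u8 is smaller than any reference to it.
    assert!(size_of::<u8>() < size_of::<&u8>());

    // For a sized T, Box<T> and &T are both a single thin pointer.
    assert_eq!(size_of::<Box<u64>>(), word);
    assert_eq!(size_of::<&u64>(), word);

    // Slices and trait objects use fat pointers: two words.
    assert_eq!(size_of::<&[u8]>(), 2 * word);
    assert_eq!(size_of::<Box<dyn std::fmt::Debug>>(), 2 * word);

    // Assumption: Vec and String hold pointer, length and capacity
    // (three words in current implementations).
    assert_eq!(size_of::<Vec<u8>>(), 3 * word);
    assert_eq!(size_of::<String>(), 3 * word);

    // &Vec<u8> is one word, but it points at the Vec, which in turn
    // points at the data: two hops to reach an element.
    assert_eq!(size_of::<&Vec<u8>>(), word);

    println!("all size checks passed");
}
```

Size only tells you how many bytes a move or a reference copy has to transfer; whether that copy survives optimization is a separate question.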

Hmm, Box is a smart pointer in the C++ sense, even for plain data. The fact that it manages the memory (e.g. allocates on new and frees on drop) is already smart enough.

I’ve never seen that - is there an example, short of where the optimizer can see the box isn’t used at all?

The cost of a move is the same as of a copy. The actual cost depends on size of data being moved/copied. Borrowing just means you’re creating a reference that will then be copied.

C++ has a few smart pointer types, including shared_ptr. Also std::move is not the same as Rust's move (if used right it may optimize down to the same thing, but it also allows cases where it has a cost). I wanted to be clear that there isn't such cost to Box and the "smartness" is compile time.

unique_ptr is the analog to Box. But Box isn’t like raw memory (i.e. a raw pointer) from malloc.

I recall seeing an LLVM patch that makes it know that Rust's allocator methods are analogous to malloc, and I know LLVM is able to optimize out calls to malloc.

Do you mind explaining this a bit more? I don't yet understand why String is different than Box<str>

Thanks for the breakdown btw!

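Conceptually (a simplified sketch, not the real standard-library definitions):

struct String {
    chars: *mut u8,  // owned heap buffer with room for `capacity` bytes
    size: usize,     // bytes currently in use
    capacity: usize, // bytes allocated
}

and Box<str> is

struct BoxOfStr {
    chars: *mut u8, // owned heap buffer of exactly `size` bytes
    size: usize,
}

and &str is:

struct RefOfStr {
    chars: *const u8, // borrowed: does not own or free the buffer
    size: usize,
}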
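A rough way to see the same difference on the real types: converting a String into a Box<str> with `into_boxed_str` gives up the spare capacity, leaving just a pointer and a length. The variable names here are made up for the example.

```rust
use std::mem::size_of;

fn main() {
    let mut s = String::with_capacity(64);
    s.push_str("hello");

    // String tracks both the length and the capacity.
    assert_eq!(s.len(), 5);
    assert!(s.capacity() >= 64);

    // Box<str> has no capacity field: only pointer + length remain.
    let boxed: Box<str> = s.into_boxed_str();
    assert_eq!(boxed.len(), 5);

    // &str borrows the same bytes with the same pointer + length layout.
    let borrowed: &str = &boxed;
    assert_eq!(borrowed, "hello");

    assert_eq!(size_of::<Box<str>>(), size_of::<&str>());
    assert!(size_of::<String>() > size_of::<Box<str>>());

    println!("{}", borrowed);
}
```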

In a sense &T and Box<T> are the same: both are pointers. Box<T> always points to a T on the heap, while &T can point to a T anywhere. They differ in ownership, though: when you drop(my_box), the T on the heap is dropped and its memory is freed, but when a &T goes away only the pointer itself is discarded.
Here's an example demonstrating their sizes
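The example originally linked here is not part of this text. As a stand-in, here is a minimal sketch with a hypothetical `Noisy` type and `drop_counts` helper: it checks that Box<T> and &T have the same size, and that only dropping the Box runs the destructor.

```rust
use std::cell::Cell;
use std::mem::size_of;

// Hypothetical type that counts how many times it has been dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns the drop count after a borrow ends and after the Box is dropped.
fn drop_counts() -> (u32, u32) {
    let drops = Cell::new(0);
    let boxed = Box::new(Noisy(&drops));
    {
        // Borrowing and letting the reference go out of scope drops nothing.
        let _r: &Noisy<'_> = &boxed;
    }
    let after_borrow = drops.get();

    // Dropping the Box drops the Noisy on the heap and frees the allocation.
    drop(boxed);
    (after_borrow, drops.get())
}

fn main() {
    // Same size: both are a single pointer for a sized T.
    assert_eq!(size_of::<Box<u64>>(), size_of::<&u64>());
    assert_eq!(drop_counts(), (0, 1));
    println!("ok");
}
```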

2 Likes
#[derive(Debug)]
struct Y(u128);

fn main() {
    let a = Y(14);
    let b = a;
    println!("b: {:?}", b);
}

In the above code, is a actually copied to b? Can the compiler optimise this away?

The compiler will easily optimize this. To keep the assembly clearer, here is a snippet that involves simply returning the u128, rather than going through the formatting gook. You can see the compiler, starting at opt-level=1, will simply forward the 14 out (via two registers, to accommodate the u128 size).

In general, you can assume the compiler can eliminate superfluous moves/copies - the only thing noticeable to you will be the semantics attached to moves vs copies (i.e. whether the source of the operation is still usable/valid after the move/copy occurs).
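The linked snippet is not reproduced in this text. A function along these lines (the name `forward` is made up) is the kind of thing you could inspect yourself, for instance with `rustc -O --emit asm` or on Compiler Explorer, to see whether the move leaves any trace in the generated code:

```rust
struct Y(u128);

// Hypothetical stand-in for the snippet described above:
// build a value, move it, and return the inner number.
pub fn forward() -> u128 {
    let a = Y(14);
    let b = a; // move: `a` can no longer be used after this line
    b.0
}

fn main() {
    assert_eq!(forward(), 14);
    println!("{}", forward());
}
```

The observable difference between a move and a copy is in the type checker: uncommenting a later use of `a` would be a compile error here, because Y does not implement Copy.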

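#[derive(Debug)]
struct Z {
    a: i128,
}

fn check_address() {
    let z = Z { a: 10 };
    // Address of `z` in the caller's frame.
    println!("check_address {:p}", &z);
    inner(z);
}

fn inner(x: Z) {
    // Address of the moved value as seen by the callee.
    println!("inner {:p}", &x);
}

fn main() {
    check_address();
}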

A more involved example: if the compiler is going to optimize the move so that no actual copy happens, then I suppose x must be at the same address as z?

In the compiled code structs may not even exist. LLVM is able to see when the code doesn't depend on struct fields being laid out in memory, and take the struct apart as if every field was a separate variable.
