Is a Type That's 1024 Bytes More Efficient to Reference or Copy?

It really depends on access patterns and probably also on a bunch of platform-specific stuff like ABI and cache line size. There are no guarantees.

However, in many real-world examples it's unlikely to make a major difference, if any at all. Which of these functions makes fewer copies?

struct Foo([u8; 1024]);

// Pass by value:
pub fn with_foo(callback: fn(Foo)) {
    callback(Foo([0; 1024]))
}
// vs. pass by reference:
pub fn with_foo(callback: fn(&Foo)) {
    let foo = Foo([0; 1024]);
    callback(&foo)
}

Trick question! These compile to the exact same assembly.

Which of the following functions makes fewer copies?

// Return by value:
pub fn get_foo() -> Foo {
    Foo([1; 1024])
}
// vs. write through an out pointer:
pub fn get_foo(out: &mut Foo) {
    *out = Foo([1; 1024]);
}

Trick question again! Both versions are passed an out pointer (for the first, the ABI inserts a hidden return slot because the value is too large to fit in registers) and compile to a single memset. (Due to ABI limitations, the second one is actually slightly better by an O(1) term, but not in a way that survives inlining.) Check it out on Compiler Explorer.
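To see that the two shapes are equivalent to the caller, here's a runnable sketch (functions renamed `get_foo_ret` and `get_foo_out` only so both can coexist in one file):

```rust
struct Foo([u8; 1024]);

// Return by value: the ABI passes a hidden out pointer for the caller's slot.
pub fn get_foo_ret() -> Foo {
    Foo([1; 1024])
}

// Explicit out pointer: the caller supplies the slot directly.
pub fn get_foo_out(out: &mut Foo) {
    *out = Foo([1; 1024]);
}

fn main() {
    let a = get_foo_ret();
    let mut b = Foo([0; 1024]);
    get_foo_out(&mut b);
    // Either way the caller ends up with the same bytes in its own slot.
    assert_eq!(a.0, b.0);
}
```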

(Note that implementing Copy has no effect on code generation. Copy determines what kinds of code you can write, but it doesn't influence what valid code compiles to once written.)
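A small illustration of that point: deriving `Copy` changes which programs are accepted, not what the accepted programs compile to. (The type names here are made up for the example.)

```rust
#[derive(Clone)]
struct NoCopy([u8; 4]);

#[derive(Clone, Copy)]
struct WithCopy([u8; 4]);

fn main() {
    let a = WithCopy([1, 2, 3, 4]);
    let b = a; // implicit copy; `a` remains usable
    assert_eq!(a.0, b.0);

    let c = NoCopy([1, 2, 3, 4]);
    let d = c; // move; `c` is no longer usable
    // assert_eq!(c.0, d.0); // would not compile: use of moved value `c`
    assert_eq!(d.0[0], 1);
}
```

In both cases the assignment itself lowers to the same bitwise copy of four bytes; `Copy` only decides whether the source stays live afterward.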

In conclusion, there are no guarantees, but if you're fiddling about with references and out pointers in order to save copies, you should be profiling every change to be sure the difference is what you think it is.
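As a starting point, here's a minimal timing sketch using `std::time::Instant` and `std::hint::black_box` (stable since Rust 1.66) to keep the optimizer from deleting the work; the workload is made up, and a real comparison should use a proper harness such as criterion:

```rust
use std::hint::black_box;
use std::time::Instant;

struct Foo([u8; 1024]);

fn by_value(foo: Foo) -> u8 {
    foo.0[0]
}

fn by_ref(foo: &Foo) -> u8 {
    foo.0[0]
}

fn main() {
    const N: u32 = 1_000_000;

    // Time the pass-by-value version.
    let start = Instant::now();
    for _ in 0..N {
        black_box(by_value(black_box(Foo([0; 1024]))));
    }
    let val_time = start.elapsed();

    // Time the pass-by-reference version.
    let start = Instant::now();
    for _ in 0..N {
        let foo = black_box(Foo([0; 1024]));
        black_box(by_ref(&foo));
    }
    let ref_time = start.elapsed();

    println!("by value: {val_time:?}, by ref: {ref_time:?}");
}
```

Run it in release mode (`cargo run --release`); debug builds will tell you nothing useful about either variant.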
