So since it does then, I have a question. Wouldn't using a reference of an int be slower than just copying it? Is a reference a pointer, if not how does it work? Ex:
fn foo(num: &i32) {
    println!("{}", num);
}
In the implementation I'm asking about, I believe this to be slower than the non-reference version, because an int is so small that copying it would be faster than creating (I think) a pointer.
It usually is. That's why, if you have a small Copy type (that is, almost any Copy type, except some rare cases like [T; N] with T: Copy and N on the order of thousands), it's almost always better to pass it around by value rather than borrow it.
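To make the "is a reference a pointer?" part concrete, here is a minimal sketch. It only checks sizes and shows the two calling styles; the function names print_by_value and print_by_ref are made up for this example. At runtime a plain reference like &i32 is pointer-sized, so on a typical 64-bit target it is larger than the 4-byte i32 it points to.

```rust
use std::mem::size_of;

// Hypothetical helper for this example: takes the integer by value (a 4-byte copy).
fn print_by_value(num: i32) {
    println!("{}", num);
}

// Hypothetical helper for this example: takes a reference (a pointer-sized address).
fn print_by_ref(num: &i32) {
    println!("{}", num);
}

fn main() {
    // A thin reference has the same size as a raw pointer and as usize.
    assert_eq!(size_of::<&i32>(), size_of::<*const i32>());
    assert_eq!(size_of::<&i32>(), size_of::<usize>());

    println!(
        "i32: {} bytes, &i32: {} bytes",
        size_of::<i32>(),
        size_of::<&i32>()
    );

    let n = 42;
    print_by_value(n); // i32 is Copy, so n is still usable afterwards
    print_by_ref(&n);
}
```

Whether the reference version is actually slower in a given program still depends on what the optimizer does with it, for instance whether the call gets inlined.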
wdym by order of thousands?
Actually, I think that in Rust, thanks to the ownership system, both passing by value and passing by reference are efficient, and the compiler optimizes them heavily, so the efficiency difference between them is usually very small.
Rust generally tries to avoid unnecessary copying of data during function calls, so passing parameters by reference is usually efficient. Meanwhile, Rust's borrow checker ensures at compile time that any data access through a reference is safe, so that safety comes without extra runtime cost.
given that the efficiency difference is small, which is faster, copying or refing an int?
unless you are in a generic context, this is not something worth bothering about; just choose whatever you feel comfortable with. for primitives like integers, it's usually easier to just write i32 instead of &i32 (or even &mut i32).
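A rough sketch of what the "generic context" caveat might look like in practice. The functions describe_i32 and describe are hypothetical names for this example. With a concrete i32 you can just take the value; with a generic T you don't know whether T is small or even Copy, so borrowing is the usual default.

```rust
use std::fmt::Display;

// Concrete type: an i32 is small and Copy, so taking it by value is the simple choice.
fn describe_i32(n: i32) -> String {
    format!("value = {n}")
}

// Generic type: T might be large or not Copy (e.g. String), so borrow it instead.
fn describe<T: Display>(item: &T) -> String {
    format!("value = {item}")
}

fn main() {
    let n = 7;
    let s = String::from("seven");

    println!("{}", describe_i32(n));
    println!("{}", describe(&n));
    println!("{}", describe(&s));

    // s was only borrowed, so it is still usable here.
    println!("{s}");
}
```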
I would still like to know; this level of hyperoptimization is required for the task I'm doing.
Due to optimizations and the intricacies of modern hardware, it is completely impossible to give a blanket statement. Usually there's no difference at all.
They should be identical when optimizations are applied, but as always, it depends. As an extremely trivial example, this code:
pub fn square(num: u64) -> u64 {
    num * num
}

pub fn square_ref(num: &u64) -> u64 {
    num * num
}

pub fn main() {
    let first = 5;
    let second = first + 5;
    let first_sq = square(first);
    let second_sq = square_ref(&second);
    println!("{first_sq}, {second_sq}")
}
compiles to this assembly:
assem::main:
    sub rsp, 104
    mov qword ptr [rsp + 8], 25
    mov qword ptr [rsp + 16], 100
    lea rax, [rsp + 8]
    mov qword ptr [rsp + 24], rax
    mov rax, qword ptr [rip + _ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$u64$GT$3fmt17h01178fb6651e0facE@GOTPCREL]
    mov qword ptr [rsp + 32], rax
    lea rcx, [rsp + 16]
    mov qword ptr [rsp + 40], rcx
    mov qword ptr [rsp + 48], rax
    lea rax, [rip + .L__unnamed_2]
    mov qword ptr [rsp + 56], rax
    mov qword ptr [rsp + 64], 3
    mov qword ptr [rsp + 88], 0
    lea rax, [rsp + 24]
    mov qword ptr [rsp + 72], rax
    mov qword ptr [rsp + 80], 2
    lea rdi, [rsp + 56]
    call qword ptr [rip + _ZN3std2io5stdio6_print17heb973b84961ff1dbE@GOTPCREL]
    add rsp, 104
    ret
My knowledge of assembly is extremely poor, but if I'm not mistaken, the compiled code skips the function calls and the multiplication altogether and simply stores the immediates 25 and 100 within the first three instructions. The rest is just printing to the console.
Note: if the print statement is removed, main is compiled into a single instruction: ret. I.e. "you're not doing anything with these numbers, so I'll ignore them and return without doing anything".
Now obviously this wouldn't happen if the numbers weren't compile-time constants, but the point is that at a micro-optimization level like this, the compiler has a lot of freedom in how it compiles.
Regarding the small efficiency differences between passing by value and passing by reference: the answer depends on the project requirements and the actual hardware, and no one can give a definite answer. To get the efficiency your task requires, try improving the algorithm in your implementation first, rather than trying to draw conclusions about passing by value versus passing by reference.
In general, you need to design and run benchmarks, then use them to test different approaches on real data using the same hardware the code will run on.
Also, how do you know the optimiser isn't already picking the "right" approach for you automatically? If you really need that level of optimisation, you have presumably already learned assembly for your target platform, right?
If it were possible to summarise universally applicable "hyperoptimisation" techniques in a forum post, that information would already be baked into the compiler and you wouldn't need to know it. And hey, maybe it already is.
(But in super, super general terms: smaller is faster, fewer pointers is faster, which of those two is faster depends, so run benchmarks.)
If you require it, measure it. That's the simple answer. Everything else is guesswork.
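For what "measure it" could look like, here is a minimal timing sketch using only the standard library. It is an assumption-laden illustration, not a proper benchmark: time_it is a hypothetical helper, the iteration count is arbitrary, and a dedicated harness (such as the Criterion crate) would give more trustworthy numbers. std::hint::black_box is used so the optimizer cannot treat the inputs as constants and fold everything away. Build with --release, otherwise the numbers mean little.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

fn square(num: u64) -> u64 {
    num.wrapping_mul(num)
}

fn square_ref(num: &u64) -> u64 {
    num.wrapping_mul(*num)
}

// Hypothetical helper: calls `f` on 0..iters and returns the elapsed time
// together with a checksum, so the work cannot be discarded as unused.
fn time_it(iters: u64, f: impl Fn(u64) -> u64) -> (Duration, u64) {
    let start = Instant::now();
    let mut acc = 0u64;
    for i in 0..iters {
        // black_box hides the input value from the optimizer.
        acc = acc.wrapping_add(f(black_box(i)));
    }
    (start.elapsed(), black_box(acc))
}

fn main() {
    let iters = 10_000_000; // arbitrary; tune for your machine

    let (t_val, sum_val) = time_it(iters, |n| square(n));
    let (t_ref, sum_ref) = time_it(iters, |n| square_ref(&n));

    // Both variants must compute the same result.
    assert_eq!(sum_val, sum_ref);

    println!("by value:     {t_val:?}");
    println!("by reference: {t_ref:?}");
}
```

Run it several times on the hardware the code will actually run on; single runs are noisy, and the two timings may well come out indistinguishable.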
They mean a large chunk of a Copy type.
[T; N] with T: Copy and N on the order of thousands)
[T; 4321] would probably be too big to send as a copy.
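As a rough illustration of the sizes involved, assuming T = u64 (the element type is my choice for this example, as are the names Big, sum_by_value and sum_by_ref):

```rust
use std::mem::size_of;

// Hypothetical large Copy type for this example: 4321 * 8 bytes.
type Big = [u64; 4321];

// By value: in principle the whole array is copied into the callee
// (the optimizer may avoid the copy, but that is not guaranteed).
fn sum_by_value(data: Big) -> u64 {
    data.iter().sum()
}

// By reference: only a pointer-sized address is passed.
fn sum_by_ref(data: &Big) -> u64 {
    data.iter().sum()
}

fn main() {
    let data: Big = [1; 4321];

    println!("Big:  {} bytes", size_of::<Big>());
    println!("&Big: {} bytes", size_of::<&Big>());

    // Big is Copy, so data is still usable after being passed by value.
    assert_eq!(sum_by_value(data), sum_by_ref(&data));
}
```

So for an i32 the reference is the bigger thing to pass, while for an array like this the reference is thousands of times smaller than the value.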