Performance issues with returning tuples

Rust cannot directly return multiple values like Golang; it can only simulate this with tuples. Does initializing a tuple incur extra performance overhead?

You've got it backwards. What Go is doing is special-casing tuples so that they are only usable as return values (they rebranded tuples as "multiple return values").

Tuples, in Rust and other languages which have tuples, are conceptually the same as multiple return values in Go but can be used anywhere.

There shouldn't be any performance difference. A tuple is just a struct with unnamed fields and, conversely, a struct is just a tuple with named fields; after compilation they are the same.
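To make the "same after compilation" point a bit more tangible, here is a minimal sketch comparing a tuple with an equivalent struct. The names `Point`, `as_tuple`, `as_struct` and `layouts` are made up for illustration and do not come from the thread. It only checks size and alignment, which is a weaker statement than "identical machine code", but it does show that neither type carries a hidden header or pointer.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical named-field struct with the same field types as `(u32, u32)`.
pub struct Point {
    pub x: u32,
    pub y: u32,
}

/// Returns two values as a tuple; nothing is heap-allocated.
pub fn as_tuple(a: u32, b: u32) -> (u32, u32) {
    (a, b)
}

/// Returns the same two values as a named struct.
pub fn as_struct(a: u32, b: u32) -> Point {
    Point { x: a, y: b }
}

/// (size, alignment) in bytes of the tuple, then of the struct.
pub fn layouts() -> ((usize, usize), (usize, usize)) {
    (
        (size_of::<(u32, u32)>(), align_of::<(u32, u32)>()),
        (size_of::<Point>(), align_of::<Point>()),
    )
}
```

Calling `layouts()` should report the same pair for both types: two `u32` fields, and nothing else.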

Short answer: no.

Longer answer: You might be thinking of a tuple as some kind of “object” that needs to be allocated, initialized, passed by reference, and later unpacked and freed. It isn’t. Structs and tuples in Rust consist directly of the data in their fields, with no indirection whatsoever. At the implementation level (except perhaps when you build without optimizations), smaller structs are often simply returned from functions in parts, with each field in its own register, while larger structs and values (once the data no longer fits in the readily available registers) are returned by writing them to an appropriate area on the stack. Or a combination of the two. That’s the same approach used for passing multiple function arguments when a function is called.

So, no, tuples don’t incur “overhead” and neither does any other struct that groups together multiple values.


In case you'd like to see a concrete example, look at this compilation output:

It compiles

pub fn h2() -> (u32, u32) {
    (f(10), g(10))
}

into

example::h2:
        push    rbx
        mov     edi, 10
        call    example::f
        mov     ebx, eax
        mov     edi, 10
        call    example::g
        mov     edx, eax
        mov     eax, ebx
        pop     rbx
        ret

As you can see, f and g, which are fn(u32) -> u32 functions, accept their parameter in edi and return the result in eax. Then h2 saves the result of f in ebx temporarily (to make eax available again) and, after calling g, moves the result of g into edx and the stored result of f back into eax. So in this case the (u32, u32) was returned in two registers: the first u32 in eax and the second in edx. I would simply call this behavior “multiple return values”, intuitively speaking; the tuple is “gone” here. For further illustration, you can see the exact same thing happen with structs such as struct Point { x: u32, y: u32 }, too!
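For anyone who wants to reproduce the comparison, here is a rough, self-contained sketch of the tuple version next to a struct version. The bodies of `f` and `g` are placeholders invented here (the thread does not show them), and `h2_point` is a hypothetical name for the struct variant. Pasting this into a compiler explorer with optimizations enabled should let you compare the two outputs yourself; the exact instructions may differ by compiler version and target.

```rust
/// Struct with the same fields as the tuple `(u32, u32)`.
pub struct Point {
    pub x: u32,
    pub y: u32,
}

// Placeholder bodies: the original `f` and `g` are not shown in the thread.
// `#[inline(never)]` keeps the calls visible in the generated assembly.
#[inline(never)]
pub fn f(n: u32) -> u32 {
    n.wrapping_mul(3)
}

#[inline(never)]
pub fn g(n: u32) -> u32 {
    n.wrapping_add(7)
}

/// Tuple version, as in the post above.
#[inline(never)]
pub fn h2() -> (u32, u32) {
    (f(10), g(10))
}

/// Struct version (hypothetical name) returning the same two values.
#[inline(never)]
pub fn h2_point() -> Point {
    Point { x: f(10), y: g(10) }
}
```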

By the way, many standard library types are structs. E.g. a String consists of 3 usize-sized values for the pointer, length, and capacity, so a function fn(…) -> String will usually return its result using 3 registers (edit: apparently this does not actually happen in registers, see the answer below). You can also see liberal use of inline(never) in my examples, if you click the links, since inlining would get rid of the function calls, and thus of the passing around of arguments and return values, entirely.

IIRC only two usizes are returned in registers (a Scalar or ScalarPair); three are always returned via the stack.
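The sizes involved are easy to check. The sketch below assumes the current standard library representation of String (pointer, length, capacity); as far as I know that layout is an implementation detail rather than a documented guarantee, and the helper names `sizes_in_words` and `string_parts` are made up here. It says nothing about which registers are used; it only shows that a String is three words while a two-field value like (usize, usize) or &str is two.

```rust
use std::mem::size_of;

/// Sizes of a few types, measured in machine words (`usize`).
/// Order: `String`, `(usize, usize)`, `&str`.
pub fn sizes_in_words() -> (usize, usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<String>() / word,
        size_of::<(usize, usize)>() / word,
        size_of::<&str>() / word,
    )
}

/// The three values a `String` carries: data pointer, length, capacity.
pub fn string_parts(s: &String) -> (*const u8, usize, usize) {
    (s.as_ptr(), s.len(), s.capacity())
}
```

On current toolchains `sizes_in_words()` gives `(3, 2, 2)`, which fits the two-versus-three distinction described above.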

Ah, maybe I should have tested that to double check :innocent:

Does a struct used to return multiple values in C not need to be allocated, initialized, passed by reference, and later unpacked and freed?

No, structs can be passed and returned by value in C, just like Rust, and there is no malloc. C never implicitly calls malloc.
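Keeping to Rust for the example: the sketch below defines a function with the C calling convention that returns a C-layout struct by value. `MinMax` and `min_max` are hypothetical names chosen for illustration. The equivalent C function would return `struct MinMax` the same way, with no allocation anywhere.

```rust
/// `#[repr(C)]` requests the field layout a C compiler would use for
/// `struct MinMax { int32_t min; int32_t max; };`.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MinMax {
    pub min: i32,
    pub max: i32,
}

/// Uses the platform's C calling convention and returns the struct by value.
/// There is no heap allocation and no explicit out-pointer in the signature.
pub extern "C" fn min_max(a: i32, b: i32) -> MinMax {
    if a <= b {
        MinMax { min: a, max: b }
    } else {
        MinMax { min: b, max: a }
    }
}
```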

It depends on the calling convention, but from memory, a function result that does not fit in registers is generally stored in space that the caller reserves on its own stack before making the call. On common ABIs such as x86-64 System V, the caller passes the address of that space as a hidden pointer argument, so the programmer never writes the reference explicitly, and the called function simply writes the result there. If the result is subsequently used as an argument for another function call, it may already be in a suitable position on the stack. This is the general idea of a "stack machine", although many function parameters end up being passed in registers. [ All this may not be quite right... but it is what I remember from looking at the calling conventions a few months ago, my memory is not good! ]
