Inherently inefficient calling convention in Rust?

Because I want the callee to use the object in full and drop it.
I also saw some examples where moving was used extensively to implement type states, but if I had to mark all the functions #[inline], wouldn't that ruin incremental compilation of a multi-crate project?

If you mean that this is the same as {*foo}.x (a move out of the box, then a partial move out of a temporary struct), know that:

struct Point {
    x: Vec<()>, // glorified non-`Copy` usize :P
    y: Vec<()>,
    z: Vec<()>,
}

fn check(it: Box<Point>) {
    let _x = (*it).x;
    let _y = (*it).y;
}

compiles fine, meaning there is indeed a partial move out of the boxed struct.
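
For contrast, a minimal sketch of why Box is special here: the compiler only lets you move out of a dereferenced Box, so the same partial move behind an ordinary reference is rejected, and only borrowing the field compiles:

fn check_ref(it: &Point) {
    // let _x = (*it).x; // error[E0507]: cannot move out of data
    //                   // behind a shared reference
    let _x = &(*it).x; // borrowing the field instead is fine
}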


Moving that just reuses the address and has to ensure the original hasn't been changed or moved sounds like… borrowing!

Semi-serious: could Rust run the actual borrow checker as an optimization pass on moves, and turn moves into references where it can see that it's safe?


OK, maybe I asked the question the wrong way. My thought was that moving in Rust is a zero-overhead abstraction, which can, for example, be used this way: Typestate Programming - The Embedded Rust Book
It seems I was wrong :(
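
For reference, here is a minimal sketch of that typestate pattern (the state names are illustrative, not taken from the book): every transition takes self by value, so the old state is moved away and can never be reused, which is enforced at compile time and is ideally free at runtime:

struct Idle;
struct Running;

impl Idle {
    // Consuming `self` makes the old `Idle` handle unusable afterwards.
    fn start(self) -> Running {
        Running
    }
}

impl Running {
    fn stop(self) -> Idle {
        Idle
    }
}

fn main() {
    let machine = Idle;
    let machine = machine.start();
    let _machine = machine.stop();
    // machine.stop(); // would not compile: `machine` was already moved
}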

It is zero-overhead after LLVM optimizes it down to zero.

Everything the Rust compiler emits has a ton of overhead and garbage in debug mode, and in release mode LLVM is given the task of cleaning all of that up down to zero.


The OP example was compiled with optimizations enabled.
All that bothers me is that inlining seems to be critical for such simple optimizations. This basically means that big Rust programs that want to be efficient are doomed to compile eternally, since no incremental compilation is possible.

It seems worth restating with emphasis that you can already make Rust do pass-by-reference just by putting &s in your function signatures. Big programs are certainly not "doomed to compile eternally", except in the trivial sense that we're all human and "doomed" to make some mistakes we won't know to fix until we run a profiler, and optimizers will always have to choose whether the runtime benefit of an optimization outweighs the compile-time cost.
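
Concretely, a sketch reusing the Point type from the example above (inspect and consume are hypothetical names): the reference-taking version is guaranteed pass-by-reference at any optimization level, while eliding the copy in the by-value version is left to the optimizer:

// Pass by reference: no move of Point's three Vec headers,
// regardless of optimization level.
fn inspect(it: &Point) -> usize {
    it.x.len() + it.y.len() + it.z.len()
}

// Pass by value: semantically a move; whether the bytes are actually
// copied is up to the optimizer (and often to inlining).
fn consume(it: Point) -> usize {
    it.x.len() + it.y.len() + it.z.len()
}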

In terms of improving compile time for big projects, this is among the lowest-hanging fruit, so I'm sure every project that cares consciously did this ages ago where it matters. From what I've read, such projects typically spend much more time worrying about things like how many dependencies they have, what macros they're using, monomorphization of generics, and how to break up large bottleneck crates into smaller crates that can be built in parallel.


#[inline(never)] disables inlining, the basic transformation that enables most other compiler optimizations.
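
For instance, a sketch with a hypothetical function (the generated code can be inspected with cargo rustc --release -- --emit=asm):

// With inlining pinned off, the call cannot be folded into the caller,
// so the move of the Point argument stays visible in the generated
// code instead of being optimized away.
#[inline(never)]
fn consume_len(it: Point) -> usize {
    it.x.len()
}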


I write this with the assumption that the function is on a crate boundary.

Functions can be inlined even across a crate boundary if (sketched below):

  1. the function is generic;
  2. it has an #[inline] attribute attached; or
  3. LTO is enabled when compiling the binary, including ThinLTO.
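
A sketch of the first two cases, with hypothetical functions in a library crate (the third is a Cargo profile setting, e.g. lto = "thin" under [profile.release]):

// 1. Generic functions are monomorphized in the crate that uses them,
//    so their bodies are always available for inlining there.
pub fn first<T>(items: Vec<T>) -> Option<T> {
    items.into_iter().next()
}

// 2. #[inline] makes rustc ship the function body in the crate
//    metadata, so downstream crates can inline it without LTO.
#[inline]
pub fn double(x: u32) -> u32 {
    x * 2
}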

That doesn't seem like a good strategy to me. Compile time is finite, so the optimizer always has a limit on how much garbage it can remove and how much code it can inline. So the bigger your project gets, the higher the chance that some code will not be fully optimized.

I think ideally it should be a language feature, not an optimization, i.e. the copy should be elided even in debug builds with no optimizations enabled.

For example, some time ago I noticed that calling a boxed FnOnce closure always copies its contents from the heap to the stack before the call. Similar C++ code doesn't do any copies, even at -O0.
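
A minimal sketch of the kind of call being described:

fn call(f: Box<dyn FnOnce()>) {
    // Calling the boxed closure consumes it; per the issue linked
    // below, this compiled to a copy of the captures from the heap
    // to the stack before the actual call.
    f();
}

fn main() {
    let data = vec![0u8; 4096]; // a large capture makes the copy visible
    call(Box::new(move || drop(data)));
}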

https://github.com/rust-lang/rust/issues/61042


The current state is temporary: Rust is working on MIR optimizations that will remove such things before the code is handed to LLVM.


I don't think it's desirable to grow the language specification for a performance tweak when no actual runtime performance is gained. With optimizations we can achieve both a concise language spec and fast runtime performance, because an optimization doesn't change the observable behavior of the program except its performance.

But it's true that LLVM is notably slow at optimizing verbose code. That's why we're also working on MIR-level optimizations, as @kornel mentioned. MIR carries more information than LLVM IR, such as unmonomorphized generics and lifetimes, so optimizing at this stage may also improve compile time.


Would it be possible for you and @kornel to give a ballpark idea of when such optimizations can be stabilized?

Well, first someone has to invest the time/money to actually implement them. One part of the picture is doing copy elision on MIR, which I'm currently playing around with, but I'm just doing that in my free time, so I certainly won't give any estimates for that.

I know that there are also some ideas floating around for adjusting function ABIs to do copy elision for Result-returning functions, but I don't think work on that has started yet.

