Inherently inefficient calling convention in Rust?

Because I want the callee to use the object in full and drop it.
I also saw some examples where moving was used extensively to implement type states, but if I had to mark all the functions #[inline], wouldn't that ruin incremental compilation of a multi-crate project?

If you mean that this is the same as {*foo}.x (a move out of the box, then a partial move out of a temporary struct), know that:

struct Point {
    x: Vec<()>, // glorified non-`Copy` usize :P
    y: Vec<()>,
    z: Vec<()>,
}

fn check(it: Box<Point>) {
    let _x = (*it).x;
    let _y = (*it).y;
}

compiles fine, meaning there is indeed a partial move out of the boxed struct.
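
For contrast, a minimal sketch of why Box is special here: the compiler only lets you move out of a dereferenced Box, so the same partial move behind an ordinary reference is rejected, and only borrowing the field compiles:

fn check_ref(it: &Point) {
    // let _x = (*it).x; // error[E0507]: cannot move out of data
    //                   // behind a shared reference
    let _x = &(*it).x; // borrowing the field instead is fine
}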


Moving that just reuses the address and has to ensure the original hasn't been changed or moved sounds like… borrowing!

Semi-serious: could Rust run the actual borrow checker as an optimization pass on moves, and turn moves into references where it can see that it's safe?


OK, maybe I asked the question the wrong way. My thought was that moving in Rust is a zero-overhead abstraction, which can, for example, be used this way: Typestate Programming - The Embedded Rust Book
It seems I was wrong :(
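
For reference, here is a minimal sketch of that typestate pattern (the state names are illustrative, not taken from the book): every transition takes self by value, so the old state is moved away and can never be reused, which is enforced at compile time and is ideally free at runtime:

struct Idle;
struct Running;

impl Idle {
    // Consuming `self` makes the old `Idle` handle unusable afterwards.
    fn start(self) -> Running {
        Running
    }
}

impl Running {
    fn stop(self) -> Idle {
        Idle
    }
}

fn main() {
    let machine = Idle;
    let machine = machine.start();
    let _machine = machine.stop();
    // machine.stop(); // would not compile: `machine` was already moved
}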

It is zero-overhead after LLVM optimizes it down to zero.

Everything the Rust compiler emits has a ton of overhead and garbage in debug mode, and in release mode LLVM is given the task of cleaning all of that up down to zero.


The OP example was compiled with optimizations enabled.
All that bothers me is that inlining seems to be critical for such simple optimizations. This basically means that big Rust programs that want to be efficient are doomed to compile eternally, since no incremental compilation is possible.

It seems worth restating with emphasis that you can already make Rust do pass-by-reference just by putting &s in your function signatures. Big programs are certainly not "doomed to compile eternally", except in the trivial sense that we're all human and "doomed" to make some mistakes we won't know to fix until we run a profiler, and optimizers will always have to choose whether the runtime benefit of an optimization outweighs the compile-time cost.
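
Concretely, a sketch reusing the Point type from the example above (inspect and consume are hypothetical names): the reference-taking version is guaranteed pass-by-reference at any optimization level, while eliding the copy in the by-value version is left to the optimizer:

// Pass by reference: no move of Point's three Vec headers,
// regardless of optimization level.
fn inspect(it: &Point) -> usize {
    it.x.len() + it.y.len() + it.z.len()
}

// Pass by value: semantically a move; whether the bytes are actually
// copied is up to the optimizer (and often to inlining).
fn consume(it: Point) -> usize {
    it.x.len() + it.y.len() + it.z.len()
}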

In terms of improving compile time for big projects, this is among the lowest-hanging fruit, so I'm sure every project that cares consciously did this ages ago where it matters. From what I've read, such projects typically spend much more time worrying about things like how many dependencies they have, what macros they're using, monomorphization of generics, and how to break up large bottleneck crates into smaller crates that can be built in parallel.


#[inline(never)] disables inlining, the basic transformation that enables most other compiler optimizations.
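
For instance, a sketch with a hypothetical function (the generated code can be inspected with cargo rustc --release -- --emit=asm):

// With inlining pinned off, the call cannot be folded into the caller,
// so the move of the Point argument stays visible in the generated
// code instead of being optimized away.
#[inline(never)]
fn consume_len(it: Point) -> usize {
    it.x.len()
}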


I write this with the assumption that the function is on a crate boundary.

Functions can be inlined even across a crate boundary if (sketched below):

  1. the function is generic;
  2. it has an #[inline] attribute attached; or
  3. LTO is enabled when compiling the binary, including ThinLTO.
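
A sketch of the first two cases, with hypothetical functions in a library crate (the third is a Cargo profile setting, e.g. lto = "thin" under [profile.release]):

// 1. Generic functions are monomorphized in the crate that uses them,
//    so their bodies are always available for inlining there.
pub fn first<T>(items: Vec<T>) -> Option<T> {
    items.into_iter().next()
}

// 2. #[inline] makes rustc ship the function body in the crate
//    metadata, so downstream crates can inline it without LTO.
#[inline]
pub fn double(x: u32) -> u32 {
    x * 2
}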

That doesn't seem like a good strategy to me. Compile time is finite, so the optimizer always has a limit on how much garbage it can remove and how much code it can inline. So the bigger your project gets, the higher the chance that some code will not be fully optimized.

I think ideally it should be a language feature, not an optimization, i.e. the copy should be elided even in debug builds with no optimizations enabled.

For example, some time ago I noticed that calling a boxed FnOnce closure always copies its contents from the heap to the stack before the call. Similar C++ code doesn't do any copies, even at -O0.
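
A minimal sketch of the kind of call being described:

fn call(f: Box<dyn FnOnce()>) {
    // Calling the boxed closure consumes it; per the issue linked
    // below, this compiled to a copy of the captures from the heap
    // to the stack before the actual call.
    f();
}

fn main() {
    let data = vec![0u8; 4096]; // a large capture makes the copy visible
    call(Box::new(move || drop(data)));
}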

https://github.com/rust-lang/rust/issues/61042


The current state is temporary: Rust is working on MIR optimizations that will remove such things before the code is handed to LLVM.


I don't think it's desirable to grow the language specification for a performance tweak when no actual runtime performance is gained. With optimizations we can achieve both a concise language spec and fast runtime performance, because an optimization doesn't change the observable behavior of the program except its performance.

But it's true that LLVM is notably slow at optimizing verbose code. That's why we're also working on MIR-level optimizations, as @kornel mentioned. MIR carries more information than LLVM IR, such as unmonomorphized generics and lifetimes, so optimizing at this stage may also improve compile time.


Would it be possible for you and @kornel to give a ballpark idea of when such optimizations can be stabilized?

Well, first someone has to invest the time/money to actually implement them. One part of the picture is doing copy elision on MIR, which I'm currently playing around with, but I'm just doing that in my free time, so I certainly won't give any estimates for that.

I know that there are also some ideas floating around for adjusting function ABIs to do copy elision for Result-returning functions, but I don't think work on that has started yet.

