... unless the optimizer collapses some loop into `counter += N`. Improbable, but not impossible.
Relevant reference.
Another advantage that Rust's smart pointers have is that they follow the Rust ABI, which allows them to be passed like other normal pointers. C++'s smart pointers are notably not zero cost due to being considered like structs and always passed on the stack instead of registers.
Do you have any example showing that?
Sure, see for example Compiler Explorer. In the Rust code, passing a `Box` to an opaque function is a no-op, but passing a `unique_ptr` in C++ involves a whole bunch of operations. Among other things (e.g. the code for the destructor, needed because `baz` might not destroy the `unique_ptr`), you can also notice the write to and subsequent read from the address in `rsp` in order to pass the `unique_ptr` to `bar` (and also the read of the value from the address in `rdi`, which is where `foo`'s caller put the `unique_ptr`).
There is also a kinda famous talk that goes in depth on this problem if you're interested https://www.youtube.com/watch?v=rHIkrotSwcc&t=1049s
This is fascinating; I believe I've even seen that video before, but only this time did I realize the important detail about C++ here that I had never known before:
When you have by-value arguments to functions, the destruction is handled by the caller!
I was already aware that C++'s version of "move" operations works through move constructors, which don't fully get rid of the original, moved-from object, but instead are just … let's say … encouraged to rob the original object of all of its resources (especially ownership of memory).
Of course, Rust can't really model normal "constructors" accurately at all, because there is always a move in Rust, never in-place construction; but ignoring this detail, C++ move seems somewhat comparable to `mem::take`: the original object stays in place, but is left in some kind of cheap dummy state. (Of course `mem::take` is different in that it's generally a very clearly defined state, whereas a moved-from state in C++ often only promises that some "valid but unspecified state" is left behind.)
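As a small illustration of the `mem::take` comparison, here is a minimal sketch (the helper name `take_and_inspect` is made up for this example): the value is moved out, and the original location is left holding the type's `Default` value, which for a `Vec` is a well-defined empty vector.

```rust
use std::mem;

/// Hypothetical helper: moves the contents out of `slot` and reports
/// whether the slot was left in its (empty) default state.
fn take_and_inspect(slot: &mut Vec<i32>) -> (Vec<i32>, bool) {
    // After this line, `slot` holds `Vec::default()`, i.e. an empty Vec.
    let moved = mem::take(slot);
    (moved, slot.is_empty())
}

fn main() {
    let mut original = vec![1, 2, 3];
    let (moved, left_empty) = take_and_inspect(&mut original);
    println!("moved = {:?}, original left empty = {}", moved, left_empty);
}
```

Unlike a C++ moved-from object, the state left behind here is exactly specified by the `Default` implementation.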
This of course already has some downsides; e.g. all objects that you want to move need some additional sort of `null`-like state. Anyways… with that knowledge, I've always kind of assumed that
    fn pass_on_the_box(x: Box<Foo>) {
        other_function(x);
    }
when "translated" to C++ would become more like the (moral) equivalent of
    fn pass_on_the_box(mut x: Option<Box<Foo>>) {
        other_function(mem::take(&mut x));
    }
Now our things have become nullable, and `x` will also still be dropped at the end of the function (then containing a `None` value).
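To make that model concrete, here is a runnable sketch. The `Foo` type with a drop counter, the body of `other_function`, and the `run_model` helper are all assumptions added for illustration; only the shape of `pass_on_the_box` comes from the snippet above.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

/// Hypothetical payload that counts how often its destructor runs.
struct Foo {
    drops: Rc<Cell<u32>>,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Stand-in for the opaque callee: takes ownership and destroys the box.
fn other_function(x: Option<Box<Foo>>) {
    drop(x);
}

/// The "C++-like" model: hand over the contents, keep a `None` husk behind.
/// Returns whether the husk was `None` after the call.
fn pass_on_the_box(mut x: Option<Box<Foo>>) -> bool {
    other_function(mem::take(&mut x));
    let husk_is_none = x.is_none();
    // `x` is still dropped at the end of this function, but dropping `None` does nothing.
    husk_is_none
}

/// Runs the model once: (husk was None, number of `Foo` destructor calls).
fn run_model() -> (bool, u32) {
    let drops = Rc::new(Cell::new(0));
    let foo = Foo { drops: Rc::clone(&drops) };
    let husk_was_none = pass_on_the_box(Some(Box::new(foo)));
    (husk_was_none, drops.get())
}

fn main() {
    let (husk_was_none, drops) = run_model();
    println!("husk was None: {}, Foo destructor ran {} time(s)", husk_was_none, drops);
}
```

The destructor of `Foo` runs exactly once, inside `other_function`; the leftover `None` only costs the check that it is in fact `None`.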
Really, I should have thought about this longer: the destructor argument makes little sense! Without knowing anything about `other_function`, as long as the type (`Option<Box<Foo>>`) is known, the compiler can optimize this code; after inlining the "move constructor" (`mem::take`) and the destructor (drop glue of `Option<Box<Foo>>`), it should easily be able to spot that `x` will be `None` after `mem::take`, so dropping it is a no-op. Rust already has similar behavior anyway: with so-called "drop flags", all variables technically have `Option`-like properties: an additional flag[1] that tracks initialization status, and a conditional destructor call at the end of their scope[2].
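The drop-flag behavior can be observed from safe code. A minimal sketch, with a made-up `Noisy` type and `maybe_move` function: the variable is moved away on only one branch, so whether it is still initialized at the end of its scope is known only at run time, and the destructor still runs exactly once either way.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical type that counts destructor calls.
struct Noisy {
    drops: Rc<Cell<u32>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Conditionally moves `value` away and returns how many times the
/// destructor ran by the time the scope has ended.
fn maybe_move(move_it: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let value = Noisy { drops: Rc::clone(&drops) };
        if move_it {
            drop(value); // moved out here, destructor runs now
        }
        // End of scope: `value` is dropped only if it was not moved above.
    }
    drops.get()
}

fn main() {
    println!("moved: {} drop(s)", maybe_move(true));
    println!("not moved: {} drop(s)", maybe_move(false));
}
```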
But alas, it all makes much more sense now! The issue is: In order to let the caller handle the destruction (instead of the callee), C++ pretty much does the (moral) equivalent of
    fn pass_on_the_box(x: &mut Option<Box<Foo>>) {
        other_function(&mut mem::take(x));
    }
when calling with by-value arguments. And this also finally makes the produced assembly very comparable! (Essentially identical, actually.) (I removed the `noexcept` from @SkiFire13's example, because this reproduction in Rust handles the unwinding case, too. The version with `noexcept` could be compared with the Rust version compiled with `-C panic=abort`.)
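Here is a runnable sketch of that caller-destroys model. The event log, the `consume` flag, and the body of `other_function` are assumptions added so the order of events becomes visible; they are not part of the original example.

```rust
use std::cell::RefCell;
use std::mem;
use std::rc::Rc;

// Shared event log, purely for illustration.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical payload whose destructor records when it runs.
struct Foo {
    log: Log,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.log.borrow_mut().push("Foo destroyed");
    }
}

/// Stand-in for the opaque callee. It may or may not take the value out of the slot.
fn other_function(x: &mut Option<Box<Foo>>, consume: bool, log: &Log) {
    if consume {
        drop(x.take()); // callee takes ownership and destroys the value itself
    }
    log.borrow_mut().push("callee returns");
}

/// Model of the by-value call: move into a fresh temporary slot, pass its
/// address, and let the caller destroy whatever is left in the slot.
fn pass_on_the_box(x: &mut Option<Box<Foo>>, consume: bool, log: &Log) {
    other_function(&mut mem::take(x), consume, log);
    // The temporary slot is dropped at the end of the statement above.
}

/// Runs the model once and returns the recorded events.
fn run_model(consume: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut slot = Some(Box::new(Foo { log: Rc::clone(&log) }));
    pass_on_the_box(&mut slot, consume, &log);
    log.borrow_mut().push("back in caller");
    drop(slot); // `None` by now, so this destroys nothing
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("callee consumes: {:?}", run_model(true));
    println!("callee ignores:  {:?}", run_model(false));
}
```

When the callee leaves the value in the slot, the destructor runs in `pass_on_the_box` after the callee has returned, which is why the caller has to carry the destructor code even if it never ends up destroying anything.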
Still, I'm having a hard time figuring out any of the benefits of this approach.
I have found out a few things already, such as
- temporaries are dropped at the end of the full expression, and by-value arguments are somehow also temporaries? Not sure how much of this is prescribed in the standard, so complexity w.r.t. temporaries and/or standard compliance might be issues
- changing it now is clearly ABI-breaking, and…
- …alternatively, introducing only the option of callee-destruction for certain types and/or arguments would have surprising effects
Don't feel bad about it. No single human understands all of how C++ works or how its parts interact in strange ways. Not even Bjarne Stroustrup.
I pray Rust will not endlessly accrete complexity like that into the future.
I believe I learned that from the same video and then promptly forgot it again. Hopefully your memory survives longer than mine!
Editions help, but this is getting off-topic.
[this used to be a footnote, but it's a bit long for Discourse's rendering style]
With editions, we can remove much complexity from the language, because backwards-compatibility concerns are much less limiting. Of course it doesn't help in all cases, but it does in many.
One good example could be the current work on match ergonomics. The end goal is to make match ergonomics more intuitive and simple. Match ergonomics themselves were an addition to the original story of Rust patterns (where you would need to handle all references with `&` or `&mut` patterns and then decorate variables with `ref` or `ref mut` binding modes).
Then Rust got a fully backwards-compatible update, "match ergonomics", that allows you to leave out those `&Struct { field: ref x, .. }` annotations and match `s: &Struct` directly against a `Struct { field: x, .. }` style pattern.
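The two pattern styles side by side, as a minimal sketch (the `Struct` definition and both function names are made up for illustration). Both functions borrow the same field; only the amount of annotation differs.

```rust
/// Hypothetical struct used only for this illustration.
struct Struct {
    field: String,
    other: u32,
}

/// Original style: spell out the reference pattern and the `ref` binding mode.
fn explicit(s: &Struct) -> &String {
    match s {
        &Struct { field: ref x, .. } => x,
    }
}

/// With match ergonomics: the `&` and the `ref` are inferred.
fn ergonomic(s: &Struct) -> &String {
    match s {
        Struct { field: x, .. } => x,
    }
}

fn main() {
    let s = Struct { field: String::from("hello"), other: 7 };
    println!("{} {} {}", explicit(&s), ergonomic(&s), s.other);
}
```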
Turns out that design wasn't perfect and has multiple flaws; some of them were perhaps preventable, but much of the issue is the pretty high complexity of the exact rules, and especially the often surprising, subtle consequences. The system works by implicitly tracking a property throughout all patterns, which is called the "default binding mode" (compare the Reference and the relevant RFC).
Now, the 2024 edition allows us a redesign to a simpler mechanism … well, maybe let's wait with that determination until the exact design has been chosen … but the point is to make the behavior more intuitive, especially in many (semi-)corner cases. (I'm personally a fan of Nadrieril's ideas/(unfinalized?) proposal in this context. I think there is a lot of value in finding an approach that "essentially" avoids the notion of "binding modes" entirely.)
[I personally wouldn't be surprised if, long-term, we can get rid of `ref mut` and `ref` patterns entirely. Just give more powers to references… if they could infer and track borrows, and perhaps a notion of a "`&move`"-reference.]
Another example is the 2024-edition changes to temporary scopes. It's a change that might be a little bit breaking even despite edition support[1], but even when it's not always completely smooth and automatic to migrate your code, most importantly all existing library code keeps working and can be imported without issues! And as long as this is ensured, even fairly fundamental changes can be made to Rust, especially (as is the case here too) if they serve to remove complexity from certain language rules, by working either towards simpler rules, or at least towards ones with a more intuitive effect.
(Last, but not least: the safety of Rust of course also means that in many areas, complexity is much less bad! The problem in C++ is that you, the programmer, are supposed to understand it all: how long each object lives; what steps are necessary to ensure thread-safety; where the 10s or 100s of completely unnecessary extra ways are in which you can achieve UB, like `i++ + i++` or signed integer overflow; or something like 3? 6? 10? different ways one could initialize a variable, with insanely arbitrary rules & interactions, especially w.r.t. the effect of zero-initialization vs. uninitialized.)
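For contrast on the signed-overflow point, a small sketch of the Rust side (the two wrapper functions are made up for this example; `checked_add` and `wrapping_add` are the standard integer methods). Overflow is never undefined behavior here: you pick the behavior you want explicitly.

```rust
/// Returns `None` instead of overflowing.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Wraps around in two's complement on overflow.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

fn main() {
    println!("{:?}", add_checked(i32::MAX, 1));
    println!("{}", add_wrapping(i32::MAX, 1));
}
```

A plain `+` on overflow panics in debug builds and wraps in release builds by default, so it is still defined behavior in both cases.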
[1] Macros are always a bit hard with editions, and in this case the migration isn't perfect; certain code can't be directly represented in edition-2024 code at all (yet?).
So very true. But since this is something I had to deal with very intimately at some point… I can add some clarifications.
And, as usual with C++, you have missed some peculiar, but important details.
Which standard are you talking about? There are a few relevant ones here.
Indeed.
This is something that's called, in the relevant standard, non-trivial for the purposes of calls, and it means precisely what it says on the tin.
Since you have used a compiler that follows that standard, and since having a destructor disqualifies a type from being passed by value… you are observing what you are observing.
But if your compiler doesn't follow that standard (e.g. MSVC doesn't follow it), then the object would be passed by value and it would be destroyed by the callee, not the caller.
Yes. And that's why compilers that developed their ABI before C++11 (and thus before `std::move`) couldn't change their behavior: before rvalue references and `std::move` there was really no way to move an object into a function; the most the language could do was copy it. And that's also what MSVC does if your type doesn't have a move constructor. Note that if you do have a move constructor, then it's not even used! But the mere presence of a move constructor gives the compiler the option to move the object and call the destructor in the callee.
Sadly, the Itanium C++ ABI wasn't altered in time, and compilers that follow it (meaning all UNIX systems, in practice) are stuck with this odd and peculiar behavior.
What surprising effects are you talking about? The option does exist in clang, but it's opt-in, because it breaks compatibility.
Since MSVC is a popular enough compiler, most programs work just fine in that mode, too.
Ah, I wonder if somebody can tell me why there is an "Itanium C++ ABI"?
Seems very odd to me given that Itanium does not exist and almost never existed. Neither I nor anybody I know has ever seen one.
Also, I thought ABIs depended on what registers were available in processors. What does it mean to have an Itanium C++ ABI for x86, ARM, RISC-V, whatever? For C++ or any other language?
Could you? My understanding is that what happened to C++ is pretty much inevitable, and editions only help the language users (because you can say that "crazy behavior" stays in the past and introduce "better modern behavior"), and then only when they don't have to deal with crates compiled for old Rust editions.
Actual simplification of the language may only happen down the road, when some editions aren't just deprecated but fully removed.
Are there even plans to do that?
Ah, that was an observation stated in the FAQs of that video above, if I recall correctly. I can't personally judge whether it's really that surprising, but essentially the effects can be that function arguments are destroyed in a weird order. (I'm not personally familiar with any canonical examples of C++ code where that order matters, so I can't judge.)
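To get a feel for how destruction order can differ between the two schemes, here is a sketch in Rust. It is an analogy, not C++ semantics: owned parameters play the role of callee-destroyed arguments, and borrowed temporaries play the role of caller-destroyed ones. All names (`Arg`, `by_value`, `by_ref`, and so on) are made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log, purely for illustration.
type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical argument type that records its own destruction.
struct Arg {
    name: &'static str,
    log: Log,
}

impl Drop for Arg {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("destroy {}", self.name));
    }
}

fn arg(name: &'static str, log: &Log) -> Arg {
    Arg { name, log: Rc::clone(log) }
}

/// Callee-destroys: the callee owns the argument, so it dies when the callee returns.
fn by_value(a: Arg) -> u32 {
    a.log.borrow_mut().push(format!("use {}", a.name));
    1
}

/// Caller-destroys: the callee only borrows; the caller's temporary lives
/// until the end of the enclosing statement.
fn by_ref(a: &Arg) -> u32 {
    a.log.borrow_mut().push(format!("use {}", a.name));
    1
}

fn callee_destroys() -> Vec<String> {
    let log = Log::default();
    let _sum = by_value(arg("a", &log)) + by_value(arg("b", &log));
    let events = log.borrow().clone();
    events
}

fn caller_destroys() -> Vec<String> {
    let log = Log::default();
    let _sum = by_ref(&arg("a", &log)) + by_ref(&arg("b", &log));
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("callee destroys: {:?}", callee_destroys());
    println!("caller destroys: {:?}", caller_destroys());
}
```

With callee destruction each argument dies right after its own call; with caller destruction both temporaries survive until the whole statement is done and then die in reverse order of creation. Code that relies on one of these orders would notice a switch to the other.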
Ah, I guess "complexity" is too vague. I understood @ZiCog's reply about "endlessly accrete complexity" to mainly target complexity from the point of view of a language user. Though I can't be sure.[1]
Compilers are complex anyways…[2] and you are of course correct that editions can't remove any complexity from the language as a whole.[3] I am not aware of any deprecation plans for old editions. So far, I'm not aware of any concerns at all that they might be too much effort to maintain long-term.[4]
A new syntax design, like [the parsing ambiguity fix & stronger initialization guarantees that came with] braces for constructors, or something like `-> …`-style return types on functions, wouldn't be added as alternatives, but would eventually replace the original syntax. For example, `dyn Trait` didn't exist before 2018; but then with the 2021 edition, the old trait object syntax is completely "removed".
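For readers who haven't seen both spellings: a minimal sketch of the current trait object syntax (the `Shape` trait, `Square` type, and `total_area` function are made up for this example). In the older syntax the parameter type would have been written `&[Box<Shape>]`, without `dyn`, which the 2021 edition rejects.

```rust
/// Hypothetical trait used only for this illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// `dyn Shape` marks the trait object explicitly.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Square(3.0))];
    println!("total area: {}", total_area(&shapes));
}
```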
[1] And to be fair, it appeared in the context of destructors of by-value arguments; and that detail probably isn't super relevant for most users (beyond the slight negative performance issues).
[2] And many rules of programming languages aren't actually that complex; write them down properly and it shouldn't really overwhelm a compiler author.
[3] And indeed the old editions are still fully "part of Rust"; it's just that the normal user who might still have some Rust-2015 code in a dependency shouldn't have to worry about Rust-2015 any more than about the C language for any of the C libraries that are being linked by his or her dependencies.
[4] Not surprising IMO, given that compilers are by design machines that translate (in multiple steps even) a feature-rich surface language into simpler & more uniform internal representations; editions fit right into this framework.
For C++ or any other language?
I can only answer about C++, because any other language would have to decide for itself how and why it would adopt it.
The Itanium C++ ABI is designed for C++, after all.
Seems very odd to me given that Itanium does not exist and almost never existed. Neither I nor anybody I know has ever seen one.
You are suffering from hindsight. We know, in the year 2024, that Itanium was a huge flop that led nowhere. But in 1999 (and the Itanium ABI was developed in 1999, right after the ISO C++98 standard was published; scroll down the page to the revision history)?
It was the new hotness that was supposed to replace, literally, everything: Alpha, PA-RISC, PowerPC, SPARC, and x86, too! Heck, Windows XP 64-Bit Edition only works on Itanium; it doesn't work on `x86-64`. Sure, a few years later, in 2005, Windows XP Professional x64 Edition would be released, but for a few years Opteron owners had to use Linux if they wanted a 64-bit OS, because 64-bit Windows was developed exclusively for Itanium and Microsoft was still waiting for the death of everything else!
Lots of things that we use today were developed exclusively for Itanium, in the beginning: EFI and GPT, among other things.
And since most Unix vendors planned to replace their CPUs with Itanium… they developed the Itanium C++ ABI (C++ had just got its first standard published, but it didn't include an ABI).
But they didn't start in a vacuum, of course.
Also, I thought ABIs depended on what registers were available in processors.
Sure, System V ABI supplements do that, but they only describe how the C ABI works, because, well, in 1983 when System V arrived, C++ kinda-sorta didn't exist yet.
What does it mean to have an Itanium C++ ABI for x86, ARM, RISC-V, whatever?
There is no "Itanium C++ ABI for x86". The "Itanium C++ ABI" is a supplement to the "System V ABI"… developed specifically for Itanium, but it delegates all the gory details of how arguments are placed in registers to the System V ABI - IA-64 Architecture Processor Supplement, or to the System V ABI - Intel386 Architecture Processor Supplement, or to the x86-64 psABI…
Ah, I wonder if somebody can tell me why there is an "Itanium C++ ABI"?
Because it's a C++ ABI developed for Itanium-based systems by a consortium of companies that were prepared to switch from their own proprietary architectures to Itanium… why wouldn't they call it the "Itanium C++ ABI"?
Itanium was supposed to supplant everything… in that future (that never happened) nobody would ever wonder why the official C++ ABI is called the "Itanium C++ ABI": it's the ABI for the only surviving CPU, why would it be called anything else?
Alas, history went in the other direction, and now people are puzzled and surprised by that name… but it was pretty much the obvious name for what it was when it was developed.
I can't personally judge whether it's really that surprising, but essentially the effects can be that function arguments are destroyed in a weird order.
It's not really that surprising (after all, MSVC does that already); it's just that the time to make that change was when rvalue references and move constructors were added: since pre-C++11 classes couldn't have move constructors, any type that declares them should be ready to deal with the new order of destructor calls.
But because that opportunity was missed and C++ doesn't want to break backward compatibility… yeah, that's precisely the type of change that Rust editions address.
Very, very close to how Rust changed the drop order to support let chains.