(co-authored by @bugaevc and @YaLTeR)
Zero-cost abstraction is one of the major selling points of Rust, along with memory safety and others. In fact, it is listed first in the Rust features list on the rust-lang.org frontpage.
Zero-cost abstraction means abstraction without overhead. This post in the official Rust blog states that zero-cost abstraction is a core principle of Rust and cites the following definition given by Bjarne Stroustrup:
What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.
There are many things in Rust the language (such as null pointer optimization, traits, closures and iterators) and its ecosystem that do indeed provide pleasant and easy-to-use abstractions without imposing any overhead.
But this is, unfortunately, not always the case.
Here’s an example of a high-level abstraction that does have a hidden cost:
#[macro_use]
extern crate serde_json;

fn main() {
    let obj = json!({
        "hello": "world",
        "nums": [42, 35]
    });
    serde_json::to_writer(std::io::stdout(), &obj)
        .expect("Failed to serialize json into stdout");
}
If the abstractions provided by json and serde were indeed zero-cost, we would expect this code to compile in release mode (with LTO additionally enabled) to something very similar to what this code compiles to:
use std::io::{stdout, Write};

fn main() {
    let message = b"{\"hello\":\"world\",\"nums\":[42,35]}";
    stdout().write_all(message)
        .expect("Failed to serialize json into stdout");
}
Furthermore, we would expect a simple
print!("a");
to have no additional cost over
let res = unsafe { libc::write(1, b"a".as_ptr() as *const _, 1) };
if res < 0 {
    panic!("some error message");
}
None of this is the case. In fact, serde_json performs heap allocations, and accessing stdout has to go through an Arc&lt;ReentrantMutex&lt;RefCell&lt;LineWriter&lt;Maybe&lt;StdoutRaw&gt;&gt;&gt;&gt;&gt;, which is not optimized away even if we only output one line and never create a second thread.
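The locking cost can at least be amortized by hand: Stdout::lock() takes the reentrant mutex once, so subsequent writes through the locked handle skip the per-call acquisition. A minimal sketch (the write_lines helper is ours, not from std):

```rust
use std::io::{self, Write};

// Write a few lines through a single locked handle. Taking the lock
// once amortizes the mutex cost across all writes, instead of
// re-acquiring it for every print! call.
fn write_lines(mut out: impl Write) -> io::Result<()> {
    for i in 0..3 {
        writeln!(out, "line {}", i)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    write_lines(stdout.lock())
}
```

This is exactly the kind of hand-coding the zero-cost principle says we shouldn't need: the single-threaded case ought to cost the same without the explicit lock.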
As another example, something like
&format!("{} {} {} {} {}", 1, 2, 3, 4, 5)
keeps the whole formatting code intact to be run at runtime.
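To make the gap concrete, here is a sketch contrasting the two versions (function names are ours, for illustration): every argument is a literal, so in principle the compiler has all the information needed to produce the string at compile time, yet the format! version still dispatches through the runtime formatting machinery.

```rust
fn formatted() -> String {
    // All arguments are literals, yet the full formatting machinery
    // (fmt::Arguments construction, Display dispatch, a heap-allocated
    // String) runs at runtime.
    format!("{} {} {} {} {}", 1, 2, 3, 4, 5)
}

fn hand_coded() -> &'static str {
    // What a truly zero-cost version could compile down to:
    // a static string, no allocation, no formatting at runtime.
    "1 2 3 4 5"
}

fn main() {
    assert_eq!(formatted(), hand_coded());
    println!("{}", formatted());
}
```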
And these are just a few examples of how frequently Rust abstractions fail to be zero-cost. We do pay for what we don't use (line buffering, runtime formatting or JSON encoding, concurrent stdout access), and we could easily hand-code it to be better (not that we would want to, because it leads to uglier code).
It seems that in most cases, these optimizations are prevented by the reluctance of Rust (or LLVM as its backend) to inline function calls across crate boundaries. Notably, many functions in Rust's standard library (for example, iterator adapters) are marked #[inline], and that seems to be the reason why they are zero-cost abstractions.
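As a sketch of why this matters (the helper below is ours, not from std): #[inline] makes a function's body available to other crates for inlining, which is what lets an adapter chain like the one below collapse into a simple loop rather than a series of opaque calls.

```rust
// Hypothetical iterator-adapter-style helper. Without #[inline],
// a caller in another (non-LTO) crate would see only an opaque
// function call; with it, the map/sum chain can be inlined and
// optimized down to a plain loop.
#[inline]
fn double_then_sum(xs: &[u32]) -> u32 {
    xs.iter().map(|x| x * 2).sum()
}

fn main() {
    println!("{}", double_then_sum(&[1, 2, 3]));
}
```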
Other functions like printing could require a threading analysis by the compiler to determine if a mutex or a TLS variable is only used by a single thread.
This leads us to the following questions:
• What are the reasons for a compiler not to inline a function that is only called once? This neither increases the code size nor negatively impacts caching.
• Are these missed optimizations considered an issue for Rust? (They should be, since it’s a violation of its core principle.) Should they be documented?
• Is anybody planning to do anything about it?
• Why isn’t LTO enabled by default in release builds?
• When is it a good idea to manually mark methods as #[inline]?