Efficiency of `println!` and `format!`

format! allocates a new heap(?) String for us, which can then be passed on to println!:

let thing: String = format!("I have {} cats.", 6);
println!("My message is: {}", thing);

But of course we could have just written:

println!("My message is: I have {} cats.", 6);

Is the former necessarily less efficient than the latter?
Let's add a new element, then:

fn declaration(msg: String) {
  println!("My message is: {}", msg);
}

and who knows where that external String actually came from.
Will Rust/LLVM see what's happening in either case, and optimize down to something nice? I suppose I feel this way every time I use a String or &str as an argument to format! or println!. Are we punished for the "String embedding"?

Almost certainly yes, there is a cost. The formatting infrastructure is an optimization barrier, so it is unlikely that the compiler will be able to see through the intermediate String allocation.

Note that you can also use format_args! to create an intermediate object without allocations, if the circumstances allow this.
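
To make the format_args! suggestion concrete, here is a minimal sketch. The declaration function is adapted from the question, but the writer parameter is my own addition so the output can be checked; treat the exact signature as an assumption, not a recommendation.

```rust
use std::fmt;
use std::io::{self, Write};

// Hypothetical variant of `declaration` from the question: it takes
// pre-captured formatting arguments instead of an owned String.
// The generic writer parameter is an assumption made for illustration.
fn declaration<W: Write>(out: &mut W, msg: fmt::Arguments<'_>) -> io::Result<()> {
    // `fmt::Arguments` implements Display, so it can be embedded directly.
    writeln!(out, "My message is: {}", msg)
}

fn main() -> io::Result<()> {
    // No intermediate String is built for the inner message here.
    declaration(&mut io::stdout(), format_args!("I have {} cats.", 6))
}
```

The catch is that the value produced by format_args! borrows its arguments, so in practice it has to be consumed in the same expression, as it is here.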

In general, the formatting infrastructure is not very performant, so if it shows up in profiles, I suggest making use of String::with_capacity and String::push_str to avoid it. To format integers/floats, the itoa and dtoa crates provide much faster alternatives than the default formatting methods. ufmt is also something to look at if you need more performance.
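
As a rough sketch of the with_capacity/push_str approach: the cat_message helper below is a made-up name, and it assumes the number has already been turned into text (for example by something like the itoa crate mentioned above, which is not used here).

```rust
use std::io::{self, Write};

// Hypothetical helper: builds the message with a single up-front
// allocation and no use of the formatting machinery.
fn cat_message(count_text: &str) -> String {
    const PREFIX: &str = "My message is: I have ";
    const SUFFIX: &str = " cats.";
    // Reserve exactly the space the three pieces need.
    let mut s = String::with_capacity(PREFIX.len() + count_text.len() + SUFFIX.len());
    s.push_str(PREFIX);
    s.push_str(count_text);
    s.push_str(SUFFIX);
    s
}

fn main() -> io::Result<()> {
    let mut msg = cat_message("6");
    msg.push('\n');
    // Write the bytes directly instead of going through println!.
    io::stdout().write_all(msg.as_bytes())
}
```

Whether this is worth the loss in readability really depends on what the profile says.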

Something to understand is that format!() allocates a new String and writes to it, while println!() writes each piece of the formatted message to std::io::stdout() directly. So I don't believe println!() makes any allocations of its own (unless a type's Display impl allocates internally).

Both methods go through Rust's formatting machinery, which is primarily designed to be flexible and easy to use, making use of things that don't tend to optimise well (e.g. dynamic dispatch).

That said, I don't think printing to stdout using the builtin print statement is the fastest way to generate output in any language. You've got things like the OS's native line buffering, locking (to make sure two threads don't generate interleaved output), no buffering on the program side (syscalls are often more expensive than writing to a temporary buffer and printing in bulk) etc.
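
One common way to reduce the locking and syscall overhead is to lock stdout once and put a buffer in front of it. A minimal sketch, where write_lines is a hypothetical helper made generic over the writer so it can be exercised without a terminal:

```rust
use std::io::{self, BufWriter, Write};

// Hypothetical helper: writes `n` lines through a BufWriter so the
// underlying writer sees a few large writes instead of many small ones.
fn write_lines<W: Write>(out: W, n: u32) -> io::Result<()> {
    let mut out = BufWriter::new(out);
    for i in 0..n {
        writeln!(out, "line {}", i)?;
    }
    // Flush explicitly so that write errors are not silently dropped.
    out.flush()
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Taking the lock once avoids re-acquiring it for every line.
    write_lines(stdout.lock(), 3)
}
```

How much this helps will depend on the platform and on how much output is produced, so it is worth measuring.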

It seems the first call to println! does actually cause allocations. However, subsequent calls don't. So probably println! itself doesn't allocate, but maybe some subsystem does when initializing.

This code shows 3 allocs.

stdout() is buffered. eprintln! shows 0 allocations.

Thanks for pointing that out! Indeed, calling stdout() shows 3 allocations. And if stdout() is called one time before tracking allocs, then the println! shows 0.
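
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a counting allocator. It is not the code referred to above; CountingAlloc, ALLOCATIONS and allocations() are names made up for illustration, and the numbers you see may differ between std versions and platforms.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical allocator wrapper: forwards to the system allocator
// and counts every allocation request.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

// Number of allocation requests seen so far.
fn allocations() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocations();
    println!("first println!");
    let after_first = allocations();
    println!("second println!");
    let after_second = allocations();

    // Report on stderr so the report itself does not go through stdout.
    eprintln!(
        "first call: {} allocations, second call: {} allocations",
        after_first - before,
        after_second - after_first
    );
}
```

Calling std::io::stdout() once before taking the first reading should separate the one-time setup cost from the cost of println! itself.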

I recently discovered that writing directly to a File (as an instance of Write) is very inefficient, and that wrapping it further in a BufWriter can vastly improve performance. In my own case I saw a 20x speed up. When I profiled both approaches, the former was making many many calls to C-based file writing, while the latter wasn't.
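
A minimal sketch of that change, assuming a hypothetical write_numbers helper and a throwaway file in the temp directory (neither is from my actual project):

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

// Hypothetical helper: writes one number per line to `path`.
fn write_numbers(path: &Path, n: u32) -> io::Result<()> {
    let file = File::create(path)?;
    // Without the BufWriter, each writeln! would issue its own write
    // call on the file; with it, writes are batched in memory first.
    let mut out = BufWriter::new(file);
    for i in 0..n {
        writeln!(out, "{}", i)?;
    }
    // Flush before the writer is dropped so errors are reported.
    out.flush()
}

fn main() -> io::Result<()> {
    // Illustrative path only.
    let path = std::env::temp_dir().join("bufwriter_demo.txt");
    write_numbers(&path, 1000)?;
    std::fs::remove_file(&path)
}
```

The explicit flush matters: BufWriter also flushes on drop, but any error at that point is ignored.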
