I wrote a solution for the parentheses generation problem. If I use print! without a newline, my code is 3x faster than when I use println!() or print!() with a newline. I looked at the flamegraphs, and the code with the newlines has a much wider print section. I tried instantiating and locking stdout myself and passing the handle around, but that changed nothing.
Can anyone offer any advice? I've put the code on the playground and you can run it on your own machine with $ echo 14 | cargo run --release -q. (Try changing line 10.)
stdout contains a LineWriter, so it flushes on every newline character written.
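To see that behaviour in isolation, here is a minimal sketch that wraps a plain Vec<u8> in std::io::LineWriter, rather than touching stdout itself. The function name buffered_then_flushed is made up for illustration; it reports how many bytes have reached the inner writer before and after the newline is written.

```rust
use std::io::{self, LineWriter, Write};

// Hypothetical helper: returns how many bytes reached the inner writer
// (before the newline, after the newline).
fn buffered_then_flushed() -> io::Result<(usize, usize)> {
    let mut w = LineWriter::new(Vec::new());

    // No newline yet, so the bytes stay in LineWriter's own buffer.
    w.write_all(b"((()))")?;
    let before = w.get_ref().len();

    // The newline makes LineWriter push the whole line to the inner writer.
    w.write_all(b"\n")?;
    let after = w.get_ref().len();

    Ok((before, after))
}

fn main() -> io::Result<()> {
    let (before, after) = buffered_then_flushed()?;
    println!("inner bytes before newline: {}, after: {}", before, after);
    Ok(())
}
```

With stdout the inner writer is the real file descriptor, so each of those flushes becomes a write to the OS, which is presumably the wide print section in the flamegraph.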
It doesn't look like there's an easy way around this. One pretty bad idea would be to use e.g. AsRawFd/FromRawFd to make your own writer for STDOUT. This is a bad idea because you have to guarantee that stdout has been flushed and that stdout and print! are never used for as long as your writer exists; otherwise, writes from your writer and writes to stdout could appear out of order.
stdout is line buffered (so it flushes when it prints a newline, or when its internal buffer is full). The lock on stdout is there to stop multiple threads from intermixing their output. Taking the lock explicitly avoids the per-print! locking overhead, and it keeps a whole group of writes from one thread together, so other threads cannot interleave their output in between (otherwise each print! is only protected on its own).
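A minimal sketch of taking the lock once and writing through the handle. The helper write_lines and the sample strings are made up for illustration; it accepts any writer, so the same code works with a locked stdout or a file.

```rust
use std::io::{self, Write};

// Hypothetical helper: writes each item on its own line to any writer.
fn write_lines<W: Write>(out: &mut W, items: &[&str]) -> io::Result<()> {
    for item in items {
        writeln!(out, "{}", item)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Take the lock once instead of once per print! call.
    let mut handle = stdout.lock();
    write_lines(&mut handle, &["(())", "()()"])?;
    Ok(())
}
```

Note that this only removes the repeated locking. The handle is still line buffered, so every newline still triggers a flush, which matches the report that locking manually changed nothing.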
Line buffering is pretty standard in C/Go etc., but in C at least, if you redirect the output to a file it switches from line buffering to full buffering (since on some systems it can detect whether it's writing to a tty). I don't think Rust does that; its stdout is always line buffered. The easiest workaround is to open a file and just write!/writeln! into it. You can also handle errors from the write! calls, which would be panics with print!.
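A minimal sketch of the file approach. The helper write_to_file and the path parens.txt are placeholders, and wrapping the File in a BufWriter is my own addition: a bare File is unbuffered, so without it every small write would go straight to the OS.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

// Hypothetical helper: writes each item on its own line to the file at `path`.
fn write_to_file(path: &Path, items: &[&str]) -> io::Result<()> {
    // File itself is unbuffered; BufWriter batches the small writes.
    let mut out = BufWriter::new(File::create(path)?);
    for item in items {
        // `?` hands a failed write back to the caller instead of panicking.
        writeln!(out, "{}", item)?;
    }
    // Flush explicitly so a failure here is reported rather than lost on drop.
    out.flush()
}

fn main() {
    // "parens.txt" is a placeholder output path.
    if let Err(e) = write_to_file(Path::new("parens.txt"), &["(())", "()()"]) {
        eprintln!("write failed: {}", e);
    }
}
```

Because the writes go through write!/writeln!, a full disk or a closed pipe shows up as an io::Error the caller can deal with, rather than the panic that print! would raise.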