Strange write behavior from Stdout

While converting a random trivial tool from Python to Rust, I wanted to improve the performance by doing larger I/O. So I wrapped the input and output in BufReader/BufWriter:

  let mut f = BufReader::with_capacity(1024*1024, f);
  let mut g = BufWriter::with_capacity(1024*1024, std::io::stdout());     

What I'm (kind of obviously) expecting here is to see a smooth sequence of 1MB reads and writes. Indeed when I strace it, the reads are good:

read(3, "\0\377\377\377\377\377\377\377\377\377\377\0\0\2\0\2\0\0\10\0\0\0\10\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(3, "\v\0\1\4\377\3\1\4\v\0\1\4\377\3\1\4\v\0\1\4\377\3\1\4\v\0\1\4\377\3\377\7"..., 1048576) = 1048576
read(3, "\27\tD\2\10\4\6\5\364\344\0\0X^\10\t\0\0dJ\0\0\0\0\354\302\7\371R\5\6\t"..., 1048576) = 1048576
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576

But the writes are, at first blush, all over the place:

write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1047804) = 1047804
write(1, "\17#\4\256\24\0Q2\356\363\21\3\0324\35\362\335 \360\37\24\0\332 %\374\322.\26\372\324T"..., 772) = 772
write(1, "@\0\5 \302\36\23\356\357^\25\16\313A\20#@\0\256B\377\360\nF\20\316\f\24@\340\r\23"..., 1048259) = 1048259
write(1, "\0\20\0\1\0\0\0!\4!\4!\4!\4!\4!\4!\4!\4!\4!\4!\4!\4!"..., 317) = 317
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0 \2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\2\0\0"..., 1040391) = 1040391

When I analyzed the output further, I found that the writes are occurring in clusters that add up to 1MB. But I can't explain why it doesn't just write that amount in one call.

I then modified the program to take a second file name and use File::create() instead. In that case, I see the 1MB writes I expected:

write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(4, "@\0\5 \302\36\23\356\357^\25\16\313A\20#@\0\256B\377\360\nF\20\316\f\24@\340\r\23"..., 1048576) = 1048576
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0 \2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\2\0\0"..., 1048576) = 1048576
write(4, "\234\363\234\363\234\363\234\363\234\363\234\363\234\363\234\363\234\363\234\363\234\363\234\363\234\363{\357{\357Z\353"..., 1048576) = 1048576
write(4, "\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200\0\200"..., 1048576) = 1048576

So this appears to be some weird property of Stdout? Before I rabbit hole into that implementation, can someone explain what is going on here and how I can get the result I expect?

For reference, I'm using Rust 1.57.0 on Ubuntu 20.04 x86_64.

I've discovered that this has to do with Stdout's line buffering. It seems there are no good, portable workarounds.

Here's an issue about it.

If you can't find a crate but want to do it yourself, I'd probably start by looking at ripgrep as linked in that issue. BurntSushi is a team member and makes great stuff.

Thanks. I've now found several PRs and projects related to this. Unfortunately none of them have been moving forward. (Similarly with the workarounds I have to do to read fixed-length chunks and handle EOF properly, something libc's fread() does trivially.)

What really boggles me here is the why in this specific case. The point of line buffering is to insure that when a newline is written the output goes out immediately. Since my own BufWriter is passing down large chunks and bypassing Stdout's internal buffer (mostly) anyway, just writing all the data immediately would meet this expectation. There's no need to split it up.

Now obviously fixing that behavior wouldn't speed up repeated small println!() using the standard buffer as cited in some PRs.

More comprehensively, it seems like std could use an equivalent to libc's setvbuf(), allowing one to change the buffering mode or the buffer size, without having to create an extra BufWriter like I did. I did see some work in this direction, creating a switchable buffer layer, but it seems to be stalled.

If your target is unix, have you tried the method with writing to file, but to the file /dev/stdout. Instead of using std::io::stdout.

On Unix I can also do:

+  let g: File = unsafe{std::os::unix::io::FromRawFd::from_raw_fd(1)};
+  let mut g = BufWriter::with_capacity(1024*1024, g);

But I'd prefer a portable method.

I agree it's not ideal. How exactly are you outputting incidentally? I see write_fmt uses write_all,[1] so it's somewhat surprising too. Hard to gaze through all the layers of abstraction in the implementaiton though.

  1. potentially -- the comment is odd on a trait, as implementors don't have to use write_all ↩︎

I'm using write_all() on binary blobs. Ain't got time for writing outputs loops unnecessarily.

Honestly, even just exposing the underlying File from Stdout would create a trivial workaround, one could just make a new BufWriter and forget about the Stdout buffer. I'm not sure whether there's some portability constraint that makes this difficult.. The fact that I'm passing an immutable File to BufWrtier suggests that there should be no issues with mutable borrowing.

Looking closer at your strace, yeah, that's probably as good as it will get with the current stdlib -- flush the line buffer, then write the blob, leading to intermingled short and long writes.

You could #[cfg] your way through it with generics or dyn Write, if you don't want to take the time to find a properly portable approach for every platform.