It looks like no. I tried flushing half a character at a time, and this didn't cause any problems. Only flushing an invalid byte caused issues.
The source for writing to stdout on Windows is rust/library/std/src/sys/stdio/windows.rs (at commit 3d8c1c1fc077d04658de63261d8ce2903546db13 in rust-lang/rust on GitHub):
fn write_console_utf16(
    data: &[u8],
    incomplete_utf8: &mut IncompleteUtf8,
    handle: c::HANDLE,
) -> io::Result<usize> {
    if incomplete_utf8.len > 0 {
        assert!(
            incomplete_utf8.len < 4,
            "Unexpected number of bytes for incomplete UTF-8 codepoint."
        );
        if data[0] >> 6 != 0b10 {
            // not a continuation byte - reject
            incomplete_utf8.len = 0;
            return Err(io::const_error!(
                io::ErrorKind::InvalidData,
                "Windows stdio in console mode does not support writing non-UTF-8 byte sequences",
            ));
        }
        incomplete_utf8.bytes[incomplete_utf8.len as usize] = data[0];
        incomplete_utf8.len += 1;
        let char_width = utf8_char_width(incomplete_utf8.bytes[0]);
        if (incomplete_utf8.len as usize) < char_width {
            // more bytes needed
            return Ok(1);
        }
        let s = str::from_utf8(&incomplete_utf8.bytes[0..incomplete_utf8.len as usize]);
        incomplete_utf8.len = 0;
        match s {
            Ok(s) => {
                assert_eq!(char_width, s.len());
                let written = write_valid_utf8_to_console(handle, s)?;
                assert_eq!(written, s.len()); // guaranteed by write_valid_utf8_to_console() for single codepoint writes
                return Ok(1);
            }
            Err(_) => {
                return Err(io::const_error!(
                    io::ErrorKind::InvalidData,
                    "Windows stdio in console mode does not support writing non-UTF-8 byte sequences",
                ));
            }
        }
    }

    // As the console is meant for presenting text, we assume bytes of `data` are encoded as UTF-8,
    // which needs to be encoded as UTF-16.
    //
    // If the data is not valid UTF-8 we write out as many bytes as are valid.
    // If the first byte is invalid it is either first byte of a multi-byte sequence but the
    // provided byte slice is too short or it is the first byte of an invalid multi-byte sequence.
    let len = cmp::min(data.len(), MAX_BUFFER_SIZE / 2);
    let utf8 = match str::from_utf8(&data[..len]) {
        Ok(s) => s,
        Err(ref e) if e.valid_up_to() == 0 => {
            let first_byte_char_width = utf8_char_width(data[0]);
            if first_byte_char_width > 1 && data.len() < first_byte_char_width {
                incomplete_utf8.bytes[0] = data[0];
                incomplete_utf8.len = 1;
                return Ok(1);
            } else {
                return Err(io::const_error!(
                    io::ErrorKind::InvalidData,
                    "Windows stdio in console mode does not support writing non-UTF-8 byte sequences",
                ));
            }
        }
        Err(e) => str::from_utf8(&data[..e.valid_up_to()]).unwrap(),
    };
    write_valid_utf8_to_console(handle, utf8)
}
There is a process-global IncompleteUtf8 that stores the bytes of any partial UTF-8 character. Any complete characters are written normally (lines 170, 171, and 185 of the linked file), which leaves any trailing partial character for the start of the next write. If the entire slice is the prefix of a single multi-byte character, its first byte is stored in the IncompleteUtf8 and the function returns without writing anything (lines 172-184). On subsequent writes, this function appends bytes from the new data into the IncompleteUtf8 one at a time until either the character is complete or an invalid byte is found. If the character is complete, it is written to the real stdout as UTF-16; if an invalid byte is found, an error is returned. Either way, the IncompleteUtf8 is reset.
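To make that buffering concrete, here is a minimal, self-contained sketch of the same state machine. The names `IncompleteUtf8` and `utf8_char_width` mirror the real implementation, but `write_valid_utf8_to_console` is replaced by a plain `String` sink and the per-call bookkeeping is simplified to one byte at a time, so treat this as an illustration of the mechanism rather than the actual std code:

```rust
// Sketch of std's incomplete-UTF-8 buffering for console writes.
// Completed characters are appended to a String instead of WriteConsoleW.

#[derive(Default)]
struct IncompleteUtf8 {
    bytes: [u8; 4],
    len: u8,
}

// Width of a UTF-8 sequence, determined by its first byte (0 = invalid).
fn utf8_char_width(b: u8) -> usize {
    match b {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        _ => 0,
    }
}

// Consume one byte of a (possibly split) UTF-8 character, as the real
// write_console_utf16 does at the start of each write call.
fn push_byte(
    state: &mut IncompleteUtf8,
    byte: u8,
    out: &mut String,
) -> Result<usize, &'static str> {
    if state.len > 0 {
        if byte >> 6 != 0b10 {
            // Not a continuation byte: the buffered prefix can never complete.
            state.len = 0;
            return Err("invalid UTF-8");
        }
        state.bytes[state.len as usize] = byte;
        state.len += 1;
        let width = utf8_char_width(state.bytes[0]);
        if (state.len as usize) < width {
            return Ok(1); // still incomplete; byte consumed anyway
        }
        let s = std::str::from_utf8(&state.bytes[..state.len as usize]);
        state.len = 0; // reset on both success and failure
        return match s {
            Ok(s) => {
                out.push_str(s); // real code: write_valid_utf8_to_console
                Ok(1)
            }
            Err(_) => Err("invalid UTF-8"),
        };
    }
    match utf8_char_width(byte) {
        0 => Err("invalid UTF-8"), // invalid first byte
        1 => {
            out.push(byte as char); // ASCII passes straight through
            Ok(1)
        }
        _ => {
            // First byte of a multi-byte character: stash it for the next call.
            state.bytes[0] = byte;
            state.len = 1;
            Ok(1)
        }
    }
}

fn main() {
    let mut state = IncompleteUtf8::default();
    let mut out = String::new();
    // "é" is [0xC3, 0xA9]; feed it one byte per "write" and it still comes out whole.
    assert_eq!(push_byte(&mut state, 0xC3, &mut out), Ok(1));
    assert_eq!(push_byte(&mut state, 0xA9, &mut out), Ok(1));
    assert_eq!(out, "é");
    // A non-continuation byte after a stored prefix is rejected.
    push_byte(&mut state, 0xC3, &mut out).unwrap();
    assert!(push_byte(&mut state, b'A', &mut out).is_err());
    println!("ok");
}
```

This matches what I observed experimentally: splitting a character across writes is fine, but an out-of-place byte produces an `InvalidData`-style error.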
I found one way to trigger this somewhat accidentally: if you're writing to unlocked stdout from multiple threads, they may race flushes, and you get an error when a flush lands in the middle of a character. But interleaved output like that is going to look bad on any OS.
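For completeness, a sketch of sidestepping that race with the standard library's stdout locking (plain `std` API, nothing Windows-specific): each thread takes the lock before writing, so no write boundary can fall inside a multi-byte character like the two-byte `é` below.

```rust
use std::io::Write;
use std::thread;

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|i| {
            thread::spawn(move || {
                // Holding the lock serializes this write against other threads,
                // so the console never sees half of a UTF-8 character.
                let mut out = std::io::stdout().lock();
                writeln!(out, "thread {i}: héllo").unwrap();
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```

Without the lock, two threads' writes can interleave at an arbitrary byte boundary, and on a Windows console one of them can then hand the incomplete-UTF-8 machinery a byte that isn't a valid continuation, which is exactly the error path above.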