I was trying to write a program that spawns another program in the background and reads its output.
Pseudocode:
Reader:
    spawn 'Program A' in the background, set its stdout to piped.
    loop {
        read 5 bytes from Program A's stdout
        print it
    }
Program A:
    print "Hello"
    loop {
        read a line from stdin
        print it
    }
I expected this to print "Hello" and then hang, because I am not sending anything to Program A's stdin.
But it gives no output at all.
If I comment out the part that reads input in Program A, everything works as expected.
What is causing this block? Is this a deadlock?
Is this block avoidable?
Full code:
// reader.rs
use std::io::Read;
use std::process;

fn main() {
    let cmdline_args: Vec<String> = std::env::args().collect();
    let path = std::path::Path::new(&cmdline_args[1]).to_path_buf();
    let args = &cmdline_args[2..];
    let mut proc = process::Command::new(&path)
        .args(args)
        .stdout(process::Stdio::piped())
        .spawn()
        .unwrap();
    let mut stdout = proc.stdout.take().expect("Failed to take stdout.");
    let mut s = [0u8; 5];
    loop {
        // read() returns how many bytes were actually read; 0 means EOF.
        let n = dbg!(stdout.read(&mut s)).unwrap();
        if n == 0 {
            break;
        }
        // Print only the bytes that were read in this call.
        println!("Got: {}", String::from_utf8_lossy(&s[..n]));
    }
}
// prg_a.c
#include <stdio.h>

char buf[1024];

int main()
{
    printf("Hello\n");
    while (1)
    {
        // If I comment this line, everything works as expected.
        fgets(buf, 10, stdin);
        printf("You entered: %s\n", buf);
    }
}
You are looking for the difference in the wrong place. If you comment out both fgets and printf, you should see the same behaviour: the reader prints nothing. If you leave the printf in, the constant printing of You entered: … eventually fills the stdout buffer, which forces it to be flushed, and that is why the reader actually prints something. The key point here is that libc and Rust have different ideas about how stdout should be buffered: printf("Hello: \n"); just stuffs the data into the libc buffer attached to the stdout object, to be sent some time later, whereas println!("Hello"); actually writes Hello\n to stdout.
To fix the issue, add fflush(stdout); after printf("Hello: \n");. BTW, why does the C code have a colon and a space after Hello while the Rust code does not?
If I understand correctly, buffering behaviour in libc depends on whether or not stdout is a terminal: Rust's stdout is always line buffered, whereas libc makes it line buffered when the output is a terminal and uses a fixed-size buffer when it is not.
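The two modes can be imitated entirely inside Rust, which may make the difference easier to see. This is only an analogy, not libc itself: std::io::BufWriter stands in for a libc stdout attached to a pipe (fixed-size buffer), std::io::LineWriter stands in for a line-buffered stdout, and a Vec<u8> plays the role of the pipe. The function names and the 4096-byte capacity are made up for this sketch.

```rust
use std::io::{BufWriter, LineWriter, Write};

/// Bytes that reach the sink through a fixed-size buffer, with no flush.
/// (Analogy for libc stdout when it is not a terminal.)
fn delivered_block_buffered(msg: &[u8]) -> usize {
    let mut w = BufWriter::with_capacity(4096, Vec::new());
    w.write_all(msg).unwrap();
    w.get_ref().len()
}

/// Bytes that reach the sink through a line-buffered writer, with no flush.
/// (Analogy for Rust's stdout, or libc stdout on a terminal.)
fn delivered_line_buffered(msg: &[u8]) -> usize {
    let mut w = LineWriter::new(Vec::new());
    w.write_all(msg).unwrap();
    w.get_ref().len()
}

/// Same as the block-buffered case, but with an explicit flush,
/// which plays the role of fflush(stdout).
fn delivered_after_flush(msg: &[u8]) -> usize {
    let mut w = BufWriter::with_capacity(4096, Vec::new());
    w.write_all(msg).unwrap();
    w.flush().unwrap();
    w.get_ref().len()
}

fn main() {
    let msg = b"Hello\n";
    println!("block buffered, no flush: {} bytes", delivered_block_buffered(msg));
    println!("line buffered, no flush:  {} bytes", delivered_line_buffered(msg));
    println!("block buffered + flush:   {} bytes", delivered_after_flush(msg));
}
```

With the block-buffered writer, Hello\n stays in the buffer until the flush, which mirrors why the reader sees nothing from prg_a until libc's buffer fills up or fflush is called.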
I just found out that by using stdbuf --output=L <command name> <args..>, I can switch the output of any program to line-buffered mode.
See man stdbuf for more details. stdbuf is part of coreutils.
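For reference, here is a rough sketch of how reader.rs could launch the child through stdbuf instead of running it directly. It assumes stdbuf from coreutils is on PATH and that the child does its output through libc streams; line_buffered_command is a made-up helper name, and reading whole lines with BufReader is a simplification of the original 5-byte reads.

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

/// Builds `stdbuf --output=L <program> <args..>` with stdout piped.
/// (Hypothetical helper, not part of any library.)
fn line_buffered_command(program: &str, args: &[String]) -> Command {
    let mut cmd = Command::new("stdbuf");
    cmd.arg("--output=L")
        .arg(program)
        .args(args)
        .stdout(Stdio::piped());
    cmd
}

fn main() {
    let argv: Vec<String> = std::env::args().collect();
    if argv.len() < 2 {
        eprintln!("usage: {} <program> [args..]", argv[0]);
        return;
    }
    let mut child = line_buffered_command(&argv[1], &argv[2..])
        .spawn()
        .expect("failed to spawn stdbuf (is coreutils installed?)");
    let stdout = child.stdout.take().expect("stdout was not piped");
    // Each line arrives as soon as the child finishes writing it.
    for line in BufReader::new(stdout).lines() {
        println!("Got: {}", line.expect("failed to read a line from the child"));
    }
}
```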
stdbuf, like any other kind of manipulation performed by reader.rs, is going to work only as long as prg_a somehow allows it. For instance, glibc and some other libc’s can be configured, but that same stdbuf is not going to change anything in a Rust program that prints something: the use of LineWriter is hardcoded there. Also note the following:
NOTE: If COMMAND adjusts the buffering of its standard streams ('tee' does for e.g.) then that will override corresponding settings changed by 'stdbuf'. Also some filters (like 'dd' and 'cat' etc.) don't use streams for I/O, and are thus unaffected by 'stdbuf' settings.
The Rust situation, where LineWriter wraps a structure that is just a marker for using the raw file descriptor, counts as not using streams.
Okay. Assuming the command uses libc functions like printf for output, is there anything like setbuf that can be called from Rust to change the buffering mode of ChildStdout?
You can just reimplement stdbuf in Rust: it is open-source, rather small, and all it does is set some environment variables. So you can check which environment variables it sets and set them from Rust.
Though after checking its source code (I could have figured this out from the man page if I had paid enough attention), I must say that “configuring libc” is not how stdbuf actually works. Instead, it sets the LD_PRELOAD environment variable, which tells ld-linux.so to preload the libstdbuf library, and passes its settings to that library, which in turn configures the libc streams when it is loaded. So you will either have to rewrite libstdbuf in Rust as well (note: it will and must be a separate file, there is no sane way for it to be contained in the reader program) or have libstdbuf as a dependency.
This also means that you can’t change the buffering mode without injecting some code into the prg_a process: if there were a better method, the authors of stdbuf would probably have used it instead.
Note that to set environment variables you can use the Command::envs method.
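A rough sketch of what that could look like with Command::envs follows. Several things here are assumptions rather than documented interfaces: the _STDBUF_O=L variable is what I remember the stdbuf source setting for line-buffered stdout, and the libstdbuf.so path differs between distributions, so check both against your coreutils version. with_stdbuf_env and LIBSTDBUF_PATH are made-up names.

```rust
use std::process::{Command, Stdio};

// ASSUMPTION: install location of libstdbuf.so; it varies by distribution.
const LIBSTDBUF_PATH: &str = "/usr/libexec/coreutils/libstdbuf.so";

/// Prepares `program` so that libstdbuf is preloaded and asked to make
/// stdout line buffered. (Hypothetical helper; the variable names are
/// internal to coreutils and may change.)
fn with_stdbuf_env(program: &str, libstdbuf: &str) -> Command {
    let mut cmd = Command::new(program);
    cmd.envs([
        ("LD_PRELOAD", libstdbuf), // makes ld-linux.so load libstdbuf first
        ("_STDBUF_O", "L"),        // ASSUMPTION: "L" requests line buffering
    ])
    .stdout(Stdio::piped());
    cmd
}

fn main() {
    let argv: Vec<String> = std::env::args().collect();
    if argv.len() < 2 {
        eprintln!("usage: {} <program>", argv[0]);
        return;
    }
    let cmd = with_stdbuf_env(&argv[1], LIBSTDBUF_PATH);
    // Only show what would be set; spawning works as in reader.rs.
    for (key, value) in cmd.get_envs() {
        println!("{:?} = {:?}", key, value);
    }
}
```

This only has an effect when the child is dynamically linked against a libc that libstdbuf can hook into, which is the same limitation stdbuf itself has.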
Hi, thanks for the link to the source code.
It looks like libstdbuf is calling setvbuf on stdin before main() is called.
Can we do the same by creating a pipe from reader.rs, setting both ends of it to line buffer mode and then passing the write end of the pipe to stdout() of Command?
I tried this but it doesn't seem to work:
let (read_pipe_fd, write_pipe_fd) = nix::unistd::pipe().expect("Failed to create pipe");
unsafe {
    let f = libc::fdopen(read_pipe_fd as libc::c_int, std::ffi::CString::new("r").unwrap().as_ptr());
    dbg!(libc::setvbuf(f, 0 as _, libc::_IOLBF, 0));
    let f = libc::fdopen(write_pipe_fd as libc::c_int, std::ffi::CString::new("w").unwrap().as_ptr());
    dbg!(libc::setvbuf(f, 0 as _, libc::_IOLBF, 0));
}
let mut proc = process::Command::new(&path)
    .args(args)
    .stdout(unsafe { process::Stdio::from_raw_fd(write_pipe_fd) })
    .spawn()
    .unwrap();
let mut stdout = unsafe { std::fs::File::from_raw_fd(read_pipe_fd) };
let mut s = [0u8; 5];
loop {
    dbg!(stdout.read(&mut s));
    let vec: Vec<u8> = Vec::from(&s as &[u8]);
    println!("Got: {}", String::from_utf8(vec).unwrap());
}
I really want to learn more about how all these unix concepts like file, streams, pipes etc fit together.
Could you point me to some resource where I could learn these from?
It looks like libstdbuf is calling setvbuf on stdin before main() is called.
Can we do the same by creating a pipe from reader.rs, setting both ends of it to line buffer mode and then passing the write end of the pipe to stdout() of Command?
When creating a child process you normally go through the following sequence:
1. Create a clone of the parent process. (Normally with fork().)
2. In the child, possibly do some things like closing unneeded file descriptors.
3. In the child, replace the current process image with a new process image. (Via the exec*() function family.)
While it is true that you can do some things in stage 2, because some resources (file descriptors among them) stay in the state you left them in after the process image is replaced, this will not help you with the buffering issue. All the resources I know of that survive live in the kernel. The very reason libc has any buffering on streams at all is to avoid too many switches to the kernel context, as they are expensive. So the buffers are set up when libc does its own initialization, and that happens again after stage 3. And you really would not want it not to happen: if the state from the parent process were left intact, you would observe loads of hard-to-debug bugs caused by software not expecting to start in that particular state.
Injecting libstdbuf into the process makes its initialization function run after that third stage too. And because libstdbuf itself depends on libc, libc initialization is forced to occur before libstdbuf initialization.
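The stages described above can be observed from Rust with CommandExt::pre_exec, which runs a closure in the forked child just before exec. The sketch below is Unix-only, assumes an echo binary on PATH, and uses a made-up function name, run_stages. It shows only that a file descriptor set up before exec survives into the new program; it does not (and cannot) carry any libc buffer settings across.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::unix::io::{FromRawFd, IntoRawFd};
use std::os::unix::process::CommandExt;
use std::process::Command;

/// Runs `echo "stage 3"` and has the forked child write "stage 2"
/// to its stdout before the exec. Returns everything the parent
/// received through the pipe.
fn run_stages() -> io::Result<String> {
    let mut cmd = Command::new("echo");
    cmd.arg("stage 3");

    // Stage 2: executed in the child, after fork and before exec.
    let stage_two = || -> io::Result<()> {
        // At this point fd 1 already refers to the pipe std created.
        let mut out = unsafe { File::from_raw_fd(1) };
        let result = out.write_all(b"stage 2\n");
        // Release the descriptor without closing it, so echo can use it.
        let _ = out.into_raw_fd();
        result
    };
    // Safety: the closure only issues a plain write on an open descriptor.
    unsafe {
        cmd.pre_exec(stage_two);
    }

    // output() forks (stage 1), runs the closure (stage 2), then execs
    // echo (stage 3) with stdout captured through a pipe.
    let output = cmd.output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

fn main() -> io::Result<()> {
    print!("{}", run_stages()?);
    Ok(())
}
```

Both lines arrive through the same pipe because the descriptor lives in the kernel and is untouched by exec, whereas anything kept in the child's user-space memory, such as a stdio buffer, is discarded when the new image is loaded.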
I really want to learn more about how all these unix concepts like file, streams, pipes etc fit together.
Could you point me to some resource where I could learn these from?
I do not actually know of a single resource. I usually either search the Internet on demand, or remember something because I read an article somewhere (primarily on https://habr.com, but it is mostly in Russian; they decided to go worldwide and allow English articles only something like this year), or because I had a similar problem earlier.