Clean way to terminate a TcpStream::read

I have a Rust program that reads from a socket using TcpStream::read_exact, and I want to give the user a way to interrupt it so that some other task can be performed.

I'm thinking about using a signal handler that, once invoked, would somehow make TcpStream::read_exact abort and return an error, so that the subsequent code could detect the user abort and do something other than handle the normal read_exact result.

Is there any clean way to force the TcpStream::read to stop without killing the thread, and instead make it return a value indicating why it was terminated?

Not really. You'd have to use select to block on both this socket and some other descriptor that receives data when an interrupt arrives (signalfd?), but that's a major hassle, and you're then halfway to inventing an async runtime.

So the easiest option is to use async/await instead of blocking functions, and futures can be easily aborted.

Interesting.

So I know some day, probably soon, I need to take the deep dive into asynchronous programming.

I've got a fair amount of experience with Javascript and a bit with Node using that model. It's just I was hoping to exhaust what reasonable attempts there are here prior to engaging such a big paradigm shift into my rust dev world.

That said, and with respect to your reply (and please forgive my naivete), I don't understand why I would need to put a "block" on the socket. Since read_exact doesn't return until it has data, I simply assumed that a block was already going on. Seems I'll have to study what's going on there more too!

What I was hoping should be possible, was to simply have a signal interrupt handler (e.g. using ctrlc module). And when the user hit CTRL-C, have the handler issue a shutdown of the read socket. Wouldn't that cause a safe return from read_exact with an error? I have yet to try it out, but even if it worked I would be wanting to know how safe this approach was. Which is why I'm reaching out here before even trying it.

Again, thanks as always for the reply.

read_exact is supposed to have exclusive access to the TcpStream. I suppose you could extract a raw fd out of it, and independently "break" it in a system-specific way.

I disagree—Read is implemented for &TcpStream, and there is TcpStream::shutdown which takes &self, so there should be nothing wrong with shutting down the connection from a signal handler while doing read_exact on a borrow in the main "thread of execution".


Here's an example that (I think) accomplishes what you're trying to do:

use std::io::prelude::*;
use std::net::{TcpStream, Shutdown};
use std::sync::Arc;

fn main() {
    let stream = TcpStream::connect("127.0.0.1:5555").unwrap();
    let stream = Arc::new(stream);
    let handle = Arc::clone(&stream);
    ctrlc::set_handler(move || {
        let _ = handle.shutdown(Shutdown::Read);
    }).unwrap();
    loop {
        let mut buf = [0; 100];
        if let Err(_) = (&*stream).read_exact(&mut buf) {
            eprintln!("got an error, exiting...");
            break;
        }
        eprintln!("got some bytes");
    }
}

Tested with nc -l -p 5555 </dev/random as the server. I'm using Arc here because ctrlc::set_handler requires a 'static closure.
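
For anyone who wants to try the same idea without the ctrlc crate or an external server, here is a minimal std-only sketch. It assumes a throwaway loopback listener and uses a timer thread as a stand-in for the Ctrl-C handler; the function name read_until_shutdown and the 200 ms delay are made up for illustration. The exact error kind returned after the shutdown may vary by platform, so the sketch only reports it.

```rust
use std::io::{self, Read};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Blocks in `read_exact` on a loopback connection until another thread
/// shuts down the read half, then returns the kind of error that came back.
/// (Hypothetical helper, not part of the original example.)
fn read_until_shutdown() -> io::Result<io::ErrorKind> {
    // Throwaway local server that accepts one connection and never writes.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    let stream = Arc::new(TcpStream::connect(addr)?);
    // Keep the server side open so the peer itself never signals EOF.
    let (_peer, _) = listener.accept()?;

    // Stand-in for the Ctrl-C handler: shut down the read half after a delay.
    let handle = Arc::clone(&stream);
    let canceller = thread::spawn(move || {
        thread::sleep(Duration::from_millis(200));
        let _ = handle.shutdown(Shutdown::Read);
    });

    // This call blocks until the shutdown above wakes it up.
    let mut buf = [0u8; 100];
    let result = (&*stream).read_exact(&mut buf);
    canceller.join().expect("canceller thread panicked");

    match result {
        Ok(()) => Err(io::Error::new(
            io::ErrorKind::Other,
            "read unexpectedly succeeded",
        )),
        Err(e) => Ok(e.kind()),
    }
}

fn main() -> io::Result<()> {
    let kind = read_until_shutdown()?;
    println!("read_exact was interrupted with {:?}", kind);
    Ok(())
}
```

The structure mirrors the example above: one Arc handle is moved into the code that cancels, and the other stays with the code that reads.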

Thank you for the suggestion. That looks like what I want. I just put it into my local test code and it compiles. I have more work to do before I can test it, but one thing I love about Rust is that in my experience once it compiles it's often ready to run!

One question. Can you please explain why the

(&*stream)

pattern is required? I see that using the stream directly generates a "cannot be used as mutable" error, and somehow this pattern makes the error go away. It seems unintuitive to me because & and * seem to do opposite things. What exactly happens here?

Thank you so much for the help!

I tried your code and it works great.

For some reason Ctrl-C is ignored with my own code, and I'm doing some troubleshooting.

Could you expand a bit on why Arc is required for a 'static closure? Is that why the move is required?

The combination &* (or &mut *) is called a "reborrow". You can analyze it as a dereference * followed by a borrow &, or you can think of it as a unified operation that turns some kind of pointer—anything that implements Deref—into a reference. Here we have an Arc<TcpStream> that we want to turn into a &TcpStream, and Arc<T> implements Deref<Target = T>, so &* does the job. Now, normally the compiler's autoderef rules implicitly add as much *, &, and &mut as necessary to the receiver when we have an expression like stream.read_exact(...), but in this case we have to intervene manually, because read_exact comes from the Read trait, and Read is implemented for both TcpStream and &TcpStream. If we write plain stream.read_exact(...), autoderef will resolve it as

<TcpStream as Read>::read_exact(&mut *stream, ...) // error!

But we want the expression to resolve as

<&TcpStream as Read>::read_exact(&mut &*stream, ...) // okay

Which is what the manual reborrow accomplishes.


Arc<T> implements Deref<Target=T>, so *stream is a place expression for the TcpStream inside the Arc, and &*stream is thus a reference to the TcpStream inside the Arc. But Arc does not implement DerefMut (so you can't get a &mut TcpStream from the Arc).

Without the pattern, method resolution finds the impl Read for TcpStream which wants a &mut TcpStream that you cannot get out of the Arc. But what you really want is the impl Read for &TcpStream that takes a &mut &TcpStream. Method resolution can't find this from the Arc or even a TcpStream on its own, but if you give it a &TcpStream, it will find it.
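
To make the two resolutions concrete, here is a small sketch that reads through an Arc<TcpStream> both ways: once with the manual reborrow and once with the fully qualified call. It assumes a loopback listener that sends four bytes; the function name read_through_arc is invented for the illustration.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::Arc;

/// Reads four bytes through a shared `Arc<TcpStream>` without ever
/// needing a `&mut TcpStream`. (Hypothetical helper for illustration.)
fn read_through_arc() -> io::Result<[u8; 4]> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let shared: Arc<TcpStream> = Arc::new(TcpStream::connect(listener.local_addr()?)?);
    let (mut peer, _) = listener.accept()?;
    peer.write_all(b"ping")?;

    let mut buf = [0u8; 4];

    // `shared.read_exact(..)` would pick `impl Read for TcpStream`, which
    // needs a `&mut TcpStream` that an `Arc` cannot hand out.

    // Manual reborrow: Arc<TcpStream> -> &TcpStream, so method resolution
    // finds `impl Read for &TcpStream`.
    (&*shared).read_exact(&mut buf[..2])?;

    // The same call spelled out: `self` is a `&mut &TcpStream`.
    let mut reader: &TcpStream = &shared; // deref coercion from &Arc<TcpStream>
    <&TcpStream as Read>::read_exact(&mut reader, &mut buf[2..])?;

    Ok(buf)
}

fn main() -> io::Result<()> {
    let buf = read_through_arc()?;
    println!("{}", String::from_utf8_lossy(&buf));
    Ok(())
}
```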


Thank you. That makes complete sense. I was thinking it had a lot to do with something

Arc<T>

implemented that triggered some conversion, but your explanation provides a lot of information that was not obvious to me.

Btw, I spoke too soon. For some reason your example does not work for me. I never see the text "got an error, exiting...", so it appears to be behaving as though the handler is not even being invoked and Ctrl-C is acting as it normally does.

I tested by commenting out the set_handler and the Ctrl-C acts the same:

use std::io::prelude::*;
#[allow(unused_imports)]
use std::net::{TcpStream, Shutdown};
use std::sync::Arc;

fn main() {
    let stream = TcpStream::connect("127.0.0.1:5555").unwrap();
    let stream = Arc::new(stream);
    /*
    let handle = Arc::clone(&stream);
    ctrlc::set_handler(move || {
        let _ = handle.shutdown(Shutdown::Read);
    }).unwrap();
    */
    loop {
        let mut buf = [0; 100];
        if let Err(_) = (&*stream).read_exact(&mut buf) {
            eprintln!("got an error, exiting...");
            break;
        }
        eprintln!("got some bytes");
    }
}

read_exact ignores EINTR and tries to continue reading even if a signal handler interrupted reading.
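
A rough way to see this retry behavior without real signals is a reader that fails once with ErrorKind::Interrupted (the kind EINTR is reported as) before yielding data. The InterruptedOnce type and read_across_interrupt function below are invented for the sketch; only Read::read_exact is the real standard library API.

```rust
use std::io::{self, Read};

/// Hypothetical reader that fails once with `Interrupted` before
/// handing out its data, simulating a syscall cut short by a signal.
struct InterruptedOnce<'a> {
    data: &'a [u8],
    interrupted: bool,
}

impl Read for InterruptedOnce<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.interrupted {
            self.interrupted = true;
            return Err(io::Error::new(
                io::ErrorKind::Interrupted,
                "simulated EINTR",
            ));
        }
        // `&[u8]` implements `Read` and advances past the bytes it returns.
        self.data.read(buf)
    }
}

/// `read_exact` swallows the `Interrupted` error and keeps reading.
fn read_across_interrupt() -> io::Result<[u8; 5]> {
    let mut reader = InterruptedOnce { data: b"hello", interrupted: false };
    let mut buf = [0u8; 5];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    let buf = read_across_interrupt()?;
    println!("{}", String::from_utf8_lossy(&buf));
    Ok(())
}
```

So a signal handler that merely interrupts the underlying read is not enough on its own to make read_exact return; the read has to hit end of stream or fail with some other error, which is what shutting down the socket arranges.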

Sorry, I don't follow—how would that explain the behavior @CKalt is seeing? I certainly see the got an error, exiting... printed on my machine.

The 'static requirement of ctrlc::set_handler means that the closure you give it cannot hold temporary references to any local variables of the enclosing function (here, main). The version of the code without Arc looks like this:

    let stream = TcpStream::connect("127.0.0.1:5555").unwrap();
    ctrlc::set_handler(|| {
        let _ = stream.shutdown(Shutdown::Read);
    }).unwrap();

For this snippet, the compiler infers that the closure should capture stream by (shared) reference, and so it becomes non-'static. The general recipe for dealing with this is to put the shared resource into an Arc, which can be cloned to get multiple 'static/owning/non-temporary handles to the resource, each of which can provide a shared reference (via Deref) when needed. The move keyword is also needed here so that the compiler doesn't cause the closure to capture handle by reference, which would make it non-'static again. (But adding move to the non-Arc version of the code doesn't help because then the closure takes ownership of the whole TcpStream, preventing us from using it in the following loop.)

More conceptually: ctrlc::set_handler needs the closure you give it to be able to live for the rest of the program, because ^C could arrive at any time, including after the function containing the call to ctrlc::set_handler has returned. That means that all the data the closure requires to run must be owned by the closure, and not borrowed from the surrounding scope. On the other hand, we still need to be able to access the TcpStream after we've installed the signal handler. And since there's no way to borrow the stream from outside after it's been moved into the closure, shared ownership is the way to go.
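
The same recipe can be tried with only the standard library, since std::thread::spawn places a similar 'static bound on its closure. This is a sketch under that substitution: thread::spawn stands in for ctrlc::set_handler, an AtomicUsize stands in for the TcpStream, and the function name share_with_static_closure is made up.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Shares one value between a `'static` closure and the code that
/// created it. (Hypothetical stand-in for the stream and its handler.)
fn share_with_static_closure() -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    // Second owning handle to the same value; cloning the Arc only bumps
    // a reference count.
    let handle = Arc::clone(&counter);

    // `move` makes the closure own `handle` instead of borrowing it, which
    // is what satisfies the `'static` bound on `thread::spawn`.
    let worker = thread::spawn(move || {
        handle.fetch_add(1, Ordering::SeqCst);
    });

    // The original handle is still usable here, just as the stream is
    // still usable in the read loop after the handler is installed.
    counter.fetch_add(10, Ordering::SeqCst);

    worker.join().expect("worker thread panicked");
    counter.load(Ordering::SeqCst)
}

fn main() {
    println!("counter = {}", share_with_static_closure());
}
```

Dropping the move keyword from the closure makes the compiler reject it, because handle would then be borrowed from the enclosing function.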

I agree with @cole-miller.

The problem I seem to have is that the Ctrl-C handler is not being invoked at all; instead, Ctrl-C aborts the program the way it normally does.

It seems to be working now. That's strange. I'll see what I did.

I figured out my problem. It's how I launch the program. If I use a simple "cargo run" it works fine.

When I use a script that I have that tees the output into a log file, that's when ctrlc module appears to have a problem.

Using this run script causes a problem. So that's probably a problem with how ctrlc deals with redirected stdio.

#!/bin/sh
cargo run --release -- "${@:1}" 2>&1 | tee run.log

Ah, yep, can confirm the same happens when I run the program that way. I think this is probably related to the shell's job control rather than stdio handling, but after looking at strace logs for a bit I still can't quite put my finger on the cause. Maybe the process is getting SIGPIPE and dying before SIGINT arrives?

When I read tee that was my guess. You can use head -n 0 instead of tee to induce a SIGPIPE without sending a break.


So tee must be getting (and masking) the Ctrl-C, and then, as you say, that breaks the pipe and sends a SIGPIPE to the Rust program, which must abort the execution. I just did a man on tee and, lo and behold, there's a nifty --ignore-interrupts option (-i).

Adding this to the script makes it work like a charm. Many thanks!

#!/bin/sh
cargo run --release -- "${@:1}" 2>&1 | tee -i run.log