TcpStream `write` silently loses one message

Hi,

I originally reported this for tokio, but the same happens even with std::net::TcpStream.

The issue is that, given a TcpStream connected to a remote TCP server, when the server goes down it is still possible to call the write methods on the stream without ever being notified of the failure.

This is the tested flow:

  1. Start a TCP server
  2. a TcpStream connects to the server
  3. the TcpStream sends 'message_1' -> the message is received by the server
  4. the TcpStream sends 'shutdown' -> the server receives the message and stops itself
  5. the TcpStream sends 'message_2' -> HERE THE ISSUE: both the write and the flush methods complete successfully, but the message is lost because the server is not available
  6. the TcpStream sends 'message_3' -> at this point the write fails, but it should already have failed on 'message_2'

I am very confused about this behaviour. How can both the TcpStream.write and the TcpStream.flush calls return Ok()?
Am I doing something wrong?

use std::net::{TcpListener, TcpStream};
use std::io::BufReader;
use std::io::BufRead;
use std::io::Write;
use std::time;
use std::thread;

const BASE_ADDRESS: &str = "127.0.0.1";

#[test]
fn std_should_not_lose_tcp_requests() {

    let port = 8080;
    let address = format!("{}:{}", BASE_ADDRESS, port);
    let address_clone = address.clone();
    
    // start a TCP server
    thread::spawn(move || start_server(address_clone));
    thread::sleep(time::Duration::new(1, 0));

    // Create a TcpStream connected to the server    
    let mut stream = TcpStream::connect(&address).unwrap();

    // send 'message_1' -> Ok
    assert!(send_bytes(&mut stream, "message_1\n").is_ok());
    thread::sleep(time::Duration::new(1, 0));

    // send 'shutdown' -> Ok, the server is stopped
    assert!(send_bytes(&mut stream, "shutdown\n").is_ok());

    // sleep a lot to be really, really sure the server has had time to go down
    thread::sleep(time::Duration::new(10, 0));

    // A new connection correctly fails as the server is not available
    assert!(TcpStream::connect(&address).is_err());

    // HERE THE ISSUE
    // THIS ASSERT FAILS: both the write and the flush complete successfully even if the server
    // is not available. Consequently, the message is silently lost.
    assert!(send_bytes(&mut stream, "message_2\n").is_err());

    // Sending a second message after the server went down correctly fails.
    // Anyway, we expected the failure to happen on the previous message.
    assert!(send_bytes(&mut stream, "message_3\n").is_err());
}

fn send_bytes<R: Write>(send: &mut R, message: &str) -> Result<(), std::io::Error> {
    send.write_all(message.as_bytes())?;
    send.flush()?;
    println!("message sent: {:?}", message);
    Ok(())
}

fn start_server(address: String) {
    let listener = TcpListener::bind(address).unwrap();

    println!("TCP listener ready on port: {}", listener.local_addr().unwrap().port());

    let (socket, _) = listener.accept().unwrap();
    let mut reader = BufReader::new(socket);

    loop {
        let mut response = String::new();
        reader.read_line(&mut response).unwrap();

        println!("received message: {}", response);

        if response.trim().eq("shutdown") {
            break;
        }
    }

    println!("Shutdown TCP server");
}

Welcome to the world of TCP.
Your socket is most likely in the FIN_WAIT_2 state (check with netstat or ss). At that point the server considers the connection closed, but the client doesn't, so it can still write. The timeout for that state is rather high.

I don't know exactly why you don't get an error on the first try.

However, you should not take a successful send/flush at the transport level to mean that the message was actually received, interpreted, and processed correctly by the server.
Instead your protocol should require the server to acknowledge the message explicitly.

There are so many places where buffering happens, on both the client and the server side, that you cannot rely solely on successful transmission at the transport level.

@jer
I performed the same test sending a message 60 minutes after the server went down, and still the write completed successfully. How long can this timeout be?
Also, why does 'message_3' always, correctly, return an error? If the socket were in the FIN_WAIT_2 state, then all messages should return Ok; instead, only the first one returns Ok and all the following ones return an error.

@troplin
I entirely agree with you that the server needs to acknowledge the message, but what I really don't understand is why the client's write method returns Ok even hours after the server became unavailable. This makes no sense to me.

The kernel keeps a send buffer of unsent data and performs its own queueing algorithms. Calling write only pushes data into this buffer and does not necessarily emit a packet, mainly for network-utilization reasons: a few large packets have less overhead than many small ones. How this behaves for 'closed' connections, or when the remote has timed out (and even what constitutes a timeout), depends largely on the operating system, so could you name the one where you observe this behaviour?

There are system specific methods to query the current state of that buffer, for example SIOCOUTQ on Linux. In the link you will also find information on which timeouts exist, how you can query and set them etc.


@HeroicKatora
You're right. Thanks for the hints. I still don't understand the whole picture, but at least I've clarified that there's no bug in my code or in Rust itself.