How to know a TcpStream is closed on the other side?

Let's say I have a mut stream which is a TcpStream. How can I know that the other side has closed it? I have tried if let Err(_) = stream.write(..), but it doesn't seem to work.

Generally, when a TCP stream is shut down normally, each side will call shutdown(Shutdown::Write) on its end of the stream. When you perform such a shutdown, the other side will see this as a call to read that returns 0 bytes. The TCP stream is fully shut down once both sides have sent such a signal.

As for receiving an error when you try to write, that can only happen if the stream is shut down in an abnormal manner. Normal shutdowns happen by receiving a read of length zero.
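
In practice, that means you can detect a normal close by reading and checking for Ok(0). A minimal sketch of the idea (the helper name and buffer size here are just for illustration):

```rust
use std::io::Read;
use std::net::TcpStream;

/// Illustrative helper: blocks until the peer sends data or shuts down.
/// Returns Ok(true) if the peer closed its write half (we hit EOF).
fn peer_closed(stream: &mut TcpStream) -> std::io::Result<bool> {
    let mut buf = [0u8; 1024];
    match stream.read(&mut buf) {
        Ok(0) => Ok(true),   // read of length zero: normal shutdown by the peer
        Ok(_) => Ok(false),  // received data; the connection is still open
        Err(e) => Err(e),    // abnormal shutdown or some other I/O error
    }
}
```

Note that this discards any bytes it reads, so in a real program you would fold the Ok(0) check into your normal read loop rather than call a separate helper.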


So that means I need to set up a write or read deadline manually to know if the connection is closed? Is there any way to know it immediately? Thank you.

What do you mean? The call to read will return immediately when it is closed.

The issue, in my experience, is that one can open a TCP/IP connection between machines, and as long as data is being exchanged regularly it is soon noticed if one end or the other closes the connection or fails somehow.

BUT, when a connection is open with no traffic flowing, one can yank out the ethernet cable or otherwise break the network connection and neither end will notice. At least not for a long time.

For this reason protocols have "heartbeat" messages to check that the link is still working. Perhaps TCP/IP has some kind of "keep alive" as well; I don't know.
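
Checking quickly: TCP does have a keep-alive mechanism (the SO_KEEPALIVE socket option), although the default idle time is typically hours. std's TcpStream doesn't expose it directly; here's a rough sketch using the socket2 crate. The timing values are arbitrary examples, and the exact crate API is my assumption rather than something from this thread:

```rust
use std::net::TcpStream;
use std::time::Duration;

use socket2::{Socket, TcpKeepalive};

/// Sketch: enable TCP keep-alive on an existing std TcpStream.
fn enable_keepalive(stream: TcpStream) -> std::io::Result<TcpStream> {
    // Wrap the std stream in a socket2::Socket so we can reach socket options.
    let socket = Socket::from(stream);
    let keepalive = TcpKeepalive::new()
        .with_time(Duration::from_secs(30))       // idle time before probing starts
        .with_interval(Duration::from_secs(10));  // gap between probes (not on every platform)
    socket.set_tcp_keepalive(&keepalive)?;
    // Hand the configured socket back as an ordinary TcpStream.
    Ok(socket.into())
}
```

With something like this enabled, an unanswered run of probes makes the kernel reset the connection, so a later read or write fails instead of hanging indefinitely.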

The Ethernet standard specifies that the network card shall raise a LINK_STATUS_CHANGE event when the cable is unplugged or plugged in. Unfortunately, most operating systems ignore this event. TCP deals with this fact by having a hard-wired timeout. Since the TCP standard doesn't specify a timeout, each implementation has its own. In one system I used in ancient times, that value was 20 minutes. (We patched the kernel to get a more reasonable value.) In the one I'm using today, it's 45 seconds.

Heartbeats are a common approach to determining liveness, and they work well for small systems. However, problems show up at scale. Network delays and slow processors can trigger timeouts when the other side is really alive, and the heartbeat messages can interfere with application traffic.
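
For a small system the heartbeat idea can be as simple as one side writing a byte on a timer and the other side treating a silent period as a dead peer. A sketch using blocking std I/O, with a made-up message format and example timings:

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

const HEARTBEAT: u8 = 0x00; // placeholder heartbeat byte for illustration

/// Sender side: write a heartbeat byte every `period`; a failed write
/// means the connection is broken (or will soon be detected as such).
fn heartbeat_sender(mut stream: TcpStream, period: Duration) -> std::io::Result<()> {
    loop {
        stream.write_all(&[HEARTBEAT])?;
        std::thread::sleep(period);
    }
}

/// Receiver side: if nothing (data or heartbeat) arrives within `deadline`,
/// presume the peer is dead. Ok(0) still means a normal shutdown.
fn heartbeat_receiver(mut stream: TcpStream, deadline: Duration) -> std::io::Result<()> {
    stream.set_read_timeout(Some(deadline))?;
    let mut buf = [0u8; 1024];
    loop {
        match stream.read(&mut buf) {
            Ok(0) => return Ok(()), // peer shut down cleanly
            Ok(_) => {}             // data or heartbeat arrived; still alive
            Err(e) if e.kind() == std::io::ErrorKind::WouldBlock
                   || e.kind() == std::io::ErrorKind::TimedOut => {
                // No traffic within the deadline: treat the peer as gone.
                return Err(e);
            }
            Err(e) => return Err(e),
        }
    }
}
```

The double error-kind check is there because a read timeout surfaces as WouldBlock on Unix and TimedOut on Windows.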

This brings back memories. What happens when the disconnect is not on your local ethernet cable? Far away behind some switches, routers or a wireless signal blockage?

This also brings back memories. Except wherever the timeout was, it was much longer.

I guess there is no way to standardize this "liveness" problem and its timeouts in a way that works well in all scenarios.

There are actually two timeouts. In the system I'm using today, the machine responds in about 4-5 seconds when the cable is unplugged. If we unplug a machine on the other side of a switch, it takes 45 seconds to notice that the destination is unreachable. I guess that's to give the switch time to find another path. If the two machines are connected directly to each other, they still wait 4-5 seconds when the cable is unplugged.

Liveness in asynchronous distributed systems is an unsolved problem. In fact, Fischer, Lynch, and Paterson proved that the failure of a single node can prevent reaching consensus. Fortunately, the heuristics we use work well enough.

Hmm...

I thought Leslie Lamport showed that a distributed system of four nodes could tolerate one Byzantine faulty node/link: https://people.eecs.berkeley.edu/~luca/cs174/byzantine.pdf

And in general, a system that tolerates n Byzantine faulty nodes/links can be built from 3*n + 1 fully connected nodes.

[Moderator note: This is getting off-topic. Please be careful about derailing concrete requests for help into more abstract or meandering discussions; it makes the forum less approachable to new users with simple questions.]

That's a protocol error, not a potentially dead node. The problem is that without a reliable failure detector, you don't know how long to wait for a response. If you don't wait long enough you get the double leader problem.

@mbrubeck: Thanks for noticing. This will be my last post on this topic.

There's some related discussion of checking for closed TCP streams in Rust here:

I appreciate that it's time to wind this up.

So my last comment here is: that is not the way I read the Byzantine Generals paper. It specifically talks of dead/deceptive nodes, and how they are equivalent to dead/deceptive links.

I had never learned or used TCP networking, so I thought that after the other side closed, trying to read or write would cause an error. Later I found out you are correct. Thank you again.
