(Not a hundred percent sure whether this is a misunderstanding of `tokio::net::TcpStream`, of TCP in general, or of the `tc` utility.)
I'm implementing a two-party protocol using `tokio::net::TcpStream` and `tokio_serde::SymmetricallyFramed`. These are used to build a `Transport` struct that implements `SinkExt` and `StreamExt` for some `serde` item. I've called `set_nodelay(true)` on the `TcpStream`.
At a high level, the protocol looks like this:
```rust
let mut response = initial_value;
loop {
    // fast, and messages are small (usually less than 100 bytes)
    let msg = compute_message(&response);
    transport.send(msg).await.unwrap();
    // note: no `let` here, otherwise the outer `response` would be
    // shadowed and never updated (error handling omitted)
    response = transport.next().await.unwrap();
    if done_condition {
        break;
    }
}
```
where `transport` is the struct that can send and receive message structs by serializing/deserializing them with `bincode` and sending them over the `TcpStream`.
Now I want to benchmark this protocol using the Linux traffic control (`tc`) utility. I have two hosts, and on each I execute the following:
```sh
#!/bin/sh
/sbin/tc qdisc del dev enp23s0 root
/sbin/tc qdisc add dev enp23s0 root handle 1:0 htb default 10
/sbin/tc class add dev enp23s0 parent 1:0 classid 1:10 htb rate 100Mbit
/sbin/tc qdisc add dev enp23s0 parent 1:10 handle 10:0 netem delay 50ms 3ms 25% distribution normal
```
which adds a normally distributed delay of 50 ms ± 3 ms (25% correlated) to all outgoing packets on both hosts.
As expected, the ping output shows a round-trip time of approx. 100 ms:

```
> ping host_b
64 bytes from <omitted>: icmp_seq=1 ttl=64 time=98.9 ms
64 bytes from <omitted>: icmp_seq=2 ttl=64 time=101 ms
64 bytes from <omitted>: icmp_seq=3 ttl=64 time=96.1 ms
64 bytes from <omitted>: icmp_seq=4 ttl=64 time=98.6 ms
64 bytes from <omitted>: icmp_seq=5 ttl=64 time=94.8 ms
```
So now to the confusing part: I would expect my protocol to finish in `#loop iterations * RTT / 2` time. Both parties should be able to send their messages at the same time and receive the other party's message approx. 50 ms later, so each loop iteration should take `RTT / 2`.
However, this is not what I'm observing. When measuring the time of the `send` and `next` operations, `send` completes almost immediately, while `next` takes around 100 ms. So each loop iteration seems to take a full round trip instead of half of one.
I guess this is not an issue with my usage of tokio's `TcpStream`, but rather with my understanding of TCP or my usage of `tc`, but I can't figure it out. I hope someone can help me.
Edit:
Looking at the logs, I noticed something weird:
Logs - Party 0
```
{"timestamp":"2022-03-24T19:16:05.811818Z","fields":{"message":"Layer send","layer_time":"0"}}
{"timestamp":"2022-03-24T19:16:05.811890Z","fields":{"message":"Layer recv","layer_time":"0"}}
{"timestamp":"2022-03-24T19:16:05.811967Z","fields":{"message":"Layer send","layer_time":"0"}}
{"timestamp":"2022-03-24T19:16:05.910809Z","fields":{"message":"Layer recv","layer_time":"98"}}
{"timestamp":"2022-03-24T19:16:05.910968Z","fields":{"message":"Layer send","layer_time":"0"}}
{"timestamp":"2022-03-24T19:16:06.008920Z","fields":{"message":"Layer recv","layer_time":"97"}}
```
and it continues in this alternating `0` / `~100` ms pattern.
Logs - Party 1
```
{"timestamp":"2022-03-24T19:16:05.760583Z","fields":{"message":"Layer send","layer_time":"0"}}
{"timestamp":"2022-03-24T19:16:05.858636Z","fields":{"message":"Layer recv","layer_time":"98"}}
{"timestamp":"2022-03-24T19:16:05.858804Z","fields":{"message":"Layer send","layer_time":"0"}}
{"timestamp":"2022-03-24T19:16:05.952837Z","fields":{"message":"Layer recv","layer_time":"94"}}
{"timestamp":"2022-03-24T19:16:05.952967Z","fields":{"message":"Layer send","layer_time":"0"}}
{"timestamp":"2022-03-24T19:16:06.052081Z","fields":{"message":"Layer recv","layer_time":"99"}}
```
and it continues in this alternating `0` / `~100` ms pattern.
It seems like P1 sends its first message significantly before P0. At the beginning, P0 sends its message, immediately receives P1's message, and directly sends the next message of the second loop iteration. What confuses me is that P1 receives the second message nearly 100 ms after the first, although both were sent at virtually the same time!
This would maybe make sense if there were message reordering involved, but from what I can tell, that should not be the case here.