Send WebSocket Message Without Creating Vec

I'm trying to tunnel some binary traffic over WebSockets. The traffic will come over a TCP socket and I will be shovelling it into a WebSocket in a loop using a single buffer for read and write as shown in the pseudocode below:

declare a buf
loop {
    tcp_socket.read(buf,...)
    web_socket_stream.send(buf, ....)
}

I was hoping to use tokio-tungstenite for this. However, the binary() method on a tungstenite Message takes an Into<Vec<u8>>, which looks terribly inefficient because it would allocate a new Vec on every send. If so, this would be a deal-breaker for me performance-wise. Or is there something obvious I'm missing? Please help.

This assertion is not universally true. If you supply a Vec<u8>, the conversion is the identity, so no allocation occurs. If you pass a &[u8], then yes, it allocates. If you are ever unsure, you can check in the rustdoc for Vec which From<T> implementations exist for Vec<u8> (these are what provide Into<Vec<u8>>) and read their source.
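To make the difference concrete, here is a minimal std-only sketch that compares heap pointers before and after the conversion. The function names are made up for illustration; nothing here touches tungstenite itself.

```rust
/// Converts an owned Vec via `Into` and reports whether the heap buffer was reused.
fn vec_into_reuses_buffer() -> bool {
    let v: Vec<u8> = vec![1, 2, 3, 4];
    let before = v.as_ptr();
    // Identity conversion (`impl<T> From<T> for T`): the Vec is moved, not copied.
    let converted: Vec<u8> = v.into();
    before == converted.as_ptr()
}

/// Converts a borrowed slice via `Into` and reports whether the result is a fresh copy.
fn slice_into_copies() -> bool {
    let buf = [1u8, 2, 3, 4];
    let slice: &[u8] = &buf[..3];
    // `impl From<&[T]> for Vec<T>` allocates a new buffer and clones the elements.
    let converted: Vec<u8> = slice.into();
    converted.as_ptr() != slice.as_ptr() && converted == [1, 2, 3]
}

fn main() {
    println!("Vec -> Vec reuses buffer: {}", vec_into_reuses_buffer());
    println!("&[u8] -> Vec copies data: {}", slice_into_copies());
}
```

The same reasoning should apply to any API that accepts Into<Vec<u8>>: handing over an owned Vec is a move, while handing over a slice costs one allocation plus a copy of the slice's length.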

Thanks @nologik.

You're right that Into will be a no-op for Vec<u8>, but I can't use a Vec<u8> because the size of the data read from the TCP socket will vary, and tungstenite's write_message() API does not take an argument that specifies how many bytes to write. It takes only a Message, which contains only a Vec<u8>. That makes Vec<u8> a non-ideal choice, unfortunately.
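For reference, the pattern under discussion can be sketched with the standard library alone: one reusable read buffer, and a fresh Vec holding only the bytes actually read for each message. FakeSink and send_binary are hypothetical stand-ins for a WebSocket sink that takes ownership of its payload, and a Cursor plays the role of the TCP socket; this is not tungstenite's API.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical stand-in for a WebSocket sink that takes ownership of each payload.
struct FakeSink {
    frames: Vec<Vec<u8>>,
}

impl FakeSink {
    fn send_binary(&mut self, payload: Vec<u8>) {
        self.frames.push(payload);
    }
}

/// Reads from `src` into one reusable buffer and forwards only the bytes actually read.
fn pump<R: Read>(src: &mut R, sink: &mut FakeSink, buf_size: usize) -> io::Result<usize> {
    let mut buf = vec![0u8; buf_size]; // allocated once, reused for every read
    let mut total = 0;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break; // end of stream
        }
        // One allocation and one copy of exactly `n` bytes per message.
        sink.send_binary(buf[..n].to_vec());
        total += n;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    let mut src = Cursor::new(vec![7u8; 10]);
    let mut sink = FakeSink { frames: Vec::new() };
    let total = pump(&mut src, &mut sink, 4)?;
    println!("forwarded {} bytes in {} frames", total, sink.frames.len());
    Ok(())
}
```

The variable read size is handled by slicing the buffer to `n` before copying, so the per-message cost is one allocation sized to the data, not to the whole buffer. Whether that cost matters for a tunnel is exactly what measurement would have to show.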

It's good that you're taking performance to this level of consideration, but the cost of calling the underlying allocator is much lower than you might think (on the macroscopic level). For 99% of use cases, passing your data to binary() will be good enough. Plus, the runtime needs the data heap-allocated so it can process it on its own terms without lifetime limitations (in this case, when the async executor determines it's time to poll the future associated with sending the buffer to the TCP/TLS stream).

It's a common beginner phase to be OCD about optimization. Once you get to testing the program in real-world settings, you will realize most of that OCD tuning didn't make much of a difference. Being OCD does teach one a lot about the inner workings though, so I won't discourage it; just realize you will tend to overestimate the degree of performance loss.

What you're saying makes sense of course @nologik. I'm in fact pressing ahead with writing the code with tungstenite so that I can get an actual sense of performance. However, with infrastructural code like a network tunnel there is very little else going on in the program other than the data copy. Therefore if you have to allocate on every copy, it is likely to make a significant difference in my experience.

Anyway, it looks like I don't have any alternative with tungstenite, so I'll just try it, measure the performance, and see if it's acceptable. Otherwise, I could switch to the older library websockets, because it does not seem to have the allocation problem.
