Using two BytesMut to limit allocation size for network protocol?

I have a network protocol that sends messages frequently and I am processing them on a slow device, with even slower disk IO, using tokio. I want to log each received message to disk, compressed. Writing received messages to disk from the same task that processes them and generates replies sometimes spikes my latency for replies, which I have to avoid.

My idea was to read the data into a BytesMut, split() and freeze() the BytesMut to get a Bytes, send that to another thread for compression and logging, and re-use the original buffer. Unfortunately, because messages arrive so frequently, there are always a few Bytes handles still alive, so the original BytesMut can never reclaim its storage and keeps reallocating. That seems wasteful, and I wondered whether I could work around it with the following design:

I have two BytesMuts, an active one and a passive one. I read into the active one, splitting Bytes off it as I go, until its capacity is exhausted. Before reallocating, I check whether the passive BytesMut still has any outstanding handles, and if not, I swap the two so that I read into the one that can reclaim its buffer. That seems to work quite well (but of course means I am using 2x the memory), and now I'm wondering whether there are better ways to handle this situation?

If your disk can't keep up with the network then you'll eventually need to drop messages or run out of RAM.

If you have an HDD instead of an SSD then you'll only want to write one thing at a time to avoid disk seeking.

I would suggest having one logging thread that writes to disk, and using a channel to send the data (just plain Vecs) to it.
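A minimal sketch of that single logging thread, using only std::sync::mpsc. The name spawn_logger and the 4-byte length prefix are my own choices for illustration, not anything your protocol requires, and compression is left out:

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Spawns one logger thread that owns the writer and appends every received
/// message to it, one at a time. Joining the handle gives the writer back.
/// (Hypothetical helper, not part of any crate.)
pub fn spawn_logger<W>(
    mut sink: W,
) -> (mpsc::Sender<Vec<u8>>, thread::JoinHandle<std::io::Result<W>>)
where
    W: Write + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let handle = thread::spawn(move || -> std::io::Result<W> {
        // The loop ends once every Sender has been dropped.
        for msg in rx {
            // Assumed framing: little-endian length prefix, then the payload,
            // so the log can be split back into messages later.
            sink.write_all(&(msg.len() as u32).to_le_bytes())?;
            sink.write_all(&msg)?;
        }
        sink.flush()?;
        Ok(sink)
    });
    (tx, handle)
}
```

Sending on this channel never blocks, so it is safe to call from the task that generates replies, but the channel is unbounded: if the disk falls behind, the queue grows in memory.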

To get fancier you can cap memory use by having the logger return each Vec via a second channel, which you fill up front with N Vecs preallocated to the largest message size you care about. Timeouts on the channel calls then tell you when the disk is overloaded, at which point you can either stop or drop messages.
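Roughly, the recycling scheme could look like the sketch below. BufferPool, acquire, submit and spawn_pooled_logger are hypothetical names, the logger writes raw bytes with no framing or compression, and I'm assuming std channels rather than tokio ones:

```rust
use std::io::Write;
use std::sync::mpsc::{self, Receiver, SyncSender};
use std::thread;
use std::time::Duration;

/// Capture-side handle: empty buffers arrive on `free`, filled ones leave on `full`.
pub struct BufferPool {
    free: Receiver<Vec<u8>>,
    full: SyncSender<Vec<u8>>,
}

impl BufferPool {
    /// Waits up to `timeout` for a recycled buffer. `None` means the logger is
    /// behind (or has exited), so the caller can drop the message or stop.
    pub fn acquire(&self, timeout: Duration) -> Option<Vec<u8>> {
        self.free.recv_timeout(timeout).ok()
    }

    /// Hands a filled buffer to the logger. Returns false if the logger is gone.
    pub fn submit(&self, buf: Vec<u8>) -> bool {
        self.full.send(buf).is_ok()
    }
}

/// Preallocates `n` buffers of capacity `max_len` and spawns the logger thread.
pub fn spawn_pooled_logger<W>(
    mut sink: W,
    n: usize,
    max_len: usize,
) -> (BufferPool, thread::JoinHandle<std::io::Result<W>>)
where
    W: Write + Send + 'static,
{
    // Only `n` buffers ever exist, so sends on these channels never block.
    let (full_tx, full_rx) = mpsc::sync_channel::<Vec<u8>>(n);
    let (free_tx, free_rx) = mpsc::sync_channel::<Vec<u8>>(n);
    for _ in 0..n {
        free_tx
            .send(Vec::with_capacity(max_len))
            .expect("free_rx is still alive here");
    }
    let handle = thread::spawn(move || -> std::io::Result<W> {
        for mut buf in full_rx {
            sink.write_all(&buf)?;
            buf.clear(); // keeps the allocation for re-use
            if free_tx.send(buf).is_err() {
                break; // capture side dropped the pool
            }
        }
        sink.flush()?;
        Ok(sink)
    });
    (BufferPool { free: free_rx, full: full_tx }, handle)
}
```

Total memory is bounded by roughly n * max_len as long as no message outgrows its buffer. Note that recv_timeout blocks the calling thread, so inside a tokio task you would probably want a very short timeout or Receiver::try_recv instead.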

To get more out of the logging thread, have it receive data that is already prepared for logging and do nothing but write it. Do the processing elsewhere, either in the capture thread or in separate processing and compression threads.

If you have many threads you might need a more flexible MPMC channel, like the ones from crossbeam or flume.