Design problem: minimizing allocation/copying while parsing for a Tokio Decoder

djc · June 25, 2017, 7:29pm

I'm working on an IMAP implementation based on Tokio. However, I cannot figure out a good solution to this problem. I have a tokio_io::codec::Decoder implementation, like this:

impl Decoder for ImapCodec {
    type Item = ResponseData;
    type Error = io::Error;
    fn decode(&mut self, buf: &mut BytesMut)
             -> Result<Option<Self::Item>, io::Error> {
        ...
    }
}

I also have a parser that takes a byte slice and returns a Response<'a> representation that keeps a bunch of pointers into the byte slice, so that I can minimize allocation and copying:

fn parse<'a>(msg: &'a [u8]) -> Response<'a> {
    ...
}

The parser needs to run before I can figure out how long the underlying message is. After the parser is done, I'd like to put the raw contents and the Response representation together into an object that I can return to higher layers of the protocol implementation.

The BytesMut can be split, but I cannot explain to the compiler that the bytes that are kept alive by the new BytesMut are actually the same as the bytes in the old BytesMut that the Response is pointing to. And of course I cannot split the BytesMut before parsing because I don't yet know which part of the BytesMut I need. After parsing I could split and then parse again, but that seems inefficient/wasteful.

Any suggestions?

vitalyd · June 26, 2017, 2:46am

Not sure I fully grok this. BytesMut is essentially a ref counted pointer into some storage. It doesn't have any lifetime parameters associated with it. So, it's unclear (to me at least) what the "explanation" is that you're trying to achieve. Could you expand on that a bit?

djc · June 26, 2017, 8:00am

No, but the Response struct I have is lifetime-bound to the BytesMut because it has pointers into the BytesMut's storage. If I then split the BytesMut, rustc doesn't understand that the old BytesMut's storage is the same as the new BytesMut's storage, so it thinks that the Response should no longer be allowed to live.

Does that make more sense?
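
To make the conflict concrete, here is a minimal sketch that uses only the standard library, with a Vec<u8> standing in for BytesMut. The Response struct, the toy parse function and decode_with_copy are hypothetical stand-ins, not the code from the thread. The sketch shows where the borrow checker objects, and the copy that the straightforward workaround forces.

```rust
// Hypothetical borrowed response: it points into the input buffer.
struct Response<'a> {
    tag: &'a [u8],
}

// Toy parser (not the IMAP grammar): a message runs up to and including the
// first '\n' (or to the end of the input), and the tag is everything before
// the first space. Returns the borrowed response and the message length.
fn parse<'a>(msg: &'a [u8]) -> (Response<'a>, usize) {
    let len = msg
        .iter()
        .position(|&b| b == b'\n')
        .map_or(msg.len(), |i| i + 1);
    let end = msg[..len].iter().position(|&b| b == b' ').unwrap_or(len);
    (Response { tag: &msg[..end] }, len)
}

// Returns (raw message bytes, owned copy of the tag) and leaves the
// remaining bytes in `buf`.
fn decode_with_copy(buf: &mut Vec<u8>) -> (Vec<u8>, Vec<u8>) {
    let (resp, len) = parse(&buf[..]);
    // Splitting here, while `resp` is still used below, is rejected with
    // E0502: `*buf` cannot be borrowed mutably while `resp` borrows it.
    // let rest = buf.split_off(len);
    let tag = resp.tag.to_vec(); // the copy; this is the last use of `resp`
    let rest = buf.split_off(len); // allowed now that the borrow has ended
    let raw = std::mem::replace(buf, rest);
    (raw, tag)
}
```

The borrowed Response cannot outlive the point where the buffer is split, so the only way to keep its data in this shape is to copy it out first, which is the allocation the original question is trying to avoid.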

vitalyd · June 26, 2017, 10:44am

Does the Response need references into the BytesMut storage? I'd imagine storing just the BytesMut value would be easier.

djc · June 26, 2017, 11:44am

The Response structure is a somewhat-nested enum which can have several different pointers into the storage. And even if I'd store the BytesMut directly, it wouldn't solve my problem as I'd still want to split the BytesMut after parsing.

vitalyd · June 26, 2017, 1:35pm

Is it possible for a parser to return just the length of the message and whatever other positional info needed for a real parse (and split) to occur later? Sorry, it's a bit hard to put the entire picture together based on the info you've provided. It might help if you'd include the full code in question, or some minimal concrete code that demonstrates the issues.
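
One way to read this suggestion is sketched below, assuming a toy "tag SP text CRLF" line format rather than the real IMAP grammar. The parser returns offsets (ranges) instead of borrowed slices, so nothing borrows the buffer by the time it is split. Offsets, Message, parse_offsets and decode are hypothetical names, and a Vec<u8> stands in for BytesMut so the sketch needs only the standard library.

```rust
use std::ops::Range;

// Hypothetical positional result: plain offsets, no borrow of the buffer.
#[derive(Debug, PartialEq)]
struct Offsets {
    len: usize,         // total message length, including the CRLF
    tag: Range<usize>,  // position of the tag within the message
    text: Range<usize>, // position of the rest of the line
}

// Toy parser for "tag SP text CRLF". Returns None until a full line arrived.
fn parse_offsets(buf: &[u8]) -> Option<Offsets> {
    let end = buf.windows(2).position(|w| w == b"\r\n")?;
    let sp = buf[..end].iter().position(|&b| b == b' ').unwrap_or(end);
    let text_start = (sp + 1).min(end);
    Some(Offsets { len: end + 2, tag: 0..sp, text: text_start..end })
}

// Owned message: the raw bytes plus the offsets that describe them.
struct Message {
    raw: Vec<u8>,
    offsets: Offsets,
}

impl Message {
    // Slices are re-derived on demand, so no lifetime is stored in Message.
    fn tag(&self) -> &[u8] {
        &self.raw[self.offsets.tag.clone()]
    }
    fn text(&self) -> &[u8] {
        &self.raw[self.offsets.text.clone()]
    }
}

// Stand-in for Decoder::decode, with Vec<u8> in place of BytesMut.
fn decode(buf: &mut Vec<u8>) -> Option<Message> {
    let offsets = parse_offsets(&buf[..])?; // the borrow of `buf` ends here
    let rest = buf.split_off(offsets.len);  // keep the unparsed tail
    let raw = std::mem::replace(buf, rest); // take ownership of the message
    Some(Message { raw, offsets })
}
```

Vec::split_off copies the tail, which a real codec would want to avoid. With BytesMut, split_to(len) would play the same role without copying the bytes. The trade-off is that every field access becomes an index operation into the raw bytes rather than a stored reference.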

djc · June 27, 2017, 7:15am

You can see an attempt at getting this working here:

https://github.com/djc/tokio-imap/commit/7e0fe600d92525bbd1a6f7f9f1cf91a8b95428a2
