Associated type parameters with lifetimes and `self`

Hi everyone! I have a problem with GATs and lifetimes and hope to get some help :slight_smile:
I'm trying to implement a trait to read a stream in fixed-size chunks, as follows:

use std::io::{self, Read, Seek};

pub trait ChunkRead {
    type R<'s>: Read + Seek where Self: 's;

    fn read_chunk(&mut self, chunk_number: u64) -> io::Result<Self::R<'_>>;

    fn chunk_size(&self) -> u64;
}
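
As a minimal sketch of what an implementor might look like, the GAT lets the returned reader borrow from `self`. The `MemChunks` type and the `read_chunk_to_vec` helper are hypothetical names used only for illustration, and the sketch assumes a non-zero chunk size:

```rust
use std::io::{self, Cursor, ErrorKind, Read, Seek};

pub trait ChunkRead {
    type R<'s>: Read + Seek where Self: 's;

    fn read_chunk(&mut self, chunk_number: u64) -> io::Result<Self::R<'_>>;

    fn chunk_size(&self) -> u64;
}

/// Hypothetical in-memory chunk source (not from the original post).
pub struct MemChunks {
    pub data: Vec<u8>,
    pub chunk_size: u64, // assumed to be non-zero
}

impl ChunkRead for MemChunks {
    // The reader borrows from `self`; the lifetime parameter on `R` expresses that.
    type R<'s> = Cursor<&'s [u8]> where Self: 's;

    fn read_chunk(&mut self, chunk_number: u64) -> io::Result<Self::R<'_>> {
        let size = self.chunk_size as usize;
        let start = (chunk_number as usize).saturating_mul(size);
        if start >= self.data.len() {
            // Same convention as the post: a missing chunk is reported as EOF.
            return Err(ErrorKind::UnexpectedEof.into());
        }
        let end = start.saturating_add(size).min(self.data.len());
        Ok(Cursor::new(&self.data[start..end]))
    }

    fn chunk_size(&self) -> u64 {
        self.chunk_size
    }
}

/// Hypothetical helper: read one whole chunk into a `Vec`.
pub fn read_chunk_to_vec(chunks: &mut MemChunks, n: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    chunks.read_chunk(n)?.read_to_end(&mut out)?;
    Ok(out)
}
```

Each call borrows `chunks` only for as long as the returned reader is alive, so one chunk at a time can be read this way.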

And then write abstractions to go stream -> chunked stream and chunked stream -> stream

For the second case I currently have the following:

pub struct ChunkReader<'r, C, R> {
    chunks: &'r mut C,
    current_reader: Option<R>,
    current_chunk: u64,
    current_chunk_read: usize,
}

impl<'r, R, C> ChunkReader<'r, C, R> {
    fn new(chunks: &'r mut C) -> Self {
        Self {
            chunks,
            current_reader: None,
            current_chunk: 0,
            current_chunk_read: 0,
        }
    }
}

impl<'s, 'r: 's, R: Read + Seek, C: ChunkRead<R<'s> = R> + 's> Read for ChunkReader<'r, C, R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let reader = match self.current_reader.as_mut() {
            None => {
                self.current_reader = Some(self.chunks.read_chunk(self.current_chunk)?);
                self.current_reader.as_mut().unwrap()
            }
            Some(_) => {
                if self.current_chunk_read == self.chunks.chunk_size() as usize {
                    let new_reader = self.chunks.read_chunk(self.current_chunk + 1);

                    match new_reader {
                        Ok(new_reader) => {
                            self.current_chunk += 1;
                            self.current_chunk_read = 0;
                            self.current_reader = Some(new_reader);
                        }
                        Err(e) => {
                            return if e.kind() == ErrorKind::UnexpectedEof {
                                Ok(0)
                            } else {
                                Err(e)
                            }
                        }
                    }
                }

                self.current_reader.as_mut().unwrap()
            }
        };

        let read = reader.read(buf)?;
        self.current_chunk_read += read;

        Ok(read)
    }
}

(I'm not pasting the error here because it is fairly large; here's a link to the playground)

In particular, I don't quite understand why the compiler complains about lifetime '1, which is tied to self, when self.chunks should have lifetime 'r, right?

For reference, here's a previous version of the same trait without GATs; however, I found it necessary to introduce them when implementing the chunked stream -> stream use case.

Thank you for your help!

fn read_chunk<'r>(&mut self, chunk_number: u64) -> io::Result<Self::R<'r>>;

I can't comment on whether anything else is wrong (I haven't spent time on it), but this is a fix that gets it to compile.

Thanks!
This way it compiles, but the 'r lifetime is "arbitrary", so I cannot return anything borrowed from self in other implementations, which I'd like to do (something like this).

Is there any way to make this work?
I tried to add another lifetime 's like

fn read_chunk<'r, 's: 'r>(&'s mut self, chunk_number: u64) -> io::Result<Self::R<'r>>;

But I got an error similar to the one from the implementation without the additional lifetimes (this)

self.current_reader = Some(self.chunks.read_chunk(self.current_chunk)?);

This can only return a <C as ChunkRead>::R<'this_or_less>, where the read method was called with &'this mut self. That's not a lifetime (and thus not a type) nameable by Self, so it's impossible for R = <C as ChunkRead>::R<'this_or_less>.

(Why? Because you can't reborrow a &'long mut C through a &'short mut ChunkReader<'long, ..>.)
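
A small sketch of that rule, using a hypothetical `Holder` type that is not from the thread: going through `&'short mut Holder<'long>` only ever yields a `'short` reborrow, while taking the holder by value gives the full `'long` borrow back.

```rust
/// Hypothetical stand-in for `ChunkReader`: it holds a long-lived `&mut`.
pub struct Holder<'long> {
    pub inner: &'long mut Vec<u8>,
}

/// Through `&'short mut Holder`, only a `'short` reborrow is available.
pub fn short<'short, 'long>(h: &'short mut Holder<'long>) -> &'short mut Vec<u8> {
    &mut *h.inner
}

// Asking for `'long` through the short borrow is rejected by the borrow checker:
//
// pub fn long<'short, 'long>(h: &'short mut Holder<'long>) -> &'long mut Vec<u8> {
//     &mut *h.inner
// }

/// Taking the holder by value moves the `&'long mut` out, so `'long` is fine.
pub fn by_value<'long>(h: Holder<'long>) -> &'long mut Vec<u8> {
    h.inner
}
```

This is the same shape as calling `self.chunks.read_chunk(..)` from inside `read(&mut self, ..)`: the reborrow cannot outlive the `&mut self` it went through.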

If it's possible to use shared references, that may work (as you can copy the &'r C out from beneath the &'this mut self). Otherwise you'd probably need some sort of borrow-splitting mechanism.
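
A rough sketch of the shared-reference route, under the assumption that the trait can take `&self`. `SharedChunkRead`, `SharedChunkReader` and `MemChunks` are hypothetical names, and the `Seek` bound is dropped to keep the example short:

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical `&self` variant of the trait (an assumption, not the original).
pub trait SharedChunkRead {
    type R<'s>: Read where Self: 's;
    fn read_chunk(&self, chunk_number: u64) -> io::Result<Self::R<'_>>;
}

pub struct SharedChunkReader<'r, C: SharedChunkRead> {
    chunks: &'r C,
    current_reader: Option<C::R<'r>>,
    next_chunk: u64,
}

impl<'r, C: SharedChunkRead> SharedChunkReader<'r, C> {
    pub fn new(chunks: &'r C) -> Self {
        Self { chunks, current_reader: None, next_chunk: 0 }
    }
}

impl<'r, C: SharedChunkRead> Read for SharedChunkReader<'r, C> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        loop {
            if self.current_reader.is_none() {
                // `&'r C` is `Copy`: this copies the reference out from beneath
                // `&mut self`, keeping the full `'r` lifetime.
                let chunks: &'r C = self.chunks;
                match chunks.read_chunk(self.next_chunk) {
                    Ok(reader) => {
                        self.current_reader = Some(reader);
                        self.next_chunk += 1;
                    }
                    Err(e) if e.kind() == ErrorKind::UnexpectedEof => return Ok(0),
                    Err(e) => return Err(e),
                }
            }
            let n = self.current_reader.as_mut().unwrap().read(buf)?;
            if n > 0 {
                return Ok(n);
            }
            // The current chunk is exhausted; open the next one on the next pass.
            self.current_reader = None;
        }
    }
}

/// Hypothetical in-memory source; `chunk_size` is assumed to be non-zero.
pub struct MemChunks {
    pub data: Vec<u8>,
    pub chunk_size: usize,
}

impl SharedChunkRead for MemChunks {
    type R<'s> = &'s [u8] where Self: 's;

    fn read_chunk(&self, chunk_number: u64) -> io::Result<Self::R<'_>> {
        let start = (chunk_number as usize).saturating_mul(self.chunk_size);
        if start >= self.data.len() {
            return Err(ErrorKind::UnexpectedEof.into());
        }
        let end = start.saturating_add(self.chunk_size).min(self.data.len());
        Ok(&self.data[start..end])
    }
}
```

Unlike the `&mut` version, both fields can coexist here because shared borrows are allowed to overlap.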

After some pondering, I have another suggestion. Instead of having both &'r mut C and C::R<'r> fields at once, perhaps there's a way to have either one or the other.

#[derive(Default)]
enum ChunkState<'r, C: ChunkRead> {
    Pending(&'r mut C),
    Current(C::R<'r>),
    #[default]
    Poisoned,
}

In your implementations, the Poisoned variant allows you to temporarily remove the state from beneath a &mut.

pub struct ChunkReader<'r, C: ChunkRead> {
    state: ChunkState<'r, C>,
    current_chunk: u64,
    current_chunk_read: usize,
}

// ...
fn foo<C: ChunkRead>(cr: &mut ChunkReader<'_, C>) {
    let state = std::mem::take(&mut cr.state);
}

(Alternatively you could remove the Poisoned variant and have an Option<ChunkState<..>>).

Then you could match on state, do what you need to do, and replace state with some new non-Poisoned variant.

The ability to temporarily remove the ChunkState from behind &mut self is important here -- that's how you avoid trying to get a &'long mut through a &'short mut self. If you panic or the like, you'll leave your state field in the Poisoned variant (by unwinding before you can replace the state field).

The final missing piece of the puzzle is you need some way to go from C::R<'r> back to &'r mut C in order to go from Current to Pending.
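
Here is one way the pieces might fit together. It assumes a hypothetical `release` function on the trait for the `Current` to `Pending` step, uses `mem::replace` instead of `mem::take` so no `Default` impl is needed, and drops the `Seek` bound for brevity. `MemChunks` and `MemChunk` are made-up types that exist only to exercise the sketch.

```rust
use std::io::{self, ErrorKind, Read};
use std::mem;

pub trait ChunkRead {
    type R<'s>: Read where Self: 's;
    fn read_chunk(&mut self, chunk_number: u64) -> io::Result<Self::R<'_>>;
    /// Hypothetical addition: consume the reader and hand the borrow back.
    fn release<'s>(reader: Self::R<'s>) -> &'s mut Self where Self: 's;
}

enum ChunkState<'r, C: ChunkRead> {
    Pending(&'r mut C),
    Current(C::R<'r>),
    Poisoned,
}

pub struct ChunkReader<'r, C: ChunkRead> {
    state: ChunkState<'r, C>,
    next_chunk: u64,
}

impl<'r, C: ChunkRead> ChunkReader<'r, C> {
    pub fn new(chunks: &'r mut C) -> Self {
        Self { state: ChunkState::Pending(chunks), next_chunk: 0 }
    }
}

impl<'r, C: ChunkRead> Read for ChunkReader<'r, C> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() { return Ok(0); }
        loop {
            // Move the state out so the full `'r` borrow is owned here.
            match mem::replace(&mut self.state, ChunkState::Poisoned) {
                ChunkState::Pending(chunks) => match chunks.read_chunk(self.next_chunk) {
                    Ok(reader) => {
                        self.next_chunk += 1;
                        self.state = ChunkState::Current(reader);
                    }
                    // `chunks` was spent on the call, so the state stays `Poisoned`.
                    Err(e) if e.kind() == ErrorKind::UnexpectedEof => return Ok(0),
                    Err(e) => return Err(e),
                },
                ChunkState::Current(mut reader) => {
                    let n = reader.read(buf)?;
                    if n > 0 {
                        self.state = ChunkState::Current(reader);
                        return Ok(n);
                    }
                    // Chunk exhausted: trade the reader back for `&'r mut C`.
                    self.state = ChunkState::Pending(C::release(reader));
                }
                // In this sketch `Poisoned` doubles as the terminal state.
                ChunkState::Poisoned => return Ok(0),
            }
        }
    }
}

/// Hypothetical in-memory source; `chunk_size` is assumed to be non-zero.
pub struct MemChunks { pub data: Vec<u8>, pub chunk_size: usize }

/// Hypothetical reader that keeps the exclusive borrow so it can return it.
pub struct MemChunk<'s> { owner: &'s mut MemChunks, pos: usize, end: usize }

impl Read for MemChunk<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let mut rest = &self.owner.data[self.pos..self.end];
        let n = rest.read(buf)?;
        self.pos += n;
        Ok(n)
    }
}

impl ChunkRead for MemChunks {
    type R<'s> = MemChunk<'s> where Self: 's;

    fn read_chunk(&mut self, chunk_number: u64) -> io::Result<Self::R<'_>> {
        let start = (chunk_number as usize).saturating_mul(self.chunk_size);
        if start >= self.data.len() {
            return Err(ErrorKind::UnexpectedEof.into());
        }
        let end = start.saturating_add(self.chunk_size).min(self.data.len());
        Ok(MemChunk { owner: self, pos: start, end })
    }

    fn release<'s>(reader: Self::R<'s>) -> &'s mut Self where Self: 's {
        reader.owner
    }
}
```

One consequence worth noting: when `read_chunk` fails, the `&'r mut C` has already been spent on the call, so it cannot be put back and the reader stays in `Poisoned`.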


And having written that up, another possibility occurred to me: Instead you could have a way to go from one C::R<'r> to the subsequent C::R<'r> directly... in which case you could perhaps get rid of the Pending case altogether and just have an Option<C::R<'r>> in ChunkReader<..>.
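
A sketch of that variant, assuming a hypothetical `advance` function on the trait that consumes one reader and opens the chunk after it. The `Seek` bound is again omitted, and `MemChunks` and `MemChunk` are made-up types for illustration.

```rust
use std::io::{self, ErrorKind, Read};

pub trait ChunkRead {
    type R<'s>: Read where Self: 's;
    fn read_chunk(&mut self, chunk_number: u64) -> io::Result<Self::R<'_>>;
    /// Hypothetical addition: consume a reader and open the chunk after it,
    /// returning `None` once there are no more chunks.
    fn advance<'s>(reader: Self::R<'s>) -> io::Result<Option<Self::R<'s>>> where Self: 's;
}

pub struct ChunkReader<'r, C: ChunkRead + 'r> {
    current: Option<C::R<'r>>,
}

impl<'r, C: ChunkRead + 'r> ChunkReader<'r, C> {
    /// Opens the first chunk up front, spending the `&'r mut C` exactly once.
    pub fn new(chunks: &'r mut C) -> io::Result<Self> {
        match chunks.read_chunk(0) {
            Ok(first) => Ok(Self { current: Some(first) }),
            Err(e) if e.kind() == ErrorKind::UnexpectedEof => Ok(Self { current: None }),
            Err(e) => Err(e),
        }
    }
}

impl<'r, C: ChunkRead + 'r> Read for ChunkReader<'r, C> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() { return Ok(0); }
        // `take` moves the reader out; `current` stays `None` if an error returns early.
        while let Some(mut reader) = self.current.take() {
            let n = reader.read(buf)?;
            if n > 0 {
                self.current = Some(reader);
                return Ok(n);
            }
            // Chunk exhausted: swap it for the next one, if any.
            self.current = C::advance(reader)?;
        }
        Ok(0)
    }
}

/// Hypothetical in-memory source; `chunk_size` is assumed to be non-zero.
pub struct MemChunks { pub data: Vec<u8>, pub chunk_size: usize }

/// Hypothetical reader that remembers its index so it can open its successor.
pub struct MemChunk<'s> { owner: &'s mut MemChunks, index: u64, pos: usize, end: usize }

impl Read for MemChunk<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let mut rest = &self.owner.data[self.pos..self.end];
        let n = rest.read(buf)?;
        self.pos += n;
        Ok(n)
    }
}

impl ChunkRead for MemChunks {
    type R<'s> = MemChunk<'s> where Self: 's;

    fn read_chunk(&mut self, chunk_number: u64) -> io::Result<Self::R<'_>> {
        let start = (chunk_number as usize).saturating_mul(self.chunk_size);
        if start >= self.data.len() {
            return Err(ErrorKind::UnexpectedEof.into());
        }
        let end = start.saturating_add(self.chunk_size).min(self.data.len());
        Ok(MemChunk { owner: self, index: chunk_number, pos: start, end })
    }

    fn advance<'s>(reader: Self::R<'s>) -> io::Result<Option<Self::R<'s>>> where Self: 's {
        // Destructure to move the `&'s mut` out of the old reader.
        let MemChunk { owner, index, .. } = reader;
        match owner.read_chunk(index + 1) {
            Ok(next) => Ok(Some(next)),
            Err(e) if e.kind() == ErrorKind::UnexpectedEof => Ok(None),
            Err(e) => Err(e),
        }
    }
}
```

The trade-off in this sketch is that `new` has to be fallible, since the first chunk is opened eagerly.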


In order to explain why I made that suggestion, consider this contrived data structure:

use std::ops::Range;

struct S<'origin: 'chunk, 'chunk> {
    a: &'origin mut String,
    b: (&'chunk mut str, Range<usize>)
}

fn example<'s>(s: &'s mut S<'_, '_>) -> (&'s mut str, &'s mut str) {
    (s.a, s.b.0)
}

This compiles, which means b can't be pointing at the same str data as a if we can call example -- because example returns active &mut _ to the str data from both a and b. If they aliased (pointed at the same str data), that would be UB.[1]

So even though this isn't technically self-referential in the typical way,[2] it has the same sort of problems -- you won't be able to actually create S with b borrowing through a and still be able to call example. That's the compiler saving you from UB.

(I gave them different lifetimes with a bound just to illustrate that it doesn't actually help.)

In order to have both fields at once, there would have to be a way to split the borrow so that they don't overlap. That can make sense for something like a &mut [..] slice, but not for a Read implementor generally.
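
For the slice case, the standard library's `split_at_mut` is one such borrow-splitting mechanism. The `split_first_chunk` and `demo` functions below are hypothetical helpers written for this illustration:

```rust
/// Hypothetical helper: split a buffer into its first chunk and the rest.
pub fn split_first_chunk(data: &mut [u8], chunk_size: usize) -> (&mut [u8], &mut [u8]) {
    // `split_at_mut` panics if the index is past the end, so clamp it first.
    let mid = chunk_size.min(data.len());
    data.split_at_mut(mid)
}

/// Both halves can be written to while the other is still alive,
/// because the two `&mut` slices are guaranteed not to overlap.
pub fn demo() -> [u8; 6] {
    let mut buf = *b"abcdef";
    let (head, tail) = split_first_chunk(&mut buf, 4);
    head[0] = b'X';
    tail[0] = b'Y';
    buf
}
```

A general `Read` implementor has no equivalent operation, which is why the same trick does not carry over to `C` and `C::R<'r>`.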

That's why your current design probably won't work with &muts, and why I suggest trying to have one or the other of &'r mut C or C::R<'r> at a time instead of both at once.


  1. Despite the name, &mut are exclusive references, not just mutable.

  2. It doesn't own the String.

Wow @quinedot thank you very much for the thorough explanation!
I actually stumbled upon your mutable iterator example, which made me realize that with my current design I could end up with two &mut references to the same data if the borrow checker didn't disallow it.

For now I found a workaround, since in the actual code I needed something less powerful than ChunkReader (i.e. I just needed to Read::chain two "chunks"), but I'll try to explore your suggestions over the weekend and update this thread if I find an interesting design.

Thank you again for the kind help! :slight_smile:
