How to properly define a trait that has a method that returns a generic type that references Self

Problem Description

I'm having trouble figuring out how to properly define a trait that has a method that returns a generic type that references Self.

Context

I'm implementing WebGraph. It's a data structure for compressing graphs to a bit-stream backend. This backend could be either a Vector, a File, or a MemoryMapped File.
To be able to parallelize some aspects, I need to be able to create multiple independent readers on the same backend. Also, I can have different types of readers, for example, a reader that read bits from the LSB to the MSB and one that read from the MSB to the LSB.

I want a generic implementation that can accept any of the 3 mentioned backends with the 2 mentioned readers so that they can be composed at compile time.

My Attempt

In this section, I'm gonna present a simplified version of what I'm trying to do.
An (useless) goal usage code could be:

// Instantiate a sample Backend
let backend = VecBackend((0..100).collect());

// create a new reader
let mut reader1 = backend.get_reader::<BackendReaderL2M>();

// the same reader but offsetted by one bit
let mut reader2 = backend.get_reader::<BackendReaderL2M>();
let _ = reader2.read_bit();

// print the first 10 bits
println!("{:?}", (0..10).map(|_|  if reader1.read_bit() { "1" } else { "0" }).collect::<Vec<_>>().join(""));
// print the bits from 1 to 11
println!("{:?}", (0..10).map(|_|  if reader2.read_bit() { "1" } else { "0" }).collect::<Vec<_>>().join(""));

To achieve this I defined two traits, one that describes a reader:

/// General trait for anything that can act as a stream of bits
/// this needs to be a trait because I could have both a reader that
/// read bits from the MSB to the LSB and one that does the opposite 
/// (like in this example below)
pub trait Reader {
    /// Read a single bit from the stream 
    /// (this is mutable because it seeks forward the stream by 1 bit) 
    fn read_bit(&mut self) -> bool;
}

and one for backends that can instantiate new readers:

/// General trait for Backends that can instantiate new readers
pub trait GetReader<ReaderType> 
where
    ReaderType: Reader
{
   /// Returns a new reader initialized at the start of the stream
    fn get_reader(&self) -> ReaderType;
}

This is the reader sample implementation:

/// A reader that can independently read from the backend.
pub struct BackendReaderL2M<'a>{
    /// The reference to the backend
    data: &'a Vec<usize>,
    /// Bit offset in the stream
    offset: usize,
}

impl<'a> Reader for BackendReaderL2M<'a> {
    /// Sample implementation (not important for the problem)
    /// Read bits from LSB to MSB in each word
    fn read_bit(&mut self) -> bool {
        const BITS_IN_WORD: usize = core::mem::size_of::<usize>();
        let current_word = self.data[self.offset / BITS_IN_WORD];
        let aligned_word = current_word >> (self.offset % BITS_IN_WORD)
        self.offset += 1;
        (aligned_word & 1) != 0
    }
}

and this is the sample VecBackend implementation:

/// Sample Backend that represents an in-RAM vector
pub struct VecBackend(Vec<usize>);

/// The problematic bit
impl<'a> GetReader<BackendReader<'a>> for VecBackend {
    fn get_reader(&self) -> ReaderType {
        BackendReader{
            data: &self.data,
            offset: 0,
    }
}

The problem

is that the trait definition implies that the type of get_reader is :

fn get_reader(&'1 VecBackend) -> BackendReaderL2M<'2>

but the implementation is:

fn get_reader(&'1 VecBackend) -> BackendReaderL2M<'1>

because &self.data will have the same lifetime of &self so '_.

What I Tried

So I would like to constraint the returned type to have the same lifetime as Self (or a lifetime bounded by it), something like:

pub trait GetReader<ReaderType> 
where
    ReaderType: Reader
{
   /// Returns a new reader initialized at the start of the stream
    fn get_reader(&'_ self) -> ReaderType + '_;
}

but it doesn't work.

I also tried to add a lifetime that's constrained to outlive Self but I didn't have any luck:

pub trait GetReader<'r, ReaderType> 
where
    ReaderType: Reader + 'r,
    Self: 'r,
{
   /// Returns a new reader initialized at the start of the stream
    fn get_reader(&self) -> ReaderType;
}

Conclusion

Thanks to anyone that might help.
It's my first post on the forum so I hope to have correctly followed the rules (sorry if I missed something!).

Well, this is not simple to do in its full generality. However, you can fix the immediate error by stating that the lifetime of any pointers in the return type must be tied to the lifetime of the &self reference. The syntax for that is:

pub trait GetReader<'a, ReaderType> 
where
    ReaderType: Reader + 'a
{
   /// Returns a new reader initialized at the start of the stream
    fn get_reader(&'a self) -> ReaderType;
}

Accordingly, after fixing some compiler errors related to mismatched type and field names, this compiles.

However, I believe this will be more restrictive than necessary for file-based backends, since it will unnecessarily tie lifetimes together. It's not an accident one rarely sees methods with an expicitly lifetime-annotated &self.

2 Likes

In this particular case, since you're dealing with shared borrows only, you can implement the trait for references to the underlying data structure as a way to carry the lifetime through.

impl<'a> GetReader<BackendReader<'a>> for &'a VecBackend {
    fn get_reader(&self) -> BackendReader<'a> {
        BackendReader{
            data: &self.0,
            offset: 0,
        }
    }
}

(Because you can turn a &'short &'long Thing into a &'long Thing, which is not the case with mutable borrows.)

You may have to assign borrows of the underlying backend to local variables to make this work elsewhere in your code (to avoid dropping temporary borrows).

2 Likes

Thank you! Yeah, which consequences has tying together these lifetimes?
From my (limited) understanding we are saying that the ReaderType must live exactly as long as that self reference. Is this correct?
Thanks again!

Thank you! This might work since I just want immutable references, what other limitations this approach might have?

I kinda understand the reference conversion you said, but I don't get why this won't work for the Type but only for its reference.

No, it means that it must not outlive self.

1 Like

The short version is that you need something at the implementation level to attach a lifetime to, so you can constrain the lifetime of the returned ReaderType parameter. It could be a lifetime parameter of the trait, or it could be a lifetime that's part of the implementing type. VecBackend doesn't have a lifetime as part of its type, but &VecBackend does.

Much more long-winded walkthrough.

This:

impl<'a> GetReader<BackendReader<'a>> for VecBackend {
    fn get_reader(&self) -> BackendReader<'a> { /* ... */ }
}

Is short for this:

impl<'a> GetReader<BackendReader<'a>> for VecBackend {
    fn get_reader<'b>(&'b self) -> BackendReader<'a> { /* ... */ }
}

And is saying "If you give me a &'b VecBackend, no matter how short (or long) 'b is, I can give you a BackendReader<'b> back, for any lifetime 'b, no matter how long (or short)". Or for a specific concrete example, it says that you can get a BackendReader<'static> out of a borrow as short as the function call itself.

let to_infinity: BackendReader<'static> = vec_backend.get_reader();

The only ways to actually do this would be to allocate and leak memory, or to return a reference to some static buffer.

But what you're really trying to say is "If you give me a &'b VecBackend, I can give you a BackendReader<'b> back (for any 'b)". Or maybe "If you give be a &'b VecBackend, I can give you a BackendReader<'a> back out, where 'a is no longer than 'b". But you can almost always get away with not being explicit about this variance. So what you need is a way to constrain the lifetime of your return value based on the &self that gets passed in.

One way is to name the lifetime at the trait level, as @H2CO3 suggested. But if we want to keep this signature that says we can always get an R unconstrained by the borrow of self:

trait GetReader<R> { fn get_reader(&self) -> R; }

Then the constraint is going to have to be part of the Self type -- the implementer -- itself. But VecBackend doesn't have a lifetime as part of its type; it imposes no meaningful constraint. However, &VecBackend does have a lifetime as part of its type. So then when we write this:

impl<'a> GetReader<BackendReader<'a>> for &'a VecBackend {
    fn get_reader(&self) -> BackendReader<'a> { /* ... */ }
}

We're saying "For every concrete 'a, if you give me a &'b &'a VecBackend of any lifetime 'b, I can give you back a BackendReader<'a>." And because you can copy the &'a VecBackend out from underneath the &'b self, we can fulfill this promise.

(The "any" in "any lifetime 'b" should really have an asterisk attached -- it's valid for any Well-Formed nested references. There's an implied bound here that 'a: 'b -- that you can't borrow something for longer than that something is valid. But it's not really a concern here.)


I think it's more awkward to use than it is limited. You could remove the awkwardness by changing the method to take self instead of &self (but maybe this isn't great for your other use cases, I'm not sure).

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.