Implementing an Iterator that iterates over references to another object

I'm trying to make an iterator that attaches a reference to another object to each returned item.

The following code compiles just fine:

/// Holds row values and a reference to the column metadata so it knows the column names and types:
pub struct TypedRow<'a> {
    pub columns: &'a [ColumnSpec],
    pub values: Vec<Value>,
}

impl TypedRow<'_> {
    pub fn new(row: Row, columns: &[ColumnSpec]) -> TypedRow {
        TypedRow {
            columns,
            values: row.values,
        }
    }
}

/// Iterates over rows, attaching the reference to metadata on the fly:
pub struct RowCursor {
    columns: Vec<ColumnSpec>,
    row_iter: <Vec<Row> as IntoIterator>::IntoIter,
}

impl RowCursor {
    pub fn next_row(&mut self) -> Option<TypedRow> {
        self.row_iter.next().map(move |r| { TypedRow::new(r, &self.columns) })
    }
}

However, this is not an Iterator, so many useful iterator methods are not available.

So next I wanted to add a wrapper that would make an iterator for the RowCursor.
I know it must be a separate struct and I can't directly make RowCursor implement Iterator, because it returns values referencing self. A streaming iterator doesn't work here either, because a move is involved as well (I'm not returning only a reference to self).

Unexpectedly, I've run into lifetime problems:

pub struct RowIterator<'a> {              // that should make the inner RowCursor live at least as long as the iterator, right? 
    cursor: &'a mut RowCursor
}

impl<'a> Iterator for RowIterator<'a> {
    type Item = TypedRow<'a>;

    fn next(&mut self) -> Option<TypedRow> {
        self.cursor.next_row()             // I thought it should be fine, because the TypedRow is attached to the inner RowCursor
    }
}

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in generic type due to conflicting requirements
   --> stargate-grpc/src/result.rs:146:5
    |
146 |     fn next(&mut self) -> Option<TypedRow> {
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
note: first, the lifetime cannot outlive the anonymous lifetime defined on the method body at 146:13...
   --> stargate-grpc/src/result.rs:146:13
    |
146 |     fn next(&mut self) -> Option<TypedRow> {
    |             ^^^^^^^^^
note: ...so that the method type is compatible with trait
   --> stargate-grpc/src/result.rs:146:5
    |
146 |     fn next(&mut self) -> Option<TypedRow> {
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = note: expected `fn(&mut RowIterator<'a>) -> std::option::Option<TypedRow<'_>>`
               found `fn(&mut RowIterator<'a>) -> std::option::Option<TypedRow<'_>>`
note: but, the lifetime must be valid for the lifetime `'a` as defined on the impl at 143:6...
   --> stargate-grpc/src/result.rs:143:6
    |
143 | impl<'a> Iterator for RowIterator<'a> {
    |      ^^
note: ...so that the types are compatible
   --> stargate-grpc/src/result.rs:146:5
    |
146 |     fn next(&mut self) -> Option<TypedRow> {
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = note: expected `Iterator`
               found `Iterator`

I guess the problem is that the lifetime of self.cursor somehow gets shortened to the lifetime of &self here, and not 'a (which might be longer). Is there a way to avoid that?

However, if I try to change &mut self to &'a mut self, I get an error that my signature is incompatible with the Iterator trait.

You can only call 'next_row' once while holding onto the return value. Iterators need to be repeatable.
One solution is to move 'Vec<ColumnSpec>' out of 'RowCursor' and have 'RowCursor' hold a reference to it instead.

You can only call 'next_row' once while holding onto the return value.

Why is that? This is surprising to me.

Because the TypedRow<'_> it returns borrows from *self. It's easy to overlook this because the <'_> is optional.
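
To make the borrow visible, here is a minimal self-contained sketch with the elided lifetime written out. ColumnSpec, Value and Row are stand-in definitions (the real ones are not shown in this thread), and RowCursor::new and sum_first_values are hypothetical helpers added only for illustration.

```rust
// Stand-in types: assumptions for illustration, not the real definitions.
pub struct ColumnSpec {
    pub name: String,
}
pub struct Value(pub i64);
pub struct Row {
    pub values: Vec<Value>,
}

pub struct TypedRow<'a> {
    pub columns: &'a [ColumnSpec],
    pub values: Vec<Value>,
}

pub struct RowCursor {
    columns: Vec<ColumnSpec>,
    row_iter: std::vec::IntoIter<Row>,
}

impl RowCursor {
    // Hypothetical constructor, added so the sketch can be run.
    pub fn new(columns: Vec<ColumnSpec>, rows: Vec<Row>) -> Self {
        RowCursor { columns, row_iter: rows.into_iter() }
    }

    // Same method as in the question, with the hidden lifetime spelled out:
    // the returned TypedRow<'r> keeps `self` exclusively borrowed for 'r.
    pub fn next_row<'r>(&'r mut self) -> Option<TypedRow<'r>> {
        let row = self.row_iter.next()?;
        Some(TypedRow { columns: &self.columns, values: row.values })
    }
}

// Hypothetical helper: sums the first value of every row.
pub fn sum_first_values() -> i64 {
    let mut cursor = RowCursor::new(
        vec![ColumnSpec { name: "id".to_string() }],
        vec![Row { values: vec![Value(1)] }, Row { values: vec![Value(2)] }],
    );
    let mut total = 0;
    // Fine: each TypedRow is dropped before the next call to next_row.
    while let Some(row) = cursor.next_row() {
        total += row.values[0].0;
    }
    // Not accepted: two rows alive at once need two overlapping &mut borrows.
    // let first = cursor.next_row();
    // let second = cursor.next_row(); // error[E0499]
    // drop((first, second));
    total
}
```

Processing one row at a time works; uncommenting the last three lines should reproduce the "only once while holding onto the return value" restriction.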

Use #![warn(rust_2018_idioms)] at the top of main.rs or lib.rs to be warned about omitting lifetime parameters from structs.

What you've implemented with next_row is a "lending iterator" (sometimes referred to as a "streaming iterator", but today that implies async instead, so be careful). It's an iterator where the returned items are borrows with a lifetime tied to each next call.

Below are your signatures written out more explicitly to hopefully make the lending nature less surprising. Note how the return is part of the borrow in next_row but is independent of the borrow in next ('a isn't tied to 'r).

// vvv: impl RowCursor { pub
fn next_row<'r>(&'r mut self) -> Option<TypedRow<'r>> { /* ... */ }
fn next    <'r>(&'r mut self) -> Option<TypedRow<'a>> { /* ... */ }
// ^^^: impl<'a> Iterator for RowIterator<'a> {

You're missing out because it's not a trait implementation. You can't implement the standard Iterator due to the lending nature, as others said. And having a lending iterator trait requires GATs. GATs aren't stable yet but are being actively worked on. Lending iterators are a main motivator, so hopefully soon after GAT, we figure out and stabilize a standard LendingIterator trait.

Hold on, it is definitely possible to implement an iterator returning references to another object. This is how standard iter() works for collections. I was trying to avoid lending / streaming iterator here by following the same pattern as standard collection iterators by placing all the state/data in external object (RowCursor) not in the iterator itself. This way I have a proper lifetime for Item.

It looks like the problem is really the mutability part in the RowCursor - that's the only part different from normal Iterator over a collection. If I move the underlying mutable Iterator<Item=Row> into my RowIterator and make RowCursor contain only the immutable column references - the compiler accepts it.

I guess that if a function takes &'a mut self and returns anything that has &'a references (even immutable) then the mutable borrow of self extends for 'a. Is it correct? I was not expecting that because the actual mutable borrow of self is needed only for the lifetime of the call, and it could finish earlier because the returned item only needs a shared borrow on self.

Even though I already found a simpler solution and I don't need this now, I'm curious how and why the existence of mut on next_row leads to the lifetime conflict and the compiler doesn't mention anything about mut.

My understanding is that, when you use the same lifetime variable in the function input and output you are telling the compiler that the two items have the same lifetime. When one of the items has an immutable lifetime, it can't shrink or grow. Therefore that unalterable lifetime infects the other item, because you have demanded that the compiler use the same lifetime for both of them.

Yes, I can see the reasoning here. RowIterator doesn't own the contents (it owns a borrow), so it's not a lending iterator in that sense. But RowCursor does follow the lending iterator pattern (as per the signature of next_row), and your original implementation was trying to "wrap up" that lending iterator into Iterator, which isn't possible in the general case.

Given your response, a lending iterator may not be what you want anyway (maybe you want to be able to collect the results of the iteration, say).

Yes, that is correct. It is more accurate to call a &mut an exclusive borrow rather than a mutable borrow, and the exclusive borrow extends for as long as the (borrowed) return value lives. I agree that this is under-documented. It would be backwards-incompatible to change, but perhaps some day there will be a way to relax it.

But I don't think some way to create the canonical relaxation would help you on its own, here. (By canonical relaxation I mean: self is borrowed mutably for the extent of the call but borrowed in a shared manner for the lifetime of the returned value.) You still wouldn't be able to call next_row again until the returned value expires, because you can't get a new exclusive borrow of self while any shared borrows of self exist, either.

For example, what if in next_row you said self.columns = Vec::new()? Previously returned TypedRows would dangle. You and I know that you're not going to do this, but there's nothing in the signature of next_row that precludes this (and the signature is the contract).
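
A small sketch of that hazard, under simplified assumptions: Cursor below is a hypothetical stand-in that lends out its column names directly instead of building a TypedRow, and reset_demo is an illustrative helper. The point is only that a `&mut self` method is allowed to replace the Vec, so the borrow checker has to assume it might.

```rust
// Hypothetical, simplified cursor: it lends out a slice of its own columns.
pub struct Cursor {
    pub columns: Vec<String>,
    pub remaining: u32,
}

impl Cursor {
    // The signature alone permits replacing `self.columns` on any call.
    pub fn next_row(&mut self) -> Option<&[String]> {
        if self.remaining == 0 {
            // Perfectly legal for a `&mut self` method.
            self.columns = Vec::new();
            return None;
        }
        self.remaining -= 1;
        Some(&self.columns)
    }
}

// Illustrative helper: returns (columns seen in first row, exhausted?, columns left).
pub fn reset_demo() -> (usize, bool, usize) {
    let mut cursor = Cursor { columns: vec!["id".to_string()], remaining: 1 };

    // The borrowed slice is used and released inside this one statement.
    let first_len = cursor.next_row().map(|cols| cols.len()).unwrap_or(0);

    // This call clears the Vec. Had a slice from the first call still been
    // alive here, it would point at freed memory, so the compiler refuses
    // to let such a slice live across this call.
    let exhausted = cursor.next_row().is_none();

    (first_len, exhausted, cursor.columns.len())
}
```

After the second call the Vec really is empty, which is exactly the situation an earlier, still-live borrow must be protected from.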

You really do need some way to separate the borrow of the Vec from the borrow of the iterator in order to make RowCursor do what you wish. The canonical relaxation is more for "let me immutably use the object while the returned borrow is still outstanding".

I'm trying to home in on your expectations, but I'm not sure I'm successfully doing so. I'm not sure exactly what you mean by "normal Iterator", for example. Do you have a specific example in mind? (slice::Iter and slice::IterMut use pointers and unsafe, so they're not a direct parallel.)

Anyway, I suppose the other approach you found is something like this?

pub struct RowIterator<'a> {
    columns: &'a [ColumnSpec],
    row_iter: <Vec<Row> as IntoIterator>::IntoIter,
}

impl<'a> Iterator for RowIterator<'a> {
    type Item = TypedRow<'a>;

    fn next(&mut self) -> Option<TypedRow<'a>> {
        let row = self.row_iter.next()?;
        Some(TypedRow::new(row, self.columns))
        //                      ^^^^^^^^^^^^ must be &'a [ColumnSpec]
    }
}

So how are you managing to return something containing a &'a [ColumnSpec] when your owned version is "trapped" behind a shorter-borrowed &mut Self? self.columns is creating a copy, because shared references implement Copy. You're copying out from behind self to escape the "trap".
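
For anyone who wants to run the copy-out version, here is a self-contained sketch. ColumnSpec, Value and Row are again minimal stand-ins (assumptions, since the real types are not shown here), and collect_rows_demo is a hypothetical helper. It shows that the items can be collected and used after the iterator itself is gone.

```rust
// Stand-in types: assumptions for illustration, not the real definitions.
pub struct ColumnSpec {
    pub name: String,
}
pub struct Value(pub i64);
pub struct Row {
    pub values: Vec<Value>,
}

pub struct TypedRow<'a> {
    pub columns: &'a [ColumnSpec],
    pub values: Vec<Value>,
}

pub struct RowIterator<'a> {
    columns: &'a [ColumnSpec],
    row_iter: std::vec::IntoIter<Row>,
}

impl<'a> Iterator for RowIterator<'a> {
    type Item = TypedRow<'a>;

    fn next(&mut self) -> Option<TypedRow<'a>> {
        let row = self.row_iter.next()?;
        // `self.columns` is a `&'a [ColumnSpec]`, which is Copy, so this
        // copies the reference out instead of borrowing through `self`.
        Some(TypedRow { columns: self.columns, values: row.values })
    }
}

// Hypothetical helper: collects all rows, then inspects them afterwards.
pub fn collect_rows_demo() -> (usize, String, i64) {
    let columns = vec![ColumnSpec { name: "id".to_string() }];
    let rows = vec![
        Row { values: vec![Value(1)] },
        Row { values: vec![Value(2)] },
    ];

    let iter = RowIterator { columns: &columns, row_iter: rows.into_iter() };
    // `collect` consumes the iterator; the rows only borrow from `columns`.
    let typed: Vec<TypedRow<'_>> = iter.collect();

    let sum = typed.iter().map(|r| r.values[0].0).sum();
    (typed.len(), typed[1].columns[0].name.clone(), sum)
}
```

Because the items borrow from `columns` rather than from the iterator, several rows can be alive at once, which is what `collect` needs.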

&mut does not implement Copy (nor Clone), so if you change columns to &'a mut [ColumnSpec], you'll get a lifetime error instead. In that case, self.columns must be borrowed from self (you can't avoid the "trap"), and that borrow cannot be longer than the one of &mut Self.

If you make things more explicit to avoid inference, the error in the &mut case becomes somewhat more clear.
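
As a rough illustration of the difference, here is a minimal sketch with every lifetime named. SharedHolder, MutHolder and the two demo functions are hypothetical names chosen for this example and are not taken from the code in this thread.

```rust
// Hypothetical types for illustration only.
pub struct SharedHolder<'a> {
    pub data: &'a [u32],
}

pub struct MutHolder<'a> {
    pub data: &'a mut [u32],
}

impl<'a> SharedHolder<'a> {
    // `&'a [u32]` is Copy: the result carries 'a and is not tied to 's.
    pub fn get<'s>(&'s mut self) -> &'a [u32] {
        self.data
    }
}

impl<'a> MutHolder<'a> {
    // `&'a mut [u32]` is not Copy: we can only reborrow through `self`,
    // so the result is limited to the shorter lifetime 's.
    pub fn get<'s>(&'s mut self) -> &'s mut [u32] {
        &mut *self.data
    }

    // Rejected with a lifetime error: the reborrow cannot outlive 's.
    // pub fn get_long<'s>(&'s mut self) -> &'a mut [u32] {
    //     self.data
    // }
}

pub fn demo_shared() -> u32 {
    let values = vec![1, 2, 3];
    let mut holder = SharedHolder { data: &values };
    let a = holder.get();
    let b = holder.get(); // fine: `a` does not keep `holder` borrowed
    a[0] + b[1]
}

pub fn demo_mut() -> Vec<u32> {
    let mut values = vec![1, 2, 3];
    {
        let mut holder = MutHolder { data: &mut values };
        let a = holder.get();
        a[0] = 10; // last use of `a`, so its borrow of `holder` ends here
        let b = holder.get();
        b[1] = 20;
    }
    values
}
```

In demo_shared both results are alive at the same time. In demo_mut the second call is only accepted because `a` is no longer used by then.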


Does this help your understanding at all? I feel I described "why does the working version work" more than "why doesn't the non-working version work", so I'm not confident I'm actually helping...


The idea that the compiler should point this out is an interesting one. I think there are a few reasons why it doesn't:

  • &mut to & is usually a pretty significant change
  • Changing the method signatures of a separate type is an even more significant change
    • And your OP would need both changes in order to compile
  • If &mut were Copy, that may also be a solution, so being able to make the suggestion is somewhat indirect

I.e. this may be too high-level of a suggestion for the compiler to confidently make.

Compilers aren't smart enough. They find a problem in a function and reason it is something incorrect in that function; rather than reason that a perfectly valid function it calls has not been written to be useful.