Lifetime issue with custom iterator traits

I ran into a lifetime issue when doing something like this:

trait IntoRowIterator<'a> {
    type Row: 'a;
    type RowIter: Iterator<Item = Self::Row>;

    fn into_row_iter(&'a self) -> Self::RowIter;
}

trait IntoRowChunkIterator<'a>: IntoRowIterator<'a> {
    type RowChunk: IntoRowIterator<'a, Row = Self::Row>;
    type RowChunkIter: Iterator<Item = Self::RowChunk>;

    fn into_row_chunk_iter(&'a self, chunk_size: usize) -> Self::RowChunkIter;

    fn collect_chunks(&'a self, chunk_size: usize) -> Vec<Vec<Self::Row>> {
        self.into_row_chunk_iter(chunk_size)
            .map(|x| x.into_row_iter().collect::<Vec<_>>())
            .collect::<Vec<_>>()
    }
}

This fails to compile:

   |
10 | trait IntoRowChunkIterator<'a>: IntoRowIterator<'a> {
   |                            -- lifetime `'a` defined here
...
18 |             .map(|x| x.into_row_iter().collect::<Vec<_>>())
   |                      ^----------------                   - `x` dropped here while still borrowed
   |                      |
   |                      borrowed value does not live long enough
   |                      argument requires that `x` is borrowed for `'a`

I thought I could solve this by introducing a second lifetime 'b on IntoRowChunkIterator to distinguish the lifetimes of chunks and items, but I haven't managed to make any variation of this idea work.

I was hoping, therefore, that someone could help me understand why this fails and how to fix it?

(For clarity, my actual use case revolves around iterating over the rows and chunks of some newtypes over ndarray::ArrayView, which is why I don't want to simply implement IntoIterator, since that already has a natural interpretation as turning the struct into an iterator over elements, rather than rows. I'm including a little bit of context in the playground link below. I also don't particularly care about collecting into a Vec<Vec<_>>, but I believe that example illustrates the lifetime problem in general.)

(Playground.)

Lifetime on self is a red flag. It's almost never needed, and very often an error.

IntoRowIterator<'a> means that object implementing this trait holds data that has been created before it. That 'a is pointing to somewhere external, that existed before self, and will probably keep existing for longer than this object.

&'a self promises that this object lives for as long as 'a, so combined with the trait it means that self has been created before self has been created, and self may live longer than self.

For & borrows the borrow checker solves this paradox by implicitly shortening the 'a lifetime to be equal to self's lifetime, but still the whole annotation is highly suspicious.
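To make that concrete, here is a minimal sketch that uses a hypothetical View type wrapping a slice instead of ndarray (View, rows, rows_pinned and rows_of_temp_view are all made-up names for illustration). It contrasts a plain &self method with an &'a self method on a type that already carries 'a:

```rust
/// Hypothetical stand-in for a borrowing view such as `ArrayView`.
struct View<'a> {
    data: &'a [i32],
}

impl<'a> View<'a> {
    // `&self` is a short borrow of the view. The returned slice is tied to
    // the underlying data ('a), not to the view itself.
    fn rows(&self) -> &'a [i32] {
        self.data
    }

    // `&'a self` forces the view itself to stay borrowed for all of 'a.
    fn rows_pinned(&'a self) -> &'a [i32] {
        self.data
    }
}

fn rows_of_temp_view<'a>(data: &'a [i32]) -> &'a [i32] {
    let view = View { data };
    // Fine: the result borrows from `data`, so `view` may be dropped here.
    // Calling `view.rows_pinned()` instead would be rejected, because the
    // local `view` would have to stay borrowed for 'a, longer than it lives.
    view.rows()
}
```

The pinned variant is still callable when the view outlives every use of the result, which is why the annotation often appears to work until the view becomes a temporary.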


The into_ functions imply destroying the thing they've been called on. Ownership of the content is passed on to a new object that will be torn into pieces as it's being iterated.

So your self.into_row_chunk_iter() means self-destruction, and makes every item returned orphaned, and temporary only for the duration of the next call.

However, your other into_row_iter has a lifetime that refers back to x. The x that has been orphaned and is about to be destroyed.

And it's trying to collect the items that are references back to the x, which is supposed to live in self, except the into_* call is in the process of destroying self.

So:

  1. Don't use into_ calls in &self methods. They're fundamentally incompatible. Use iter() instead, which borrows objects and leaves them where they were instead of ripping them out of their container.

  2. If you use into_, then do it consistently, and use self methods and owned objects, with no references. These two don't mix. You can't end the lifetime of something with an into_* call and still keep borrowing it where it was.
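As a rough illustration of option 2, the traits from the question can be rewritten to take self by value. This is only a sketch under assumptions: Grid is a hypothetical Copy view over a row-major slice, not an ndarray type, and the boxed iterator is used purely to keep the associated type easy to name.

```rust
/// Hypothetical `Copy` view over row-major data (a stand-in for `ArrayView2`).
#[derive(Clone, Copy)]
struct Grid<'a> {
    data: &'a [i32],
    width: usize,
}

trait IntoRowIterator {
    type Row;
    type RowIter: Iterator<Item = Self::Row>;

    // Takes the view by value, so no borrow of the view itself is involved.
    fn into_row_iter(self) -> Self::RowIter;
}

trait IntoRowChunkIterator: IntoRowIterator + Sized {
    type RowChunk: IntoRowIterator<Row = Self::Row>;
    type RowChunkIter: Iterator<Item = Self::RowChunk>;

    fn into_row_chunk_iter(self, chunk_size: usize) -> Self::RowChunkIter;

    fn collect_chunks(self, chunk_size: usize) -> Vec<Vec<Self::Row>> {
        self.into_row_chunk_iter(chunk_size)
            // `x` is consumed by `into_row_iter`, so nothing borrows from it.
            .map(|x| x.into_row_iter().collect::<Vec<_>>())
            .collect::<Vec<_>>()
    }
}

impl<'a> IntoRowIterator for Grid<'a> {
    type Row = &'a [i32];
    type RowIter = std::slice::Chunks<'a, i32>;

    fn into_row_iter(self) -> Self::RowIter {
        // `self.data` is a `&'a [i32]`, so the iterator is tied to 'a.
        self.data.chunks(self.width)
    }
}

impl<'a> IntoRowChunkIterator for Grid<'a> {
    type RowChunk = Grid<'a>;
    type RowChunkIter = Box<dyn Iterator<Item = Grid<'a>> + 'a>;

    fn into_row_chunk_iter(self, chunk_size: usize) -> Self::RowChunkIter {
        let width = self.width;
        // Each chunk is a smaller view of up to `chunk_size` rows.
        Box::new(
            self.data
                .chunks(width * chunk_size)
                .map(move |data| Grid { data, width }),
        )
    }
}
```

Because the view only holds a reference, consuming it is cheap and the underlying data stays where it was. Whether the same shape can be achieved with ndarray's own iterator methods is not something this sketch establishes.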


Thank you very much for the detailed answer. I've read it a couple of times and it's slowly improving my mental model of the difference between &'a T and T<A>, I think.

I need something along the lines of your option 1. If I understand correctly, my problem may come from the fact that I want to leverage the ArrayBase::axis_iter/ArrayBase::axis_chunks_iter methods, which take references and return iterators over non-references. That is, I want to do:

use ndarray::prelude::*;

trait IntoRowIterator {
    type Row;
    type RowIter: Iterator<Item = Self::Row>;

    fn into_row_iter(&self) -> Self::RowIter;
}

impl<'a> IntoRowIterator for ArrayView2<'a, i32> {
    type Row = ArrayView1<'a, i32>;
    type RowIter = ndarray::iter::AxisIter<'a, i32, Ix1>;

    fn into_row_iter(&self) -> Self::RowIter {
        self.axis_iter(Axis(0))
    }
}

trait IntoChunkIterator: IntoRowIterator {
    type Chunk: IntoRowIterator<Row = Self::Row>;
    type ChunkIter: Iterator<Item = Self::Chunk>;

    fn into_chunk_iter(&self, chunk_size: usize) -> Self::ChunkIter;
}

impl<'a> IntoChunkIterator for ArrayView2<'a, i32> {
    type Chunk = ArrayView2<'a, i32>;
    type ChunkIter = ndarray::iter::AxisChunksIter<'a, i32, Ix2>;

    fn into_chunk_iter(&self, chunk_size: usize) -> Self::ChunkIter {
        self.axis_chunks_iter(Axis(0), chunk_size)
    }
}

This does not compile, due to the lack of a lifetime on &self (I believe), and I cannot find an alternative way to convince the compiler that the lifetimes on &self and the returned items match. The same problem on slices, say, seems easier, since the lifetime is on the reference rather than being a generic parameter of the struct.

(Playground.)

First, something that is not strictly a technical issue but is confusing: you're using into in names. In Rust, into is a naming convention with a specific purpose: passing exclusive ownership of the data to the function. It's the opposite of taking a reference. And you're using references! &self is a reference (a temporary borrow of an object, without passing ownership).

If you have an into_something function, it's supposed to take self, and never &self. Otherwise it's as weird as fn takes_a_string(number: i32).

For making iterators that are not destroying their containers, Rust uses fn iter(&self).
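For comparison, this is how the two conventions look on a plain Vec from the standard library. The function names longest_row and flatten_rows are made up for this sketch:

```rust
/// Borrowing: the rows are only read and remain usable by the caller.
fn longest_row(rows: &[Vec<i32>]) -> usize {
    rows.iter().map(|row| row.len()).max().unwrap_or(0)
}

/// Consuming: the vector is moved in and its rows are handed out by value.
fn flatten_rows(rows: Vec<Vec<i32>>) -> Vec<i32> {
    rows.into_iter().flatten().collect()
}
```

After longest_row(&rows) the caller can keep using rows, while flatten_rows(rows) moves the vector, so it can't be used afterwards.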

ArrayView2/AxisIter is not an owning object, not an "into" iterator, but a borrowing view. It doesn't contain any elements. The elements stay with the original container, and it only points to them. So you can't make it into an "into" iterator that destroys the container and steals all its contents, because ArrayView2 is already by design making it impossible.

Just naming your methods and traits without Into/into_ would be fine.


Well, it turns out this is a case where an explicit lifetime on self is needed! (I'm surprised, because usually the compiler incorrectly suggests putting a lifetime on self when it's trying to suggest a solution to an unsolvable problem.)

That's because self.axis_iter() returns an AxisIter that is bound to the lifetime of the self borrow in that function call. By default, calling a method reborrows self, so anything returned from the call is only valid for as long as that borrow lasts. With into_row_iter(&self), that borrow may be short-lived: valid where the method was called, but not necessarily later, where the result is used.

So to force axis_iter to return AxisIter<'a, …> it is necessary to call axis_iter on &'a self.

Unfortunately, lifetimes on the implementation must match lifetimes on the trait, so the trait has to get an awkward seemingly-useless lifetime annotation, just so that you can refer to it later in the implementation.

trait RowIterator<'a> {
    type Row;
    type RowIter: Iterator<Item = Self::Row>;

    fn row_iter(&'a self) -> Self::RowIter;
}

impl<'a> RowIterator<'a> for ArrayView2<'a, i32> {
    type Row = ArrayView1<'a, i32>;
    type RowIter = ndarray::iter::AxisIter<'a, i32, Ix1>;

    fn row_iter(&'a self) -> Self::RowIter {
        self.axis_iter(Axis(0))
    }
}
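
The same constraint can be reproduced without ndarray. In the sketch below, Base, View, Items, items and items_for are hypothetical names: Base<S> stands in for a container that is generic over its storage, and its items method, like axis_iter, returns an iterator tied to the borrow of self.

```rust
/// Hypothetical generic container standing in for ndarray's `ArrayBase`.
/// `S` may own the data (`Vec<i32>`) or borrow it (`&[i32]`).
struct Base<S> {
    storage: S,
}

impl<S: AsRef<[i32]>> Base<S> {
    /// The result is tied to the borrow of `self`, not to any lifetime
    /// that happens to appear inside `S`.
    fn items(&self) -> std::slice::Iter<'_, i32> {
        self.storage.as_ref().iter()
    }
}

/// A borrowing view: the storage is itself a reference with lifetime 'a.
type View<'a> = Base<&'a [i32]>;

trait Items<'a> {
    type Iter: Iterator<Item = &'a i32>;

    fn items_for(&'a self) -> Self::Iter;
}

impl<'a> Items<'a> for View<'a> {
    type Iter = std::slice::Iter<'a, i32>;

    // With a plain `&self`, the call below would produce an iterator for the
    // shorter borrow of `self`, which does not match the declared `Iter<'a>`.
    fn items_for(&'a self) -> Self::Iter {
        self.items()
    }
}
```

The lifetime parameter on the trait exists only so that the implementation can name the borrow of self in the associated type.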

Great visual for into... into pieces. I love it :))


Thank you very much for the continued explanation, and for the point about the naming conventions. I'll be dropping the Into prefix. (I also edited the title to reflect this.)

One more question, if I may: am I right in thinking that re-introducing the &'a self lifetime also reintroduces the original lifetime problem? That is, if I take your implementation of RowIterator and add a ChunkIterator trait akin to the original, I get:

use ndarray::prelude::*;

trait RowIterator<'a> {
    type Row;
    type RowIter: Iterator<Item = Self::Row>;

    fn row_iter(&'a self) -> Self::RowIter;
}

impl<'a> RowIterator<'a> for ArrayView2<'a, i32> {
    type Row = ArrayView1<'a, i32>;
    type RowIter = ndarray::iter::AxisIter<'a, i32, Ix1>;

    fn row_iter(&'a self) -> Self::RowIter {
        self.axis_iter(Axis(0))
    }
}

trait ChunkIterator<'a>: RowIterator<'a> {
    type Chunk: RowIterator<'a, Row = Self::Row>;
    type ChunkIter: Iterator<Item = Self::Chunk>;

    fn chunk_iter(&'a self, chunk_size: usize) -> Self::ChunkIter;

    fn collect_chunks(&'a self, chunk_size: usize) -> Vec<Vec<Self::Row>> {
        self.chunk_iter(chunk_size)
            .map(|x| x.row_iter().collect::<Vec<_>>())
            .collect::<Vec<_>>()
    }
}

impl<'a> ChunkIterator<'a> for ArrayView2<'a, i32> {
    type Chunk = ArrayView2<'a, i32>;
    type ChunkIter = ndarray::iter::AxisChunksIter<'a, i32, Ix2>;

    fn chunk_iter(&'a self, chunk_size: usize) -> Self::ChunkIter {
        self.axis_chunks_iter(Axis(0), chunk_size)
    }
}

Which does not compile:

   |
19 | trait ChunkIterator<'a>: RowIterator<'a> {
   |                     -- lifetime `'a` defined here
...
27 |             .map(|x| x.row_iter().collect::<Vec<_>>())
   |                      ^-----------                   - `x` dropped here while still borrowed
   |                      |
   |                      borrowed value does not live long enough
   |                      argument requires that `x` is borrowed for `'a`

Am I simply facing the fact that this pattern is not expressible with the signatures of the axis_iter/axis_chunks_iter methods in ndarray?

(Playground.)