What's the best approach to implement a nested iterator?

I was supposed to post a rust playground link but the website broken in my computer, and this issue can be reproduced by following code:

(The code can be pasted and compiled directly, I have simplified the logic and added some comments)

use std::sync::{Arc, RwLock, RwLockReadGuard};

fn main() {
    let table = Table {};
    let mut iter = TableIterator::new(&table);

    // error: `iter` does not live long enough...
    // let a = iter.custrom_next();

    // compiled successfully
    let a = iter.next();
}

// Page stores the actural data.
struct Page {}

// Table stores the schema. But it doesn't own the `Page` object.
struct Table {}

struct PageIterator<'page> {
    page: RwLockReadGuard<'page, Page>,
}

impl PageIterator<'_> {
    fn new(page: &Page) -> Self {
        todo!()
    }
}

struct TableIterator<'page> {
    // A `TableIterator` doesn't own the `Page` object, but it borrows
    // it with an `Arc<RwLock<Page>>` object.
    page_rc: Arc<RwLock<Page>>,

    // Used to create `page_it` when current page is exhausted.
    page: RwLockReadGuard<'page, Page>,

    // The lower/lended iterator.
    page_it: PageIterator<'page>,
}

impl<'table, 'page> TableIterator<'page> {
    fn new(table: &'table Table) -> Self {
        todo!()
    }
}

impl Iterator for TableIterator<'_> {
    type Item = ();

    fn next(&mut self) -> Option<Self::Item> {
        todo!()
    }
}

// We create a new trait since we have to add a lifetime symbol to the
// `next` method.
pub trait UpperIterator<'this> {
    type Item;

    fn custrom_next(&'this mut self) -> Option<Self::Item>;
}

impl<'this, 'tx, 'table, 'page> UpperIterator<'this>
    for TableIterator<'page>
where
    'this: 'page, // This constraint is required since `self.page` is borrowing
                  // from `self.page_rc`.
{
    type Item = ();

    fn custrom_next(&'this mut self) -> Option<Self::Item> {
        {
            // if the current iterator didn't reach the end, return the next
        }

        {
            // if the current iterator reached the end, create a new iterator
            // e.g:
            // step 1: get the next page
            // self.page_rc = Arc::clone(&self.page.next_page);

            // step 2: get the read lock (borrowing)
            self.page = self.page_rc.read().unwrap();

            // step 3: create a new iterator
            self.page_it = PageIterator::new(&self.page);
            todo!()
        }
    }
}

If we uncomment the code let a = iter.custrom_next();, it will cause compile error:

error[E0597]: `iter` does not live long enough
  --> src/main.rs:8:13
   |
8  |     let a = iter.custrom_next();
   |             ^^^^^^^^^^^^^^^^^^^ borrowed value does not live long enough
...
12 | }
   | -
   | |
   | `iter` dropped here while still borrowed
   | borrow might be used here, when `iter` is dropped and runs the destructor for type `TableIterator<'_>`

I racked my head but still can't get it around this error.

And I also checked some resource about "GAT" and "drop check", but things look like being overcomplexed.

My curiosity is:

  1. Does my implementation of these structs reasonable? Or does it follows the best practice of rust?
  2. What's the best (most ergonomic) solution for my scenario? (It's not the ordinary iterator which owns/consumes the data owner, nor the LendingIterator in those "GAT" articles, as the caller of it doesn't hold page neither.)
  3. If my solution is the best, what it actually means by the compile error, and how to fix it?

In a nutshell, there are something which made it tricky:

  • Both upper level iterator TableIterator and lower level iterator PageIterator don't own any data.
  • The caller of TableIterator also doesn't own the actual data. More specifically, page_rc was retrieved though the global page cache, in the form of Arc<RwLock<Page>>.
  • We have to add 'this: 'page constraint to TableIterator to make sure the TableIterator lives longer than Page, otherwise self.page is invalid (may dangle?, I'm not sure).

You are trying to build a self-referencing struct. This is not supported by Rust. Either try to avoid the need to combine e.g. fields like page_rc and page where one is supposed to borrow from the other into the same struct, or try to work with crates that support self-referencing structs, such as e.g. ouroboros - Rust

2 Likes

This is the way itertools solved a similar problem: Itertools in itertools - Rust

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.