Implementing a lazy iterator (code review)

Hi,
currently I'm writing a crate for conversion between Semantic Web's RDF data and Rust types. The idea I'm stugglling with is the conversion of complex types into a semantic representation, i.e. a bunch of RDF-triples. Since the creation of those triples could be expensive I want to return an iterator that lazily creates the triples on demand. Therefore, I wrote the following code.

use sophia::term::{Term, TermData};
use std::collections::VecDeque;

pub type TripleCreator<'a, TD> = Box<dyn 'a + Fn(Term<TD>) -> [Term<TD>; 3]>;

pub trait GetTriplesIter {
    fn triples_iter<'a, TD>(&'a self, s: &'a Term<TD>) -> TriplesIter<'a, TD>
    where
        TD: TermData;
}

pub struct TriplesIter<'a, TD>
where
    TD: TermData,
{
    s: &'a Term<TD>,
    creators: VecDeque<TripleCreator<'a, TD>>,
}

impl<'a, TD> Iterator for TriplesIter<'a, TD>
where
    TD: TermData,
{
    type Item = [Term<TD>; 3];

    fn next(&mut self) -> Option<Self::Item> {
        self.creators
            .pop_front()
            .map(|creator| creator(self.s.clone()))
    }
}

The plan is that implementors collect a bunch of closures that are executet on call of next(). It is intended that TriplesIter (and its closures) borrows self immutable so it can't be altered while TriplesIter exists.

Would you consider this an idomatic solution? Some suggestions for impovements in terms of performance and/or ergonomics?

While trying to implement GetTriplesIter-trait on some types I encountered a problem I do not quite understand:

impl GetTriplesIter<Arc<str>> for SomeType {
    fn triples_iter<'a>(&'a self, s: Term<Arc<str>>) -> TriplesIter<'a, Arc<str>> {
        let mut creators = std::collections::VecDeque::new();

        let deps: HashSet<_> = /* create some Terms based on structs contents */;
        deps.into_iter().for_each(|dep| {
            let c = Box::new(move |sub: Term<Arc<str>>| {
                [
                    sub,
                    /* get the right predicate */,
                    dep.clone(), // without this clone() it doesn't compile!
                ]
            }) as _;
            creators.push_back(c);
        });
        TriplesIter { s, creators }
    }
}

As you can see a clone() on the temporary dep is required. I thought this wouldn't be necessary because the dep is moved into the closure?

each closure c / creator is intended to be called only once each, but you have used the Fn trait, which requires that it be possible to call the closure multiple times (and concurrently). Since the closure returns an owned HashSet, it needs to .clone() to be able to give multiple owned HashSets, else it wouldn't be Fn (just FnOnce).

Given your use case, using Fn instead of FnOnce in the TripleCreator type alias does not seem to be required, so you can loosen the trait requirement by using FnOnce instead, at which point .clone() should no longer be required.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.