Unboxing Iterator

Tyrn · January 14, 2022, 6:30am

Hi,

I've got this snippet of working code:

use std::path::PathBuf;
type WalkItem = (PathBuf, Vec<PathBuf>);
type WalkIterator = Box<dyn Iterator<Item = WalkItem>>;

fn dir_walk(dir: &PathBuf, step_down: Vec<PathBuf>) -> WalkIterator {
    let (dirs, files) = dir_offspring(dir);
    let step = step_down.clone();

    let walk = move |d: PathBuf| {
        let mut step = step_down.clone();
        step.push(PathBuf::from(d.file_name().unwrap()));
        dir_walk(&d, step)
    };

    let item = move |f: PathBuf| (f, step.clone());

    Box::new(
        dirs.into_iter()
            .flat_map(walk)
            .chain(files.into_iter().map(item)),
    )
}

One has to box the iterator because it is not Sized; there is no other reason, or so I gather. I've heard about the impl Trait thing; it's good for the return type, they say. Experiments with

...-> impl Iterator<Item = WalkItem>

would show compiler errors only. Is the current implementation of dir_walk the only possible solution?

quinedot · January 14, 2022, 6:52am

You're going to need some sort of indirection, since you're defining a recursive type. You could maybe make some concrete iterator that involves the indirection yourself, but IMO, you might as well use what you've already got.

In other words, the recursive definition is another reason for boxing in this case, beyond the un-Sized nature of dyn Iterator.

Tyrn · January 14, 2022, 1:26pm

Suppose I'm too curious : who's doing the linking, and how? Of course, most of the linking happens inside

        dirs.into_iter()
            .flat_map(walk)
            .chain(files.into_iter().map(item))

but I have nothing to do with it. I can't yet see how my boxed iterator serves the purpose. Why it is not concrete and where is its linking point.

H2CO3 · January 14, 2022, 2:10pm

walk is a closure that returns the result of dir_walk so its return type is the same type that dir_walk returns. Hence, that same return value appears in dirs.into_iter().flat_map(walk).

quinedot · January 14, 2022, 9:09pm

By concrete I just mean, try to create some structs to implement these iterators yourself, without type erasure (dyn Iterator).

To try to highlight the link, here's a rough approximation of the unboxed WalkIterator iterator type:

Chain<
  FlatMap<
    IntoIter<PathBuf>,
    WalkIterator,
    Closure#1
  >, 
  Map<
     IntoIter<PathBuf>,
     Closure#2
  >
>

Remembering that iterators are lazy, not eager, what happens when you call next()? ^[1] First, Chain calls the FlatMap. FlatMap calls Closure#1 on the IntoIter, which calls dir_walk and gives back a WalkIterator -- let's call this WalkIterator#0. The FlatMap calls WalkIterator#0 and, assuming the result is not None, it returns the result to you. In the meanwhile, it holds on to WalkIterator#0 because it's going to need to call it again when you call next() a second time.

So at any given point, you're holding onto WalkIterator#0. But what does WalkIterator#0 consist of? The same thing as the type above -- it contains an iterator with a bunch of nested types, one of which is WalkIterator itself. WalkIterator#0 contains a WalkIterator#1 which contains a .... etc. This is how your recursive iterator is the same situation as the Cons type in that chapter of the book I linked to.

You could probably manage your own stack in a Vec instead of recursing, with the Vec would acting like a "fancy Box" to supply your indirection. Or you could keep the recursion and do your own boxing. But again, I really don't think it's worth it; you'll still have most of the indirection. Additionally, you're doing so much cloning of data here for each Item that even if recursive types wasn't a concern, I doubt you'd notice any advantage from removing the indirection at the top.

I guess it could be interesting to implement from a learning standpoint.

Disclaimer: I'm not going to fact check the order of operations here, so some details not pertinent to this discussion may be a little off. ↩︎

system · April 14, 2022, 9:09pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Chaining iterators in a loop without lots of Boxes	6	428	October 20, 2021
Storing an Iterator in a Box help	3	344	November 9, 2022
Why is boxing an iterator allowed? help	5	1118	May 17, 2021
Trying to make sense of functions returning iterator help	6	2461	January 12, 2023
Iterate through copies instead of refferences? help	3	3168	January 12, 2023

Unboxing Iterator

Related Topics