So, it sounds like you haven't had much experience writing custom iterators. It seems to intimidate people for some reason, but I also think Rust makes it pretty easy.
Mechanical start for any iterator
Start with a skeleton:
struct MyIter;
impl Iterator for MyIter {}
Copy, paste, and adust from the error output:
struct MyIter;
impl Iterator for MyIter {
type Item = String; // or whatever; presumably you know what you want
fn next(&mut self) -> Option<<Self as Iterator>::Item> {
todo!()
}
}
And now "all" you have to do is supply the state fields and logic. (And once you've written a few, you'll have memorized the pattern and can just slam out the framework without consulting the compiler errors.)
Probably the most challenging part is getting in the habit of maintaining state so you can return one item at a time.
Using a slice for your state is usually pretty straight-forward as a slice is sort of a super-iterator (easy to split off elements or subslices, random access, et cetera), and that's what I started with. The state is just remaining elements, and the return was some prefix of the slice. I half-ignored your OP here really, by returning borrowed values versus an owned Vec<Vec<_>>
. The core of the idea was really "it's straightforward to split a slice based on the sum exceeding some bound".
let mut sum = 0;
match self.remaining.iter().position(|foo| {
sum += foo.weight;
sum > self.limit
}) {
Then in this post, I addressed your additional constraints. I'll recreate the process as best as I remember.
First I iterated on my original playground to handle the arbitrary type with a "getter" closure. Clearly, Foo
would have to be replaced by a type parameter, and we'd need another for the closure too (if I was going to keep using the slice as my state, which would be the most direct change).
struct GroupBySum<'a, T, F> {
limit: usize,
remaining: &'a [T],
func: F,
}
Adjusting the Iterator
implementation was pretty much again a mechanical exercise that resulted in the implementation that calls the closure multiple times on overflow elements. It didn't even occur to me on the first pass. The state is still "slice of unreturned elements".
I used FnMut
as it's the most flexible for callers, like someone else covered above. You rarely need the flexibility of Fn
in my experience, and this allows callers to use stateful closures without resorting to interior mutability.
Next I moved on to the "make this work for any iterator, not just slices" constraint. I found it clear from the start that a Vec
would be needed; that's pretty much what comes out of "collect into groups of dynamically determined sizes (our return Item
) from an element-by-element iterator". So I started with
struct CollectBySum<Iter, F> {
limit: usize,
inner: Iter, // replaces the slice
func: F,
}
And because of the nature of the problem -- take items until overflow and then group everything except the overflow item -- my first thought was to use a peekable iterator. I think I started writing that even, but that's when I realized that I would be calling the closure on the overflow element more than once. I could have continued anyway by just storing the overflow size, but decided it would be more straight-forward to just forego peeking and store the element myself as well.
struct CollectBySum<Iter, Item, F> {
// ...
overflow: Option<(Vec<Item>, usize)>,
}
I think the reason I went with Vec<Item>
is it fell out naturally from writing
// Start with the overflow from the last call, or with nothing
let (mut storage, mut sum) = self
.overflow
.take()
.unwrap_or_else(|| (Vec::new(), 0));
Without the slice, position
and slice splitting isn't an option, so I wrote the for
loop and moved the "empty or not tail" check to the end.
Then I went back and modified the slice version to not call the closure more than once on each element by following the same concept of take
-ing the overflow or not using slices and an offset of 1 or 0.
Aside from backtracking on Peekable
and realizing I'd made an arguable goof by calling the closure on the same element more than once initially, I think the biggest stumbles were getting my generic parameters declared in the right spots. I remembering getting an "unconstrained type parameter" on some attempt but not which one or where (Iterator
implementation or helper / IteratorExt
trait).
Now to try and connect this back to your process question. The process I went through was iterative (heh), so it was more like "replace &[Foo]
with &[T]
", and then "replace &[T]
with Iter: Iterator
". But if I had started with the ending constraints, it probably would have gone mostly like you describe. Many custom iterators do actually wrap another iterator, so it's pretty common to have
struct MyIter<Iter> {
inner: Iter,
}
impl<Iter: Iterator> Iterator for MyIter<Iter> {
type Item = SomethingInvolving<Iter::Item>;
fn next(&mut self) -> Option<Self::Item> { todo!() }
}
Or perhaps
impl<Iter: Iterator<Item = ConcreteType>> Iterator for MyIter<Iter> {
type Item = SomethingInvolving<ConcreteType>;
fn next(&mut self) -> Option<Self::Item> { todo!() }
}
(And this common pattern of wrapping an inner iterator is also why I likened slice-based versions to having a "super-iterator" for state.)
So my first stab probably would have looked like this.
- What do I need for state?
- The limit
- An inner iterator
- A closure to get a
usize
out of an &Iter::Item
- That gives the struct (leave off the bounds)
- What bounds do I need on the implementation?
Iter: Iterator
F: FnMut(&Iter::Item) -> usize
When moving on to managing the state in next
, the need for dealing with the overflow item without returning it would inevitably come up. With the direction I took, Item
ended up as a parameter to CollectBySum
and that changed the bounds a bit. If I had gone with Peekable
and only stored a usize
for overflow, I don't think the bounds would have changed. Or I could have added bounds to the struct definition and used Iter::Item
instead of type parameter, but I guess I'm in the habit of avoiding bounds on structs strongly enough that I don't think it occurred to me at the time.
These adjustments from the initial skeleton are more general Rust adjustments than Iterator
specific adjustments, IMO.
Once I understood the goal, I didn't consider or try to adapt your imperative code, I just thought of how to achieve calculation of each item (group within a given sum). Again, this might be the biggest challenge of getting the hang of iterators -- thinking in terms of maintaining state so you can generate one item at a time.
That said, in this case the end result wasn't really that far off from your imperative code, looking back at it.