Generic `Iterator` with specific `Item` as function parameter

I'm trying to work out how to write functions that take generic iterable collection parameters, and it's been very tricky to figure out how to structure them. I've worked out some of the cases by trial and error, but I'm stumped at one. I want to be able to call a function like this:

let result = analyze_items(&collection);

…or perhaps:

let result = analyze_items(collection.iter());

…which I find less aesthetically pleasing (because it's not as clear that collection is not consumed in the process), but acceptable.

I've experimented with many different variations, and found a few things that work. First, there's the case where I want to accept only one type of collection:

fn analyze_items1<T>(collection: &Vec<T>)
where
    T: std::fmt::Display,
{
    for item in collection {
        println!("{}", item);
    }
}

This one is nice in that I can just pass &collection as the parameter, rather than explicitly calling collection.iter().

I also found a way to make both Iterator and Iterator::Item generic:

fn analyze_items2<T>(collection: impl Iterator<Item = T>)
where
    T: std::fmt::Display,
{
    for item in collection {
        println!("{}", item);
    }
}

This requires calling collection.iter(), rather than just passing a reference to the collection to the function, but it does work.

Last, for certain types, I can specify a concrete type for Iterator::Item:

fn analyze_items3<'a>(collection: impl Iterator<Item = &'a i32>) {
    for item in collection {
        println!("{}", item);
    }
}

Unfortunately, this doesn't work if the Item I want to use it on is of type &str (or any other reference type? I'm not sure). Can anyone tell me how I could rewrite this to operate on a Vec<&str> rather than a Vec<i32>? Or explain a better way to do this in general; I really feel like there should be a way for me to pass a generic collection that implements Iterator by immutable reference to a function, but I haven't been able to find it on my own.

Is this what you want?

fn analyze_items<'a, 'b: 'a>(collection: impl IntoIterator<Item = &'a &'b str>) {
    for item in collection {
        println!("{}", item);
    }
}

fn main() {
    let v = vec!["a", "b"];
    analyze_items(&v);
}

You can:

use std::fmt::Display;

fn analyze_items4(collection: impl IntoIterator<Item = impl Display>) {
    for item in collection {
        println!("{}", item);
    }
}

Which may or may not consume.

See also the docs for IntoIterator and how it works with for loops.
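To make the "may or may not consume" point concrete, here is a small sketch. The names join_items and demo are made up for illustration, and the helper builds a String instead of printing so the result is easy to check:

```rust
use std::fmt::Display;

// Hypothetical helper: same signature shape as analyze_items4, but it
// collects the formatted items into a String instead of printing them.
fn join_items(collection: impl IntoIterator<Item = impl Display>) -> String {
    let mut out = String::new();
    for item in collection {
        if !out.is_empty() {
            out.push_str(", ");
        }
        out.push_str(&item.to_string());
    }
    out
}

fn demo() -> (String, String, usize, String) {
    let v = vec!["a", "b"];
    let by_ref = join_items(&v);        // Item = &&str, `v` is only borrowed
    let by_iter = join_items(v.iter()); // an iterator is accepted as well
    let len_after = v.len();            // `v` is still usable here
    let by_value = join_items(v);       // Item = &str, `v` is moved into the call
    (by_ref, by_iter, len_after, by_value)
}
```

Whether the collection is consumed is decided by the caller: passing &v only consumes the reference, passing v moves the whole Vec.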

It's typical for all of


My understanding was that IntoIterator was meant for consuming the collection, thus not what I wanted. The idea of a reference to a collection implementing IntoIterator doesn't make much sense to me. If it's not consuming, why is it called IntoIterator instead of just Iterator?

Because it is not an iterator, but it can be turned into an iterator.


It does consume the reference, when the reference is the implementing type. That's the signature:

// Consumes `self`
fn into_iter(self) -> Self::IntoIter;

In the case of a shared reference, this may just consume a copy, and thus may not seem like consuming at all -- which is also true when you're copying integers around, say. &mut is not Copy, so it is consumed in a way more in line with your apparent intuition. (Though you could reborrow. It's less flexible than a copy as the original borrow cannot be used while the reborrow is alive.)

But they all follow the same pattern of the trait -- the implementer of the trait is consumed. That's just not always the collection itself.
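A minimal sketch of both cases (the function name demo is just for illustration): a shared reference is copied into into_iter, while a &mut has to be reborrowed if you want to keep using it afterwards.

```rust
fn demo() -> (i32, i32, Vec<i32>) {
    let mut v = vec![1, 2, 3];

    // `&Vec<i32>` is Copy, so each `into_iter` call consumes a copy of `r`.
    let r = &v;
    let first: i32 = r.into_iter().sum();
    let second: i32 = r.into_iter().sum(); // `r` is still usable

    // `&mut Vec<i32>` is not Copy. Writing `for x in m` would move `m`;
    // the explicit reborrow `&mut *m` only borrows it for the loop.
    let m = &mut v;
    for x in &mut *m {
        *x *= 10;
    }
    m.push(40); // `m` is usable again once the reborrow has ended

    (first, second, v)
}
```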


Perhaps you're really asking, why do the references implement IntoIterator? Well, the iterators they can turn into each handle a different use case:

  • &Collection can turn into an iterator over &Item
  • &mut Collection can turn into an iterator over &mut Item
  • Collection can turn into an iterator over Item

Owning, shared borrowing, and unique (mutable) borrowing are the cornerstones of Rust's ownership story, and these implementations can be very ergonomic. (And the playground I linked before is another example of not only ergonomics, but expressiveness.)
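The three cases in the list look like this with a Vec (a small sketch; demo and the variable names are made up):

```rust
fn demo() -> (i32, Vec<i32>, Vec<String>) {
    let mut numbers = vec![1, 2, 3];

    // &Collection: items are `&i32`, `numbers` is only borrowed.
    let mut total = 0;
    for n in &numbers {
        total += *n;
    }

    // &mut Collection: items are `&mut i32`, so they can be modified in place.
    for n in &mut numbers {
        *n += 1;
    }

    // Collection: items are owned `String`s, and `names` is moved by the loop.
    let names = vec![String::from("a"), String::from("b")];
    let mut moved = Vec::new();
    for name in names {
        moved.push(name);
    }

    (total, numbers, moved)
}
```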


That was the non-intuitive bit. I have a tendency to think of references as something other than objects, rather than objects in their own right. In this case, it's exacerbated by the fact that it's just the single reference to the whole collection that's being consumed, whereas Vec<i32>::into_iter() consumes each of the items. Now I can adjust my mental model so that it makes sense. However…

Not really; I'm more interested in why they don't implement Iterator (or why we're using IntoIterator instead here if they do). Is there no way to have the reference to the collection not consumed, as well? Say I wanted to pass a reference to a collection to a function, then call two other functions, each of which wants an immutable reference to that same collection? With IntoIterator, I'd need to make a copy of the reference first, but if there was an Iterator version, I wouldn't need that extra step. I still feel like what I want here should be doable with Iterator, and would make more sense that way.

Something that implements Iterator needs to have some mutable internal state that changes during iteration and keeps track of what the current item is. A reference to a collection does not do this. You can iterate over the same collection with many iterators (built from copies of a shared reference) in parallel if you want. Every one of those needs its own mutable internal state, which is why you need to construct these iterators first. The conversion done by the IntoIterator implementation is just a constructor of such an iterator, it bundles up the reference into the collection with some additional state that's necessary during the iteration.
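As a rough illustration of that "reference plus extra state" idea, here is a hand-written iterator. Cursor is a hypothetical type invented for this sketch; the standard library's slice iterator is implemented differently.

```rust
// Hypothetical iterator: a shared reference to the data plus the
// mutable position that a bare reference does not have.
struct Cursor<'a, T> {
    items: &'a [T],
    pos: usize,
}

impl<'a, T> Iterator for Cursor<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        let item = self.items.get(self.pos)?;
        self.pos += 1; // this is the state that changes during iteration
        Some(item)
    }
}

fn demo() -> (Option<i32>, Option<i32>) {
    let v = vec![10, 20, 30];
    // Two iterators over the same collection, each with its own position.
    let mut first = Cursor { items: &v, pos: 0 };
    let mut second = Cursor { items: &v, pos: 0 };
    first.next();
    first.next();
    (first.next().copied(), second.next().copied())
}
```

Advancing first does not affect second, which is only possible because the position lives in the iterator and not in the collection or the reference.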

I honestly don't understand how you think Iterator could help here. If you want to duplicate the (reference to the) collection in a generic function, you can do this at different levels of generality. Least general: accept a &'a C for some generic collection type C, with a where &'a C: IntoIterator<Item = &'a ...> bound. More general: abstract over the reference by requiring only that it implements Copy, i.e. accept C with C: IntoIterator<Item = &'a ...> or something like that, plus an additional C: Copy bound. Most general: make it C: Clone, which allows passing in collections that are more costly to duplicate (though note that a lot of types are still quite cheap to clone). In the last case, make sure to document clearly for the user of your function what is cloned, how often, and under what conditions, so they can estimate the cost of passing in expensive-to-clone collections.
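A sketch of those three levels, assuming the items are &i32 and the function wants to iterate twice. The function names are made up for illustration.

```rust
// Least general: take `&'a C` and require that the reference is iterable.
fn sum_twice_ref<'a, C>(collection: &'a C) -> i32
where
    &'a C: IntoIterator<Item = &'a i32>,
{
    let first: i32 = collection.into_iter().sum();
    let second: i32 = collection.into_iter().sum();
    first + second
}

// More general: anything iterable that is also Copy (e.g. `&Vec<i32>`, `&[i32]`).
fn sum_twice_copy<'a, C>(collection: C) -> i32
where
    C: IntoIterator<Item = &'a i32> + Copy,
{
    let first: i32 = collection.into_iter().sum();
    let second: i32 = collection.into_iter().sum();
    first + second
}

// Most general: anything iterable that is Clone. Clones `collection` exactly once.
fn sum_twice_clone<'a, C>(collection: C) -> i32
where
    C: IntoIterator<Item = &'a i32> + Clone,
{
    let first: i32 = collection.clone().into_iter().sum();
    let second: i32 = collection.into_iter().sum();
    first + second
}
```

With let v = vec![1, 2, 3]; all of sum_twice_ref(&v), sum_twice_copy(&v) and sum_twice_clone(v.iter()) should return 12, and v stays usable afterwards.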

function1(&collection);
function2(&collection);
function3(&collection);

Or:

let reference = &collection;
function1(reference);
function2(reference);
function3(reference);

Now that I've had a chance to play around with this, it does seem like what I wanted. The bit that I don't understand here is the second generic lifetime <'a, 'b: 'a> : What does 'b: 'a mean here? It appears to work just fine if I specify only the one lifetime, and use IntoIterator<Item = &'a &'a str>. I suppose the first one is the lifetime of the reference to the Item, and the second one is the lifetime of the string slice referred to? In which case, this looks like it might be backwards and should be &'b &'a?

Yes. In &'a &'b str the first reference is to the second reference, and the second reference is to the string slice. The constraint 'b : 'a is correct. It means that the second reference points to a string that lives at least as long as what the first reference points to, i.e. that the string lives at least as long as the reference to the string.


The approach with two lifetimes is more general. “'b: 'a” means “'b outlives 'a”, so in other words, it restricts the lifetime 'b to be longer than (or equal to) the lifetime 'a. This restriction is necessary for the type &'a &'b str to be a valid type.

For the record the <'a, 'b: 'a> syntax is like <'a, 'b> (so it introduces two lifetime arguments) plus a separate where 'b: 'a bound / constraint. Just like for type parameters with a trait bound, the introduction and some bounds on the parameter can be syntactically combined.
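A small sketch of the two spellings, plus a case where keeping the lifetimes separate matters. The names first_inline, first_where and demo are invented for this example.

```rust
// Bound written inline in the parameter list.
fn first_inline<'a, 'b: 'a>(items: &'a [&'b str]) -> Option<&'b str> {
    items.first().copied()
}

// The same bound written as a separate `where` clause.
fn first_where<'a, 'b>(items: &'a [&'b str]) -> Option<&'b str>
where
    'b: 'a,
{
    items.first().copied()
}

fn demo() -> &'static str {
    let word: &'static str = "hello";
    let kept;
    {
        let v = vec![word]; // the borrow `&v` ('a) cannot outlive this block
        kept = first_inline(&v).unwrap();
    } // `v` is dropped here
    kept // still valid: the result has the longer lifetime 'b
}
```

If the return type were tied to a single shared lifetime instead, the result in demo could not be used after v is dropped.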


That clears it up a whole lot, thank you. It still seems a bit confusing, because we can get an iterator from a collection by calling collection.iter(), but we can also just use the collection itself as an iterator (at least in the case of Vec). Am I right that for item in collection {} calls collection.into_iter()?

But apparently, collection_ref.into_iter() doesn't invalidate collection_ref, so neither the object nor the reference is consumed. This makes sense functionally, but it violates the principle of "into" taking ownership, alas.

Too bad it's not written as 'b >= 'a. Thanks!


Yes. For more details see the desugaring described in the reference.
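Roughly, and simplified compared to the exact expansion in the reference, the two functions below do the same thing. The function names are made up for this sketch.

```rust
fn sum_with_for(v: &[i32]) -> i32 {
    let mut total = 0;
    for item in v {
        total += item;
    }
    total
}

// Simplified hand-written equivalent of the `for` loop above.
fn sum_desugared(v: &[i32]) -> i32 {
    let mut total = 0;
    // The loop expression is passed through IntoIterator::into_iter once...
    let mut iter = IntoIterator::into_iter(v);
    // ...and then `next` is called until it returns None.
    loop {
        match iter.next() {
            Some(item) => total += item,
            None => break,
        }
    }
    total
}
```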

Thanks, @tczajka, @quinedot, @steffahn, @chrefr! You've all been very helpful, and I believe I now have a sufficiently complete understanding of the topic. :+1:


It becomes manageable and somewhat easy to remember once you start calling the colon the “outlives” relation (in these cases). Whenever in doubt, read it as “left-hand side outlives right-hand side”, and it becomes possible to deduce the correct interpretation with a bit of concentration. The reason it's this way is consistency with T: 'a bounds (which translate to a set of 'b: 'a bounds for all lifetimes 'b syntactically appearing in the type T), which are pretty similar to (and relate to, and can even be combined with) trait bounds. So there are three interpretations of the colon, depending on whether there are

  • lifetimes on both sides
  • a type on the left and a trait on the right
  • a type on the left and a lifetime on the right

Compare, e.g., Table B-5 in the book appendix.
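One function per reading of the colon, as a sketch (pick, show and boxed are hypothetical names):

```rust
use std::fmt::Display;

// Lifetime on both sides: 'b outlives 'a.
fn pick<'a, 'b: 'a>(outer: &'a &'b str) -> &'b str {
    *outer
}

// Type on the left, trait on the right: T implements Display.
fn show<T: Display>(value: T) -> String {
    value.to_string()
}

// Type on the left, lifetime on the right (combined with a trait bound):
// every reference inside T must outlive 'a.
fn boxed<'a, T: Display + 'a>(value: T) -> Box<dyn Display + 'a> {
    Box::new(value)
}
```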

Vec doesn't implement Iterator, it implements IntoIterator (which is a "collection" trait). The confusing thing though is that Iterators also implement IntoIterator which just returns the same iterator again. So you can do this:

for i in v.into_iter().into_iter().into_iter() { /* ... */ }

Shared references are copyable, they implement Copy, which means moving (consuming) them doesn't destroy the original.
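Because every Iterator is also IntoIterator, a function that asks for IntoIterator accepts both collections and iterators. A small sketch, with total and demo as made-up names:

```rust
// Hypothetical function that accepts anything convertible into an iterator of i32.
fn total(items: impl IntoIterator<Item = i32>) -> i32 {
    items.into_iter().sum()
}

fn demo() -> (i32, i32, i32) {
    let v = vec![1, 2, 3];
    // An iterator passed where IntoIterator is expected.
    let from_iter = total(v.iter().copied());
    // Calling into_iter() on an iterator just returns the iterator again.
    let chained = total(v.clone().into_iter().into_iter());
    // The collection itself, moved into the call.
    let from_vec = total(v);
    (from_iter, chained, from_vec)
}
```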


I think the way I'll remember the relationship 'b: 'a is that it is akin to T: Trait in that T is a superset of what's defined to satisfy Trait and the lifetime 'b is a superset of the lifetime 'a.

If Vec were its own iterator in particular, it would always have to carry around some extra state keeping track of how many items it had returned off the front. [1] But that's a different data type -- a VecDeque.

Even then though, the standard library doesn't do this; an iterator is always a distinct type from the collection [2]. Arguments for why this is a good thing include


In contrast, one group of standard types that are Iterators while simultaneously serving other primary roles are Ranges of various types, for example a Range of integers. However, because it is considered a footgun for an iterator to also be Copy, a Range of integers does not implement Copy. (It's a footgun because if you can easily accidentally copy your iterator, you can easily keep around logically stale state.)

This causes various levels of consternation, as you can read in this issue and the things it links to.
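A short sketch of what that means in practice (demo is a made-up name): the Range is its own iteration state, and duplicating that state takes an explicit clone.

```rust
fn demo() -> (Option<i32>, Option<i32>, i32) {
    let mut range = 0..3;
    let first = range.next(); // the range itself is the iterator state
    // Range is Clone but not Copy, so taking a snapshot has to be explicit.
    let snapshot = range.clone();
    let second = range.next(); // advances `range`, not `snapshot`
    let rest: i32 = snapshot.sum(); // the snapshot still covers 1..3
    (first, second, rest)
}
```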

Like VecDeque, &[T] could be an Iterator, but it is also Copy. Instead we have slice::Iter, so the conflict is avoided. (The current Iterator implementation is also optimized in ways not necessarily possible using the representation of &[T] itself.)
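For example (a sketch, with demo as a made-up name), the iteration state lives in the slice::Iter value while the &[T] you started from is untouched:

```rust
fn demo() -> (Option<i32>, Vec<i32>, usize) {
    let data = [1, 2, 3];
    let slice: &[i32] = &data;

    let mut iter = slice.iter(); // a std::slice::Iter, separate from `slice`
    let first = iter.next().copied();

    // as_slice() exposes what the iterator has not yielded yet.
    let remaining = iter.as_slice().to_vec();

    // The original slice still covers all three elements.
    (first, remaining, slice.len())
}
```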


  1. Or be a non-DoubleEndedIterator that always pops off the end, but that would surprise pretty much everyone.

  2. Even though for this particular example, it's currently just a newtype wrapping the VecDeque, as you can see by clicking the source link.