Efficiency of `.contains` on `Range::into_iter()`

Well that's what Range does, really: it .clone()s the start each step to give it to you. (Of course, when you're using it with i32 or similar nobody cares about that "cost" since it doesn't really exist.)

The iterator trait generally expresses that with adapters. For example, Iterator::cloned takes an iterator over &T and gives you an iterator over T instead by cloning the items.

Ehmmm. Ok. Let's talk about Ownership in this context. The Range iterator does several steps in this context:

  1. It creates a Value of your certain type and allocates Memory for it. An i32 would need 32 bit for example.
  2. It obtains the Ownership over it and gives it to you.
  3. If nothing done with that ownership, the value simply exceeds it's lifetime and destructs itself.

So, if you're only storing the values of a Range iterator that you're collecting, As Rust is a compiled language, the compiler will try to determine how many values a Range iterator call needs to collect by explicitly calling the collect() function. All collected values 'outlives' the iterator and therefore require their own memory allocation. For example, if you use your Range iterator as a simple for loop, you only need to allocate memory for one value of your given type because every new value generated by the Range iterator uses the memory of its previous user because it has already been destroyed. The memory cost is determined by what you do with the iterator. For example, if you call an iterator on a [u32; 8] and collect it using collect::<Vec<u64>>(), the compiler will allocate more than double the amount of memory needed for your field because the values and the vector pointer at the end of the iterator call are larger.

However, there are cases in which allocating memory for a value itself can reduce performance, for example if the value is extremely large or is always put on the heap. In these cases, you could put every value into a smart pointer, such as an Arc, and instead of letting the value destroy itself, simply throw away its pointer. Bear in mind that this kind of performance optimisation must be considered precisely, because it could also lead to a performance decrease if the pointer itself is larger than the value or if there are too few values.

Thanks. There's a lot to unpack in that, I'll take some time to think it through. Fundamentally, though, I think my mental picture of Rust's memory model is too simple. I'm off to find some docs on that subject, as well :slightly_smiling_face:

Hmm, I'd never heard of Iterator::cloned. I'll go take a look. How do people learn all these methods? There's just so many variations and I've yet to find a good "which one should you pick" overview. It's like into vs from vs to_owned, etc. I've yet to get beyond the "keep trying alternatives until you find the one the compiler doesn't reject" approach :slightly_smiling_face:

Right, that's my next project, I guess!

Thanks to everyone who's commented - this has been a really interesting and informative discussion.

You just learned one :slight_smile:. Or you see it in the wild and look it up. Or you wonder if what you want exists already and try to look it up. (And if it's not in std you go check itertools.)

Alternatively, take an hour or two and read or skim through them all. You don't have to memorize them, just plant some seeds about what's available. To spot adapters like .cloned,[1] look for methods that have a ⓘ by the return type:

fn cloned<'a, T>(self) -> Cloned<Self>

The text and examples will tell you in practical terms what the adapter does. Alternatively, if you hover the ⓘ, it will tell you some details about the adapter's Iterator implementation (like what the Items are):

impl<'a, I, T> Iterator for Cloned<I>
where
    T: 'a + Clone,
    I: Iterator<Item = &'a T>,    
    type Item = T;

Which can be nice if you're more technically minded -- i.e. with some practice,[2] you can decipher the signature and bounds yourself. In this case: ".cloned() turns an iterator over &Ts into an iterator over Ts, provided T: Clone."


  1. versus methods that consume the entire iterator ↩︎

  2. compare with the examples ↩︎

Most of these methods are things you could do yourself, just with more work. For example, .cloned() is equivalent to .map(Clone::clone). So, a path to learning is:

  1. Write verbose, simple code.
  2. Note common patterns.
  3. Consult the standard library for things that fit those patterns.

More abstractly, the key to knowing how to pick the right standard library function is knowing what your types mean in the first place. If you know the difference between &i32 and i32, then you know that to get i32 from &i32 is a clone or copy, and so you can either write the verbose code or look for adapters that do that for you, and in either case, you’re not stuck with trying things at random.

With practice and reviewing the documentation, you will eventually simply have most of them memorized, just like you know what the English words “people” and “methods” mean and you don’t have to look them up.

To point you in possibly the wrong direction, the somewhat complex "Generic Associated Types" (or GATs) feature was partially motivated by the ability to define a "lending iterator" that was permitted to return a reference into the iterator itself, which blocks the implementation of methods like collect.

Unfortunately, while the trait itself is relatively simple in it's base form:

pub trait LendingIterator {
    type Item<'a> where Self: 'a;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

there's AFAIK no current plans to add it to the standard and it can't be directly implemented on Iterator types without conflicting with any manual implementations; so you're not actually saving much effort going this way.

As a rule of thumb; if you're trying to implement a trait and you're wondering if you need GAT, you're probably overdoing something. A GAT should either be incredibly obvious to you or this trait is basically the whole point of the crate you're writing. It's far too easy to think you're solving some immediate issue but you're actually just pushing it further down into an implementation where the underlying reason is even less clear.

If you take all these together...

...then it may seem like Iterator implementations can only give you ownership. And in some sense, this is true -- the Iterator trait does not allow an implementor to hand out references to its own fields (that would be a lending iterator). But then you may ask, how can Iterator implementations that hand out references (or borrowing types more generally) exist?

Those can work because they're not handing out borrows of[1] things that they own, like their own fields. Those iterators are, themselves, borrowing from the actual owner -- and they're handing out copies or parts of their borrow in their Iterator implementation.

For example, you can do this:

    let mut array = [0, 1, 2];
    // We create a `&mut [i32]`.
    let initial_borrow = &mut array[..];
    // We split the original borrow to create `first` (a `&mut i32`)
    // and `last_two` (another `&mut [i32]`).
    let [first, last_two @ ..] = initial_borrow else { unreachable!() };
    // And so on for the other elements.
    let [second, last_one @ ..] = last_two else { unreachable!() };
    let [third, _empty @ ..] = last_one else { unreachable!() };

    // These split up borrows can all exist at the same time.
    *third += 1;
    *second += 2;
    *first += 3;

Which is more or less how &mut [T] iterators work.

One way to think of this is that the iterator is giving you ownership of the &mut Ts (but not the Ts).


  1. e.g. references to ↩︎

A slight variation on kpried's answer:

  • write it myself
  • clippy tells me if there is a method

(I have clippy as the linter used by rust-analyzer in my IDE, I think that's now the default)

If I write a simple .map() with a closure that can be provided by a standard method clippy usually has a lint that tells me!

Rust is the first (and only language) where I would list "compiler errors" and "default linter errors" as two of the best ways to learn!

A few of the posts here keep touching on something that still unconsciously struggle with in rust - remembering the power of zero-cost-abstractions.

I'm so used to "if I code it then the CPU will do it" that I still keep catching myself forgetting that the rust compiler will analyse my code and optimise it from "good to read" to "good to execute" in all but the most extreme of cases. (zig makes a feature of the exact opposite)

For example:

"calling into_iter() on an Iterator is a no-op" - in my head this translates to, "it does nothing, other than burn a cpu-cycle, which usually doesn't matter". In rust it translates to "the compiler will throw away the call, so you can add it to make your code more generic without any extra cost".