Efficiency of `.contains` on `Range::into_iter()`

I'm writing a program that needs to handle numbers within a specific range. The program itself isn't that important (FWIW, it's calculating "Square loops"), but I want to make it as general as I can, in order to be able to use it for variations on the original problem.

I'm struggling to find the best way to express the type of one of my arguments. Logically, it's a collection of values (i32, typically), and all I ever do with it is extract the "first" value, and test a value for membership.

At the moment, I'm using a RangeBounds<T> as my parameter type, because the typical usage is to pass a range like 1..10. The .contains method is efficient (it's just 2 comparisons), and I can get the first value via .start_bound() (with some juggling to catch ranges with no lower bound). But this limits me to only passing ranges. An alternative would be to declare the type as Iterator<T>, and use .first() and .contains(). If I do that, I'd pass (1..10).into_iter(), which is fine, but my concern is whether that would still use the efficient .contains() implementation that a Range object can support.

To some extent, this doesn't matter - for my current use, it's fast enough either way. But I'm interested in learning how to write good Rust code, not just solve my current problem. And I don't know how I can find out the answer to this question. Any pointers would be very helpful :slightly_smiling_face:

std::range::RangeIter doesn't appear to provide its own implementation of .contains(), nor does std::ops::Range, so it will probably go over all numbers in the range.

You could accept an iterator of ranges, perhaps?

Iterator doesn't have contains or first methods though, so that makes it unclear what you're asking.
If you mean Iterator::find, then that one takes a closure, so we can't optimize it at the library level to do the O(1) thing for integers because we don't know what user-provided closures do.

LLVM still might be able to, let's see what godbolt says:

Nope, still a loop

example[98f869ed581a9870]::contains:
.LBB0_1:
        mov     rax, rsi
        cmp     rdi, rsi
        je      .LBB0_3
        lea     rsi, [rax + 1]
        cmp     rax, rdx
        jb      .LBB0_1
.LBB0_3:
        cmp     rax, rdx
        setb    al
        ret

In these kind of situations I usually feel better if I apply YAGNI rather than DRY.

If, however, you want to explore how to make your API "just right".

  1. Make sure you have a test that covers your current use case
  2. Add another that covers a different case, which you expect to be currently unsupported
  3. Change the signature and code to support this
  4. Tweak them code until it feels "as good as it needs to be right now"
  5. Repeat from step 2

This will probably lead you in the direction of a trait bound on Iterator, Index, OR killing them idea and sticking to the specific case as contains is a direct impl method and but a trait in stdlib.

In other words. The question sounds a little to wide scoped right now. Can you squeeze it down to a specific case that works and a specific one that doesn't?

std::ops::Range does - see here. Or am I misunderstanding the docs?

Huh, that's confusing. I must have mis-transcribed something when I simplified my example. For first it's not so much of an issue, as I can always use next instead[1]. But contains is a bigger problem - I am using contains, maybe I'd left the Range constraint around so the compiler used Range::contains without me realising.

But it's a good point. Any notion of containment on an iterator will exhaust the iterator, so that's not what I intended. I'm translating concepts from Python (my main language) into Rust, and I was probably thinking of iterables, rather than iterators, and then confusing the translation. But really, the equivalent in Python of what I want is Container (which supports containment and iteration). Is there such a trait in Rust? One of the big difficulties I have when writing Rust is finding traits - it's easy to see what traits a particular type supports, but much harder to find what traits exist.

Should I be writing my own trait? That seems a bit over-engineered, but maybe it's my expectations that are too grand. Coming from Python and its duck-typing, I'm used to the feeling that I can use a bunch of methods that I want, and then it's just a matter of making sure that what I pass "conforms" to my expectations. But that may be too much to ask of Rust, where compile-time type checking is so much stricter.

The problem is, it's not really a matter of "working". If we ignore the confusion I describe above, and assume I have the right declarations so that I can call something like contains on my argument, a linear search through the valid values will always work. But for Range, that's unnecessary - you can just check if the supplied value is between the lower and upper bounds. And that's what I'm trying to ensure - that I get the efficient implementation rather than the inefficient one.

Yes, I know premature optimisation is the root of all evil, and all that. But I'm not optimising prematurely here. I have a working program, and if that was all I wanted I'd stop there. But I want to learn how to write efficient and extensible Rust code, and I don't want to wait until I have a huge project with significant efficiency requirements before I learn the basics. So it's not really YAGNI - I don't need it to solve my initial problem, but I do need it to address my learning goals (if that makes sense...)


  1. Although it does advance the iterator, affecting other operations ↩︎

Simplifying things a bit further, but at the cost of making the example more abstract, suppose I have a function that can take either a Vec<i32> or a Range<i32>, and in both cases I want to be able to do two things in the function:

  1. Get the "first" element of the argument (treat ranges like ..2 as having no "first" element, just like an empty vector). Basically, arg.into_iter().next()
  2. Check if a value is contained in the argument, i.e., arg.contains(val).

How do I specify the type of the argument?

There isn't such a trait[1], and as far as I know one of the relevant reasons against having a very general “Container” trait/interface is the vastly different performance characteristics of certain methods between different container types.

Another problem with functionality like a contains method in particular is that efficient implementations of such functionality can depend on traits like Ord (or Hash) whereas linear scans only require [Partial]Eq. The analogous problem exists for example in Haskell, where they have a Foldable trait[2] that offers a function called elem to check for element membership in - basically - any type you can iterate over. But the Set datatype needs to offer its own member function, and using elem on it is actually an anti-pattern / bug, since it's much less efficient.


  1. unless you count IntoIterator, I guess? ↩︎

  2. in Haskell lingo a "typeclass" ↩︎

I fully understand. I was in that situation (from python to rust) 2 years ago and Combining rust & python - FizzBuzz was the result (I challenge anyone not to scream YAGNI at me!). But the whole point was the learning journey.

I still find that "DRY before YAGNI then iterate in little bits" as a mindset helps me, particularly in these situations where I know I'm going to go well past YAGNI.

As for how to work through and find this stuff in rust:

  1. Look for a trait in stdlib - I intutitively searched for "contains" in stdlib docs and ran down the list looking for a Trait - no luck
  2. Decide whether it's worth looking for a 3rd party trait (crates.io & docs.rs are even better for this than pypi). In your case - certainly not the road I'd go down.
  3. Roll your own trait - to get a feel for how this would work. Maybe something like trait Bounded with a default implementation of fn contains(&self, &val) -> bool that calls self.contains(val) then impl that for anything you feel like that is in stdlib and has a contains method. (Others could impl your trait for their types, but stdlib can't.)
  4. Spend some time working around the ideas of overlapping impls when you think about adding a generic impl<T> Bounded for T where T: IntoIterator that uses find.[1] (That's a good part of rust to learn)

I hope this helps, even if it (sometimes deliberately) opens more avenues for questions than it provides answers


  1. The "more correct" solution involves using find in the default solution and the more performant contains in the specific cases. But seriously - don't start there, it won't help with gaining the right understanding. ↩︎

You can of course define a trait for your specific usecase. It could look something like this:

Remember that this can, at least hypothetically, allow Range<f32> as well, where having an Iterator over the values would be really awkward or mean something different.

So I'd say you should think about whether you're actually talking about an interval or a collection.

Well, rather than membership, what about thinking of it as a value and a predicate? Do you actually care about "membership"?

You could leave the range version for convenience, but implement it by calling something that takes range.start and |x| range.contains(&x) as separate parameters, so people who want something more complex than the range can pass the parts separately.

Interesting, thanks. It feels a bit annoying to have to implement the trait for every standard collection individually, but that might just be because I'm used to the way that Python's duck typing just happens automatically (with a runtime cost, of course - you don't get something for nothing :wink:)

That's a very good point, and it makes me realise that my intuitions about good API design are shaped a lot more than I realised by my Python background.

Having 2 arguments (start and predicate) is definitely something I will explore. One thing that concerns me about it, though, is that if overused, the pattern will inevitably result in APIs that have long lists of arguments, which doesn't seem like a good design (my real code already has 5 arguments, this will take it up to 6). Are there any good Rust patterns to keep argument counts under control?

Generally making more types :slight_smile:

For example, if you combine what I said and what @jendrikw said about a trait, it could still be one parameter where you can pass a Range or a CustomStartAndPredicate.

In the limit, you end up at the builder pattern if you have a ton of options, especially when they have good defaults. foo(None, None, 4, None, None) is always horrible, but foo(FooOptions { compression_level: 4, .. }) is fine.

For what it’s worth, you’d almost always want to take IntoIterator rather than Iterator in this case.

I like that as a general piece of advice :slightly_smiling_face:

Thinking some more about the design, I am wondering if the right approach is to have a struct that takes the various parameters for the calculation. Add constructors and configuration methods to set the parameters appropriately, and then have a "do the work" method that actually does the calculation.

That sounds like it might be both flexible and clean. I'll give it a try and see how it looks.

Thanks, good point.

But when I started investigating, I realised that into_iter() takes ownership of self, which means you can only call it once. That makes doing something like iterating over the argument twice impossible. So that leads me back to a variant of my original question - if I have code like this:

fn iter_twice(val) {
    for el in val {
        println!("{}", el);
    }
    for el in val {
        println!("{}", el);
    }
}

how do I type the function so that I could pass either vec![1,2,3] or 1..3 to it, and get the expected answer?

Doing some further experimentation suggests to me that the Range type isn't actually as capable as I'm expecting. In Python, range(1,3) acts essentially like a collection of the numbers 1,2. But in Rust, it seems to be a one-shot iterator (so that it can only be traversed a single time). The new std::range::Range type seems to be a little better behaved - I can call iter() on it multiple times to get an iterator (and I can do the same with Vec). So another approach might be to require std::range::Range, and call iter() within my iter_twice function. But if I do that, I'm back to the question of how to I write the type for val that says it's "something you can call iter() on"?

Sorry, this has become somewhat rambling. It feels like every time I think I'm getting closer to the answer, I hit a different issue. Which suggests I'm not thinking about the problem the right way. Any advice would be highly appreciated, because I genuinely can't see why iter_twice is so hard to write correctly and generically.

Unfortunately, there is no single standard library trait bound that lets you do this efficiently in all cases.

  • If you accept I: IntoIterator<Item = i32> + Clone, then you can .clone().into_iter(). But that allocates a second Vec unnecessarily.
  • If you accept I where &I: IntoIterator<Item = &i32>, then you can iterate twice (equivalent to .iter()) but that doesn’t accept Range.

(The reason this doesn't just work is that, in general, to receive the elements of a Vec by value, the Vec must be consumed, and there is no special case for when the element type is Copy; but a Range doesn't store all its elements in memory, so it can't produce an iterator of references.)

Therefore, there is a tradeoff to be made here.

  • If what you want is, in general, to accept arbitrary iterators, then you can accept I: IntoIterator<Item = i32> + Clone, and leave it to the caller to pass either vec.iter().copied() or 1..3 as the iterator. However, this has the hazard of unnecessary cloning of the Vec.
  • If you really want to accept vec![1,2,3] or 1..3 without further complication and with optimal results, then you need to define your own IntoIterator-like trait, that accepts the input by reference, and implement it for both types.

You can .clone().into_iter() to use it more than once, which has the same effect as using the new Range type.

There is no trait corresponding to iter(), so there is no way to do that. For collections, the idiomatic approach is to require &C: IntoIterator, which works the same as the usual .iter() on collections, but again, this doesn’t apply to ranges because ranges can't hand out references to their elements.

That’s not quite right. In Python, range() collection of values, but a lazy sequence type (an immutable iterable). When you call range(1, 3), it doesn't generate the numbers upfront; it just defines a range from 1 up to (but excluding) 3, yielding values on demand.

Rust takes a very different approach: its iterators are inherently stateful and consume themselves as they are advanced. The core philosophy is to create a structure that steps through data, often by taking ownership of it. This allows you to turn a vector into an iterator, transform the elements step-by-step, and collect them into a new data structure - often again an vector - with zero unnecessary memory overhead. So yeah, they look like a oneshot, because the whole Idea is to use an Iterator in one shot.

Thanks. So it sounds like the issue is less to do with my understanding of iterators (which is weak, but I'd have got there apart from the other matter...) but rather of how Range works.

I can accept how ranges work - they are implemented a certain way, and being upset if it's not how I'd like them to behave is pointless - but I'm not sure I understand why ranges are the way they are. The fact that there's a new std::range::Range suggests that there's trade-offs, and the original choices are now considered suboptimal. As an educational matter, I'd like to understand better - is there a good discussion of why the new ranges were added anywhere (that would be understandable for a relative beginner like me)?

One thing I don't really follow:

Why is that? Is it simply because:

  1. Ranges don't store their elements.
  2. The element type could be complex (specifically it might not implement Copy), meaning that "create a temporary value and hand it out" isn't viable.

And how does that affect iterability?

Grr. I keep thinking at too high a level (Python hides memory issues like this from you) but getting confused when I try to drop to a low level (C doesn't have abstract concepts like iterators or containers). There's this "uncanny valley" feeling where Rust concepts look familiar, but actually aren't.

Thanks for your patience, and I'm sorry for the dumb questions.

This one. 0..1000000 is

Range { start: 0, end: 1000000 }

and there is no 294842 value for a reference to point to.

Thanks. I think I get that. But what I don't fully understand is the ownership requirements of an iterator. If I iterate over a range, the range could in theory create each element on the fly, and give ownership to me. The (only?) problem with that is the efficiency cost of creating elements as needed. Is that really the issue here? And if so, couldn't there be a way of saying "I don't mind the cost of creating the elements[1] so make them and iterate that way"?

I'm starting to feel that I should implement my own Range type that has the semantics I want. But that feels like I've probably missed something. Maybe the answer is to try, and then I'll work out what I'm missing in the process :slightly_smiling_face:


  1. For integers, and anything that implements Copy, the cost is by definition negligible ↩︎

The issue is actually very simple: the original std::ops ranges implemented Iterator directly. It then turned out that them also implementing Copy was a foot-gun: it was easy to accidentally advance a copy of the iterator instead of the original iterator, resulting in buggy algorithms. At the time of Rust 1.0, this problem was solved by having ranges not implement Copy.

Today, there is general agreement that this was a mistake, and having a copiable range data structure is more important than being able to use a range as an iterator directly. Therefore, the new std::range types implement Copy, implement IntoIterator, and don't implement Iterator.

(Another advantage is that RangeInclusive can have a better implementation when it doesn’t need to function as an iterator. std::ops::RangeInclusive has an extra boolean field that is only used for getting the correct iteration behavior, but std::range::RangeInclusive is a struct with only two fields.)

Even if the element type implemented Copy, an iterator cannot hand out a reference to a temporary value. Who would own that temporary value, and how long does it live? Remember, every iterator is collect()able, so if a range iterator worked this way, you could collect() it into a Vec<&i32>. This implies that there is, somewhere, storage for every element of the range. Doing that implicitly would be an unacceptable cost, since the storage would have to exist regardless of whether or not the iterator was collected. (In practice, it would also be nearly impossible to make that storage come to exist at the right time and live long enough.)

Creating each element on the fly, and giving you ownership, is what the range iterators we have actually do. But if you demand that the iterator give you references, then, by definition of what a reference is, you don't have ownership. So who does?

This would be an excellent exercise, as long as you’re prepared to carefully think about borrow checker errors!