? #include <algorithm>

Hi Rust Community!

Why didn’t you just “copy” the “#include <algorithm>”?

I purposefully wrote the question like that, because, as web-search-results have shown, many think of the STL as just the containers, sadly.

I found some similar algorithms in Iter and Slice, but partition for example is not in-place, and iterators do not have the input-forward-bidir-randomaccess hierarchy, … lots of differences.

I suspect these differences are for the sake of safety, but I do not understand how, and I am not convinced it is worth it.

The STL is really thought out design, probably the best library ever. Why didn’t you guys just adopt its design?

Could you explain or point to some explanation.

Thanks.

C++ iterator invalidation is probably the biggest flaw in that design, which Rust solves. C++ iterators are really cursors, and there’s no tracking of ownership or protecting immutability. Rust iterators are more like ranges, which in part enables Rust to also track borrowing and protect that memory.

I think an in-place partition could be possible to implement for DoubleEndedIterator<Item = &mut T> – I might experiment with this. I’m less sure about emulating C++17’s stable_partition, but AIUI C++ also uses extra allocation for that.

We have traits that do this sort of hierarchy, Iterator -> DoubleEndedIterator -> ExactSizedIterator
It is different, but as @cuviper said, this is because Rust iterators are more like ranges than cursors.

Also, Rust iterators are very powerful, just look at the docs, Iterator there are 57 functions that do stuff on all iterators and 28 combinators to build up complicated iterators form rather simple primitive ones. All of this for just implementing a single associated type and a single function.

Moreover you can increase the performance you your iterators by implementing the size_hint for more a efficient collect and nth for more efficient skip iterator, and finally fold for more efficient iteration over the rest of your iterator. So with a grand total of 4 functions you can get a super efficient iterator with a whole host of combinators and other functionality.

There are even some simple functions like successors and repeat_with and my personal favorite, from_fn that allow you to make ad-hoc iterators very easily.

I would say that the single best designed interface in Rust is the Iterator trait, nothing else comes close to the power of Iterator for it’s cost (at least a single type and associated type).

While C++ may also have powerful functionality it is harder to optimize because it is all free template functions (where Rust’s are easier to optimize by just implementing the right functions) and it is less safe.


edit:

I looked into C++ algorithms a bit more, and noticed that many functions had a _n variant. This looks like a lot of code duplication, which isn’t a sign of good design. With Rust all you would need to do to limit the size of you iterator is to do iter.take(n) , and the iterator now will yield at most n elements.

Similarly for with

  • _if, that would just be iter.filter(predicate)
  • _copy would be iter.cloned() or iter.copied()
10 Likes

I programmed many years in modern C++, always trying to get the best from the last standard (changing over time).
I can say that on my last (and biggest) project I literally used everything I could from the design of the C++ iterators… And it didn’t end very well. I needed the ranges-v3 from Eric Niebler because of some limitations of the STL. It is worth noting that the C++20 will have a basic ranges library, superseding the “old” iterator-based approaches and embracing the same concepts (not the C++20 Concepts :grin:) used in Rust.

Give a look to the ranges proposal, you will find that it is easier to perform the same operations in Rust, mainly because C++ evolved over time and it (unfortunately) does not fit so nicely into the functional programming, syntactically speaking.

3 Likes

Oh wow, I’d somehow missed that. This is going to make my life so much easier.

Is there a good tutorial or guide that explores the various iterators and how to use them?

1 Like

The heart of the matter here is:

  • Everything in C++'s algorithm performs internal iteration. That is to say, they all have the for loop inside the function.
  • Rust’s Iterator API tries to provide a lot of good tools for external iteration. The for loop is intended to exist either in the caller, or in one of very few internal iteration methods. (basically, fold)

The ironic part is, C++ has external iterators (its concept of an iterator is an external iterator!). But for some reason it gives them no love!

There’s a number of advantages to external iteration:

1. They compose well.

Let’s say you want to know the total cost of all events that happen on a Tuesday:

events.iter()
    .filter(|event| event.is_on_day(Weekday::Tuesday))
    .map(|event| event.cost)
    .sum()

This runs in a single for loop without having to allocate vectors for intermediate results (or worse, having to name all the temporary variables!).

In C++, every container type needs to come with its own reverse iterator. In Rust, you can just do .rev(), oftentimes even after doing other things!

// turn [A, B, C, D, E]
// into [(4, A), (3, B), (2, C), (1, D), (0, E)]
things.rev().enumerate().rev()

2. They can be infinite/don’t need to fit in memory

C++'s std::iota “fills a range with successive increments of the starting value.”

Our equivalent of that in Rust is simply 0.. . You could implement enumerate in terms of this:

fn enumerate(iter: impl Iterator<Item=T>) -> impl Iterator<Item=(usize, T)> {
    (0..).zip(iter)
}

3. They are lazy. (think short-circuiting!)

strs.map(|s| s.parse::<i64>())
    .collect::<Result<Vec<_>, _>>()

This attempts to parse each string in an iterator, stopping immediately after the first error. Of course, C++ can do that with exceptions, but exceptions bring many problems of their own; this shows how we don’t need them in Rust.

4. After optimization, they’re just as fast as external iteration.

Usually.

Sometimes when writing your own Iterator type, you may need to override fold (basically the fundamental “internal iteration” method) in order to help the compiler see work that can be lifted out of the loop.

So what are the downsides? Well:

  • They require more compile time.
  • You can end up with some crazy-looking or difficult-to-name types.

P.S. here’s one of algorithm's golden nuggets of design that I ran into while trying to write this post:

  • find - finds the first occurrence of an element
  • find_end - finds the last occurrence of a subsequence
  • search - finds the first occurrence of a subsequence
9 Likes
// turn [A, B, C, D, E]
// into [(A, 4), (B, 3), (C, 2), (D, 1), (E, 0)]
things.rev().enumerate().rev()

Just a nitpick but wouldn’t it be:

[(4, A), (3, B), (2, C), (1, D), (0, E)]
1 Like

FWIW, here’s a partitioning PR:

1 Like