Propagation of Iterator Kind


#1

In D, the following expression

x.map!(_ => _+1)

is random access iff x is.

Does Rust propagate such information at compile-time aswell?


#2

It does, via explicit trait implementations that are conditional on the map’s base iterator.

.map()
in particular has these conditionally: FusedIterator, DoubleEndedIterator, TrustedLen, ExactSizeIterator


#3

This is the kind of D code @nordlow is referring to (here r is not an array, it’s a lazy “finite random-access range”):

void main() {
    import std.stdio, std.algorithm;
    auto arr = [10, 20, 30, 40, 50, 60];
    auto r = arr.map!(x => x + 1);
    writeln(r[2]); // Prints: 31
    writeln(r[1 .. 3]); // Prints: [21, 31]
}

In Rust you can’t do it:

fn main() {
    let arr = [10, 20, 30, 40, 50, 60];
    let r = arr.iter().map(|&x| x + 1);
    println!("{}", r[2]);
    println!("{}", r[1 .. 3]);
}

It gives:

error: cannot index a value of type `std::iter::Map<std::slice::Iter<'_, {integer}>, [closure@...\test.rs:3:28: 3:38]>`
 --> ...\test.rs:4:20
  |
4 |     println!("{}", r[2]);
  |                    ^^^^

error: cannot index a value of type `std::iter::Map<std::slice::Iter<'_, {integer}>, [closure@...\test.rs:3:28: 3:38]>`
 --> ...\test.rs:5:20
  |
5 |     println!("{}", r[1 .. 3]);
  |                    ^^^^^^^^^

error: aborting due to 2 previous errors

#4

Could you explain what is the evaluation order of the D version?

Is f applied eagerly and creates a second collection of the same type? Or is it applied every time you access an element? Or is it applied on the first access to an element, but not on the subsequent ones?


#5

It was a question about propagating “iterator kind”. The original iterator is not indexable, and the map is not indexable. I listed the traits that are propagated.


#6

Yes, this is the case. map is fully lazy in D.

If you want an front-element-caching lazy variant use x.cache.map or for a fully non-lazy variant use x.map.array.

Note that there are more interesting applications for this sort of propagation.

See for instance an upcoming improvement to D’s merge:


#7

I don’t believe this (edit: referring specifically to random access iteration) could feasibly be fitted into Rust’s currently existing iterator model; Rust unashamedly allows map closures to have side-effects, and such side-effectful closures have of course up until now been written with the expectation that they are invoked on items in order. (edit: this particular example is wrong, because map propagates DoubleEndedIterator) Also, many iterators have had their item types written with the expectation that an item only needs to be yielded once (so they produce owned values where an index adapter might not be so able).

That said, a collection of “Index adapter” types sounds like an exiting idea! One caveat, though: they wouldn’t be able to use x[i] notation since Index can only produce references.


#8

Is this one of the few design mistakes of Rust 1.x?


#9

It would be a mistake if the design disallows fixing it without breaking changes, I really hope that’s not the case. (For example, the by value IndexGet trait, would it work? Also many iterators can be and even are indexable by reference, though none in libstd implement it).

Another worry is how to have efficient random access in Rust, while still being memory safe.


#10

The signature of Index cannot change, but conceivably there could be other traits which also define the indexing syntax with different signatures. I’d probably want mutual exclusion before adding something like that.

It’s not obvious to me that the signature of Index is a mistake. In particular, the current signature is the only way I know to provide the contract between Index and IndexMut that their indexes return the same type, by ref or by mutable ref.


#11

Even ignoring the propogation issue, the iter returned by vec.iter() seems like it oughtn’t implement Index because as you iterator through it, the 0th element will change & so it won’t be a clear copy of the indexing of the underlying collection.


#12

I’m not sure, I think indexable iterators are useful. In the long run, how to do random access iterators needs to be explored fully in Rust. (I think ATC will be important in formulating traits for splitting and iterating sequences).

A different argument against indexing iterators is that the slice’s representation (raw pointer and length) lends itself to simple bounds checks that often optimize out, and the iterator’s representation (start, end raw pointers) lends itself well to iteration but needs to compute the length derived from the pointers for bounds checks.


#13

If we had generic splitting iterators, we could implement rayon’s IntoParallelIterator for them!


#14

On the more general topic of “propagation of kinds” in general, looking at the above I was suprised to find that D also propagates “infiniteness” of iterators! I’m a sucker for attention to detail like this, so I did a survey of D’s std.range.primitives to see what else they had over us; however, in the end, isRandomAccessRange, hasSlicing, and isInfinite are the only ones that stood out:

             Template -- Relevant capabilities*       Rust approximation**

         isInputRange -- front, popFront, empty      (Iterator)
        isOutputRange -- put  (more like 'push')     (FnMut)***
       isForwardRange -- save                        (Iterator+Clone)
 isBidirectionalRange -- back, popBack               (DoubleEndedIterator)
  isRandomAccessRange -- x[i]                         ---
    hasMobileElements -- moveFront, moveBack          N/A
 hasSwappableElements -- swap(e1, e2)                 N/A
hasAssignableElements -- e1 = value                   N/A
    hasLvalueElements -- &e1                          N/A
            hasLength -- length                      (ExactSizeIterator)
           isInfinite -- 'empty' is statically false  ---
           hasSlicing -- x[i..j]                      ---

  *: 'e1' and 'e2' are expressions `x.front`, `x.back`, or `x[i]`
  **: "N/A" = wouldn't make sense in rust.
      "---" = missing from rust.
  ***: I am only half-joking on that one.

This inspires some ideas, but I’ll also be playing devil’s advocate, because on reflection I don’t think any of the ideas are actually good!

Idea number 1: PeekableIterator
There is a disparity between our Iterator and D’s ForwardRange interface (which includes a non-consuming front()). This briefly brings to mind the following possibility:

trait PeekableIterator: Iterator {
    fn peek(&mut self) -> Option<&Self::Item>;
}

…which is quickly tossed in the rubbish as one realizes any of the following.

  • The differences between iter.peekable().map(f).peek() and iter.map(f).peekable().peek() would be nuanced to the point of insanity.
  • One can always just create a peekable() right where peek() is needed. The only disadvantage is that you might end up with multiple layers of Peekable. Even then, that is often desirable over the alternative of repeating function calls.
  • Would there be a PeekableDoubleEndedIterator with a peek_back()? What happens to rev if there is? (or worse, if there isn’t?)

Idea number 2: InfiniteIterator

Information on the purpose of isInfinite seems scarce, but judging from this newsgroup thread it seems the point is to help eliminate bounds checks. In theory, rust could have

trait InfiniteIterator: Iterator {
    #[inline]
    fn nooxt(&mut self) -> Self::Item { self.next().unwrap() }
}

With this default implementation of nooxt there is no possible advantage over next, and it continues to depend entirely on the compiler’s inlining decisions and dead code detection. However, if there is an adapter which frequently presents trouble for optimization, then nooxt could be given a hand-optimized implementation.

And therein lies the catch. It appears that InfiniteIterator would be a far greater maintenence burden in Rust than it is in D, as almost certainly, any implementation of nooxt would have to be a near duplicate of the implementation of next (just without the Option logic)!

And the real bit of irony; I don’t think we can properly impl InfiniteIterator for Chain with the current rules on impl overlap.

So there’s my ideas; they all suck. Anybody have some that don’t? :slight_smile:


#15

If you have a loop on an isInfinite range, you know it’s equivalent to a Rust loop{}, so the compiler can do some optimizations and can catch some bugs, like calling the Rust iterable.count() method (walkLength in D).

Is HKT able to improve this problem?


#16

I don’t see a connection to HKT here, do you have something particular in mind?

My impression is the primary difference is between D and Rust here is the trade off between internal and external iterators.


#17

I think I can kind of picture a connection to HKT; suppose the definition was instead:

// Call it "an Option with no None variant",
//   or "the Identity monad", or whatever you will.
type Always<T>(pub T);

trait InfiniteIterator: Iterator {
    #[inline]
    fn nooxt(&mut self) -> Always<Self::Item> {
        Always(self.next().unwrap())
    }
}

There’s no doubt Always could implement all of our favorite Option methods like map and and_then. Supposing that HKT existed and there were traits available to abstract over these types, I imagine one can probably piece together a way so that the code only needs to be written once.

Of course, as long as we are only dealing with a closed set of types (Option and Always), a macro is probably a far more economical solution… But now that the gears are turning, I almost have to wonder… are InfiniteIterator and Iterator perhaps just instances of an even wilder, more general concept of iteration?


#18

I think coherence is the actual issue. We would like to provide this impl:

impl<T: InfiniteIterator> Iterator for T {
    type Item = <T as InfiniteIterator>::Item;
    fn next(&mut self) -> Option<T::Item> {
        Some(self.nooxt())
    }
}

Only issue is its potential for overlap with other impls

This would also of course move where you define the iterator behavior for infinite iterators.


#19

We would like to provide this impl:

Ah! At first I agreed, but this is not really true for conditional impls. (e.g. the above would implement Iterator for infinite Zips, but not all Zips)


#20

Right, that’s where the coherence issues comes in.

If we had the ‘intersection impls’ specialization, we could provide an identical impl to this one for every ‘conditional’ impl, as in impl<T: InfiniteIterator> Iterator for Zip<T> and so on, in addition to the impl<T> Iterator for Zip<T>