Most coveted Rust features

I am not sure I understand why the iterator returned would ever be something other than a concrete type. You have to return some type, right? Rust already has universal quantification for lifetime parameters, but perhaps I am overlooking something.

Regarding functor map, I don't think this is a substitute for iterators, although Rust has a very limited definition of iterators that does not seem to include bidirectional, indexed, random, bifurcating and all the other coordinate structures found in Stepanov's "Elements of Programming". Personally I think iterators are a much more important concept than a map (which is actually an algorithm that can be implemented using an iterator). The idea of iterators as the abstract algebra of containers comes from analysing the access patterns used in different algorithms and then grouping them. Iterators naturally form a hierarchy, so a container only ever needs to provide one, but that one may implement several traits. Algorithms that operate on containers then use the traits.
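
To make that concrete, here is a rough sketch of such a hierarchy in Rust; the trait names loosely follow "Elements of Programming" and are illustrative, not Rust's built-in Iterator:

trait ForwardIterator: Sized {
    type Value;
    fn source(&self) -> &Self::Value;
    fn successor(self) -> Self;
}

// Multi-pass, backwards traversal refines the forward case.
trait BidirectionalIterator: ForwardIterator + Clone {
    fn predecessor(self) -> Self;
}

// Random access additionally allows jumping by an arbitrary offset.
trait RandomAccessIterator: BidirectionalIterator {
    fn advance(self, n: usize) -> Self;
}

A container hands out a single iterator type, which may implement several of these traits, and each algorithm bounds its type parameter by the weakest trait it needs.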

I have been implementing the iterators from chapter 6 of "Elements of Programming" in Rust. Although I am not completely happy with it yet, the code is on GitHub if anyone is interested, as I think I have got to the point where I can see it is going to be possible to implement all the algorithms: https://github.com/keean/elements_in_rust

As for Rust features I would like:

  • a way to generalise over move/reference passing
  • introduction of write-only references (several write-only borrows can exist at the same time).
  • separate read-only, and write-only dereference operators, in addition to full mutable deref which should only be usable on readable and writable references.
  • traits to generalise all this (Readable, Writable and Mutable traits); see the sketch after this list.
  • a dereference operator that can dereference a non-reference value (is the identity function for non-references).
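
A rough sketch of how those Readable/Writable/Mutable traits might look; the trait names come from the list above, and everything else (method names, impls) is speculative illustration rather than an existing feature or crate:

trait Readable {
    type Value;
    fn read(&self) -> &Self::Value;
}

trait Writable {
    type Value;
    fn write(&mut self, value: Self::Value);
}

// Full read-write access, like today's `&mut T`, is simply both traits.
trait Mutable: Readable + Writable {}

// Today's mutable reference already satisfies all three roles:
impl<'a, T> Readable for &'a mut T {
    type Value = T;
    fn read(&self) -> &T {
        &**self
    }
}

impl<'a, T> Writable for &'a mut T {
    type Value = T;
    fn write(&mut self, value: T) {
        **self = value;
    }
}

impl<'a, T> Mutable for &'a mut T {}

The point of the wish list is that a write-only reference would implement only Writable, so several such borrows could coexist at the same time.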

I don't think any of those would count as popular requests, but the next one may be:

  • negative trait bounds.

That, well, would not work. The iterator type needs to contain the lifetime of iter (say, slice::Iter<'a, T>). Your example has only a single fixed Iterator type.

If we were willing to use trait objects, this problem would be solvable:

trait Collection {
   type Item;
   fn iter<'a>(&'a self) -> Box<dyn Iterator<Item=Self::Item> + 'a>;
}
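
For illustration, one possible impl of that trait-object version for Vec, assuming the items are cloned out so the boxed iterator can yield owned values (yielding &'a Self::Item instead would be another reasonable choice):

impl<T: Clone> Collection for Vec<T> {
    type Item = T;
    fn iter<'a>(&'a self) -> Box<dyn Iterator<Item = T> + 'a> {
        // The boxed iterator borrows the Vec, hence the + 'a bound.
        Box::new(self.as_slice().iter().cloned())
    }
}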

Could you clarify what you mean by premature specialization?

That is, of course, not an option in Rust. Laziness leads to very unpredictable memory consumption, which we really do not want. Plus, functors require higher-kinded types already, and if we had HKT, we would have working iterators.

So that's your problem. You really seem to not understand how Rust works - go and play with it.

Please don't resort to ad hominem assumptions. Please stay on the facts of the issues (as you have been, other than in this quoted comment). The discussion of the facts will reveal our respective understandings. Thank you. I do understand the inclination to make that assumption, but please let the facts reveal everything. It will be obvious enough by the time we finish this analysis of the facts.

Did you not see the & on the return type in my example? I was intending it to return a trait object, which in my hypothetical idea is the conjunction of whichever types the caller wants for T in addition to Iterator. I've read that Box or & can signify a trait object. Perhaps Box is needed for some reason, as you've shown in your new example.

I was leaving the issue of resource lifetimes as an orthogonal issue. I was only addressing the composability of letting the caller specify which traits the caller wants the return value's implementation to have, i.e. inverting the control over the typing and thus not prematurely specifying the return type.

The caller may need the returned Iterator to implement some other traits as well.

Please understand that I am talking about a different paradigm, so some of the words I am writing may not immediately make sense if you are not familiar with the paradigm I am promulgating.

More replies to follow after a few hours. I am in a rush to head outside.

How can the caller control the return type? The return type is a pure function of the type of the collection and the relevant lifetime. In other words, why can't the caller pick u32 as the return type?

According to the documentation for associated types, the caller can specify the return type he wants with a bound such as Collection<T=Type>. The caller may not be holding the type of the struct for the implementation of the Collection trait and may only have requested the aforementioned trait object from the caller's caller.
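
For reference, a minimal illustration of the Trait<Assoc = Type> binding syntax being referred to; the trait, struct and functions below are placeholders, not real APIs:

trait Collection {
    type Iter;
    fn iter(&self) -> Self::Iter;
}

struct Upto(u32);

impl Collection for Upto {
    type Iter = std::ops::Range<u32>;
    fn iter(&self) -> std::ops::Range<u32> {
        0..self.0
    }
}

// The associated type can be pinned in a generic bound...
fn sum<C: Collection<Iter = std::ops::Range<u32>>>(c: &C) -> u32 {
    c.iter().sum()
}

// ...or in a trait-object type.
fn sum_dyn(c: &dyn Collection<Iter = std::ops::Range<u32>>) -> u32 {
    c.iter().sum()
}

fn main() {
    let c = Upto(4);
    assert_eq!(sum(&c), 6);
    assert_eq!(sum_dyn(&c), 6);
}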

You appear to be thinking about this from the perspective of one concrete example you want to apply it to, in this case your notion of how a collection should be structured (at least in current Rust memes, including any low-level requirement to model lifetimes, which may not be necessary in higher-level models that conquer the same pitfalls). I am coming at this more abstractly in terms of maximizing degrees-of-freedom and thus composability, so that we would be free to express any conceivable structure for a collections library. Yes, the caller could in theory specify a u32 return type if there are any implementations available. The reason I had the conjunction &(Iterator ∧ T) in my pseudocode/hypothetical (not currently valid in Rust) example is to require that the caller always gets at least an Iterator trait object in return. The caller is also free to request other trait(s) T that the caller requires, assuming implementations will be available in the caller's caller's scope.

The point is that I have proposed inverting the control over who decides the return type. Instead of hardcoding it as a higher-kinded type that requires the type to be (or to be a member of) the type of the implementation of the trait, I have parametrized the trait on the return type for more degrees-of-freedom. The implementing type Self might be able to implement the trait in more than one way, so we gain degrees-of-freedom by not prematurely hardcoding structure that requires the return type to be the implementing type (or one of its named associated types, e.g. Self::Iterator).

If I am not mistaken, a higher-kinded type is parametrization of a type parameter. The reason your first example is higher-kinded is that you require the type parameter Iterator's lower bound Self::Iterator<'a> to be a member of the implementing type Self. Thus you have parametrized the type parameter Self with the type parameter Iterator. If you had not required that Iterator have a lower bound incorporating Self, then your example would not be higher-kinded. I make this distinction to point out that afaics higher-kinded types are premature specialization, because they bind the abstract trait interface to assumptions about the implementation structure of the implementing type. I had proposed instead that the type of Iterator be a free (i.e. unbound to any implementation) trait that is a declared interface required for iteration.

arielb1, sincerely thank you for engaging me on these ideas. Peer review is very helpful, as well your apparent expertise with Rust's syntax and semantics. This also helps me to formulate a more complete conceptualization and explanation.

I am promulgating my idea from the perspective of what might be hypothetically possible in Rust (or any similar programming language), since this thread is about coveted future features that are requested to be added to Rust. Our discussion started with myself asking for examples of use cases that require higher-kinded types.

Functional composition of operations over collections can be performed without laziness (and pure function memoization), but it loses the runtime deforestation optimization, so it consumes more memory and is slower. There are other potential methods of performing deforestation. All of those can and should be allowed, because programmers want the flexibility to choose the best paradigm model for each scenario.

Let's review why functors require higher-kinded types. Seems you are requiring something like the following (apologies if the Rust syntax isn't correct):

trait Functor {
   type T;
   fn map<A, F: Fn(&Self::T) -> A>(&self, f: F) -> Self<T=A>; // Self<T=A> is the hypothetical higher-kinded part
}

We might instead contemplate the following, which is not higher-kinded:

trait Functor {
   type T;
   fn map<A, F: Fn(&Self::T) -> A>(&self, f: F) -> &Functor<T=A>;
}

So in other words, the trait object would maintain the runtime knowledge about what the implementing Self type is, which is not checked at compile time.

However, you are correct on this point: the type signature of the alternative is not constrained to a functor.map operation, because there is no requirement that the return value have the same Self type. For example, the implementation could employ monoid.append to return a Functor which has a different Self type, e.g. convert from a Vec<T> to a List<A>, which would violate the invariants of the category theory for a functor.

So thus you have pointed out an example of where we must have higher-kinded types.

I think we have shown that higher-kinded types are required when either (a) we must restrict a trait type to a parametrization of the Self type in order to conform to invariants (e.g. the category theory for functors), or (b) we wish to forsake some generality in order to get more optimization (e.g. not using a trait object for an iterator; similar optimization is possible with functors, but those also require higher-kinded types, and with them we might gain some higher-level functional-composition abstraction and control over resource management that avoids the low-level imperative approach).
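
As an aside, here is a hedged sketch of what such a higher-kinded map can look like when the trait is given a generic associated type standing for "Self with the element type replaced"; the trait shape and names below are illustrative, not a standard API:

trait Functor {
    type Item;
    // The piece that needs HKT: "Self with the element type replaced".
    type Mapped<B>: Functor;
    fn fmap<B, F: FnMut(Self::Item) -> B>(self, f: F) -> Self::Mapped<B>;
}

impl<A> Functor for Option<A> {
    type Item = A;
    type Mapped<B> = Option<B>;
    fn fmap<B, F: FnMut(A) -> B>(self, mut f: F) -> Option<B> {
        self.map(|a| f(a))
    }
}

impl<A> Functor for Vec<A> {
    type Item = A;
    type Mapped<B> = Vec<B>;
    fn fmap<B, F: FnMut(A) -> B>(self, f: F) -> Vec<B> {
        self.into_iter().map(f).collect()
    }
}

With that shape the functor-law concern above is handled structurally: fmap on an Option<A> can only produce an Option<B>, never a List<B>.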

We basically did something like that in pre-1.0 Rust. The problem was that every generic function had to take half-a-dozen type parameters describing the iterators that it wanted to use. That's why we introduced associated types.

BTW, you do need something higher-ranked here: if you call a generic function, that generic function needs to be able to create an iterator for every lifetime it comes up with, and this requires higher-ranked things (at least a higher-ranked trait bound).

That's what I was trying to get to with u32: if I write a generic function

fn fractalize<'a, T: Collection>(t: &'a T) {
    let iterator = t.iter();
    // ...
}

I need to pick a generic type for the local iterator. However, the caller of fractalize can use any collection that they want to, so we must have some sort of bound.

If we don't want to add an additional type parameter to every caller, we need an associated type.
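
A sketch of what that looks like, assuming for simplicity an owning iterator so the lifetime issue can be left out; fractalize's body is just a placeholder:

trait Collection {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;
    fn into_iter(self) -> Self::IntoIter;
}

fn fractalize<T: Collection>(t: T) {
    // The iterator type is fixed by the collection via the associated type;
    // the caller does not have to thread an extra type parameter through.
    for _item in t.into_iter() {
        // ...
    }
}

This is essentially the shape of std's IntoIterator; the remaining difficulty, as discussed above, is the by-reference case, where the iterator type also has to be indexed by the borrow's lifetime.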

Why not the following as I proposed?

fn fractalize<'a, A: Collection<T=FooeyIterator /*∧ (any other traits required)*/>>(x: &'a A) {
    let iterator = x.iter();
    // ...
}

Isn't the bound specified in my example above?

The bound is specified by the callee in my example above. That is the point I have been making: I wanted to invert the control over who specifies the type of iterator returned.

But fractalize itself can't know which iterator to use - if you call fractalize::<Vec>, you pass it the Vec iterator. If you call fractalize::<HashMap>, you pass it the HashMap iterator. If the HashMap itself is generic, you need to pass it through the chain.

Yep, that is what I thought, and it doesn't need to, which is why I had &(Iterator ∧ T) as the return type. Thus fractalize would only declare the type of T to be Iterator if it doesn't require any additional traits on the iterator. The Iterator is a trait that has the general method(s) of an iterator, e.g. next(). Why would fractalize need to know the specific type of the implementation of Iterator when I have written the return value as a trait object?

This is an anti-pattern:

fn fractalize<'a, T: Collection>(t: &'a T) {
    let iterator = t.iter();
    // ...
}

If you want fractalize to be generic it should be defined on the iterator, not on the collection. Something like:

fn fractalize<I : Iterator>(first : I, last : I) {
     // ...
}

This is more generic as it can operate on any collection implementing a single-pass forward iterator, and on any sub-range within that collection, with no loss in efficiency. We might provide a different version of fractalize which is more efficient for collections that provide random access, like this:

fn fractalize<I : RandomIterator>(first : I, last : I) {
     // ...
}

Algorithms can be grouped by their access pattern, and those classifications of access patterns are the different iterators (like forward, bidirectional, indexed, random, bifurcating etc). Collections provide iterators, algorithms use iterators. See Stepanov's "Elements of Programming" chapter 6 "Iterators".
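
As a self-contained sketch of that grouping (the trait names are illustrative, echoing the hierarchy sketched earlier in the thread): an algorithm is written against the weakest access pattern it needs, and a specialised version can exploit a stronger one:

trait ForwardIterator: Sized {
    fn successor(self) -> Self;
}

trait RandomAccessIterator: ForwardIterator {
    fn advance(self, n: usize) -> Self;
}

// Works for every single-pass forward iterator, taking n steps.
fn advance_forward<I: ForwardIterator>(mut it: I, n: usize) -> I {
    for _ in 0..n {
        it = it.successor();
    }
    it
}

// A more efficient version for iterators that provide random access.
fn advance_random<I: RandomAccessIterator>(it: I, n: usize) -> I {
    it.advance(n)
}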

But still we would need the trait for a Collection to be defined the way I suggested, so the caller can require Collections that implement the conjunction of traits for the Iterator we require. That is, if we are adding first-class conjunctions of types.

And I think yours is an anti-pattern as well, because iterators are not generally copyable[1], i.e. they are too low-level to capture (enforce/express) all the intended or implicit semantics.

Iterators are ostensibly bad in many ways...but that digresses into a functional programming versus imperative programming debate I think...perhaps sometimes you must have them...

[1] for fun, I answered that Stackoverflow question 2 hours ago, anticipating I would be able to use it here. Hehe.

There is no anti-pattern here, Iterators are the abstract algebra of collections. Read Stepanov's "Elements of Programming" and then tell me iterators are not beautiful :slight_smile: Iterators are to collections what semirings, rings, groups, monoids are to numbers. And the key point is it removes all the problems of associated types and how you get the iterators from the generic algorithm definitions, where they really don't belong.

The collection can just return the actual iterator, so calling collection.begin() returns an iterator pointing to the beginning of the collection. We don't care about its actual type, but this iterator will implement some traits, and we can call all the algorithms defined for the traits that iterator implements. So for example a single pass forward iterator only implements 'successor', a bidirectional implements 'predecessor'.

Actually, only the basic single-pass forward iterator (Iterator) is uncopyable; a ForwardIterator or BidirectionalIterator permits multi-pass and should be copyable (well, cloneable in Rust, as I am not keen on auto-copying). Obviously algorithms that can be implemented in terms of single-pass forward iterators should be implemented that way in the interests of making them as generic as possible, but there are also many algorithms that require ForwardIterators, BidirectionalIterators, RandomIterators etc. A particular favourite is the BifurcatingIterator, which is the iterator for binary-tree-like access (there are two successor functions, left_successor and right_successor).

I have spent many years programming Haskell, and many years thinking functional is better than imperative, but Stepanov's book changed my mind on that. A functional map is an algorithm and as such does not replace iterators, but should be implemented in terms of them. You would probably implement map something like:

fn map<I, O, J, F>(mut s : I, mut d : O, mut n : J, mut f : F)
where I : Iterator + Readable, O : Iterator + Writable, J : Integer, F : FnMut(&I::value_type) -> O::value_type {
    // Precondition: readable_weak_range(s, n)
    // Precondition: writable_weak_range(d, n)
    while !n.zero() {
        let dst = d.sink(); // this is a workaround for Rust not allowing l-values to be returned
        *dst = f(s.source());
        n = n.predecessor();
        s = s.successor();
        d = d.successor();
    }
}

So whereas a map normally only works over lists, this one works over all single pass forward iterators (the uncopyable ones).

Edit: I should point out that I am using iterators as defined by Stepanov, not the Rust builtin trait.

Even with non-copyable iterators you can always use T where &T: IntoIterator. Edit: Or even better, use T where T: Copy + IntoIterator – this way callers can usually use &T (because most types that have an .iter() method also implement IntoIterator for their refs).
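
For illustration, a small sketch of that suggestion; fractalize's name comes from the earlier example and its body here is just a placeholder:

fn fractalize<T: Copy + IntoIterator>(t: T)
where
    T::Item: std::fmt::Debug,
{
    // T is Copy, so we can traverse it more than once.
    for x in t {
        println!("{:?}", x);
    }
    for x in t {
        println!("again: {:?}", x);
    }
}

fn main() {
    let v = vec![1, 2, 3];
    fractalize(&v); // &Vec<i32> is Copy and implements IntoIterator
}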

I think your quantifiers are in the wrong place. Could you write them out more explicitly?

For example, my code

fn fractalize<T: Collection>(t: T) where for<'a> &'a T: IntoIterator<Item=&'a T::Item> {
    let iterator = (&t).into_iter();
    // ...
    frobnicate_collection(t);
}

has a type ∀T. (T: Collection) → (∀'α. &'α T: IntoIterator ∧ <&'α T as IntoIterator>::Item = &'α T::Item) → fn(T)

Here <&'α T as IntoIterator>::IntoIter is a lifetime-indexed family of types that is known to exist but is not specified. I don't see how you can do that with type parameters.

@llogiq

I was assuming that fractalize also does other things with the collection.

Thanks for the tip, I just finished skimming Chapter 6 of "Elements of Programming".

I agree that iterators are an abstraction of the fundamental taxonomy of linear traversal data structures he mentioned: single-pass forward (input stream), multipass forward (linked list), bidirectional (doubly-linked list), and random access (array).

However, I'm contemplating that they are an anti-pattern because they are so often a premature generalization which renders the high-level algorithm opaque, pulling in more complexity than is required (e.g. low-level memory lifetimes) and giving less high-level optimization clarity than if we expressed the high-level algorithm more transparently.

Iterators are opaque semantics.

For example, we can employ a single-pass forward iterator to implement some, then the caller can compose the output with another (reset) multipass forward iterator to find a specific iterated element in the output. But if we instead built these composable operations on the higher-level abstraction of functor.map, then we can glue them together with lazy evaluation such that we get deforestation, so that we don't process the elements of some that occurred after the found element. We get this high-level optimization automatically under any sort of functional composition of forward traversal without needing to think about it. Also we don't need to remember to deallocate the iterators, since the iteration occurs in the Functor.map library and not in our user-land code. And our high-level semantics are more explicit and less opaque.
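
For comparison, Rust's lazy iterator adaptors already fuse such a pipeline and stop at the found element; a small sketch, with the closures as placeholders standing in for some and the search predicate:

fn main() {
    let v = vec![1, 2, 3, 4, 5];
    let mut mapped = 0;
    let found = v
        .iter()
        .map(|x| {
            mapped += 1; // count how many elements actually get processed
            *x * 10
        })
        .find(|&y| y == 30);
    assert_eq!(found, Some(30));
    assert_eq!(mapped, 3); // elements after the match were never mapped
}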

It doesn't replace iterators where there is no reusable high-level abstraction such as functor map, but it makes the premature generalization of iterators an anti-pattern when there is a high-level abstraction that is more closely fit to the actual algorithm.

I don't understand lifetimes well, because I am not interested in that capability of Rust at this time and thus am not putting the effort into learning everything about lifetime modeling. So unless you can reformulate your point without lifetimes, I will not be willing at this time to go learn enough about lifetimes to understand the syntax you wrote.

I am also not very interested in iterators, so I am not that interested in learning what IntoIterator is.

Right or wrong, those are my current priorities and thus limitations. Apologies. My interest is in ad hoc polymorphism composability, independent of use cases employing low-level lifetime modeling and algorithms expressed with low-level, imperative iterators.

I would go further and say, with my experience of Haskell, that laziness is an anti-pattern for programming language design. The problem is that performance is too unpredictable and magic. Programmers don't understand that a recursion will build a stack of thunks for later evaluation; they just don't have any intuition about the cost of operations. Yes, it's great for when you need streams, but for most uses it's terrible, and Haskell programs get littered with strictness annotations to try and get them to perform well in time and space. It rapidly gets to the point that the program has strictness annotations everywhere, except where the author explicitly wants laziness. So laziness by default seems a mistake for Haskell.

How about eager by default with laziness annotations? Well, this is better, but it complicates the type system and leads to needing two copies of data structures, eager and lazy ones, which is not great either.

How to deal with laziness then? The best solution I have found is to use co-routines and 'yield', so that 'some' yields its results instead of returning them. However, you don't need specific language support for this; you just need to define some as an iterator itself, so that it provides a new 'successor' definition. So iterators provide a mechanism for abstracting laziness in an eager language as well; they are a pretty general abstraction.
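
A small sketch of that last idea (making some itself an iterator, so its results are produced on demand); the names are illustrative, and the adaptor is essentially what std's filter already does:

// `LazySome` wraps a source iterator and a predicate; nothing is computed
// until the consumer asks for the next element, so the eager language gets
// laziness just from the iterator protocol.
struct LazySome<I, P> {
    source: I,
    pred: P,
}

impl<I, P> Iterator for LazySome<I, P>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    type Item = I::Item;
    fn next(&mut self) -> Option<I::Item> {
        while let Some(x) = self.source.next() {
            if (self.pred)(&x) {
                return Some(x); // yield one result and suspend
            }
        }
        None
    }
}

fn main() {
    let naturals = 1i32..; // an infinite source
    let evens = LazySome { source: naturals, pred: |x: &i32| x % 2 == 0 };
    // Only as many source elements are examined as the consumer demands.
    let first_three: Vec<i32> = evens.take(3).collect();
    assert_eq!(first_three, vec![2, 4, 6]);
}

The consumer drives the producer one element at a time, which is the co-routine/yield behaviour described above, expressed with nothing but the iterator protocol.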

As for making an algorithm more general, that is the point. The definition of generic programming is to write the algorithm in the most general setting possible with no loss in efficiency. This means the most code re-use, and as the algorithms get used so much, it becomes in everyone's interest to read and understand them. Do you really find any of the algorithms in chapter 6 too complex? I personally find them much easier to read and understand than points-free functional notation, and some quick research I did suggests that most programmers, even functional enthusiasts, find the imperative code easier to read and understand.

You should read Elements of Programming from the beginning, but first watch this video: Elements of Programming - YouTube. Let's talk some more after you have watched it.

Edit: The video is about the book, so it's not really about iterators directly, but it explains why the book was written, gives a good insight into Stepanov's approach to programming, what the book is like to read, and how to approach reading the book.


+1

Wow. I learned something new. Big thank you! On superficial reading, that seems to be an excellent, composable abstraction which solves both the uncontrollable implicit side-effects of laziness and the inefficiency of an eager functor.

So a functor is not a granular enough abstraction, and is thus premature specialization.

I was thinking that iterators are lower-level abstractions which render opaque the intent of the algorithms that operate on a functor. But once we incorporate the need to control the implicit side-effects of laziness while also correcting the inefficiency of an eager functor, then, per your co-routines idea, I now see that the low-level generality is the only abstraction that solves all the algorithmic issues. So while a functor gives us some higher-level insight/transparency into the composability of the operations driving the result, and shields us from managing the lifetimes of the iteration structures in user-land code, it loses the granularity to control the laziness locally and optimize the efficiency of the eager locality. The co-routine can abstract the eagerness to laziness while retaining the locality between some and find, without infecting the type system or collections with laziness versus eagerness. Orthogonal abstraction. Wow!

So although we would no longer need higher-kinds for functors, we might need them for iterator factories on the collection trait.

I concur that category theory coding is often too obtuse as compared to imperative constructions.

I had noticed that YouTube video in my prior Google search, but I don't have time for that 1+ hour video right now. I have put it on my TODO list in case I head down the direction of iterators, on which you seem to have made a major change in my thinking just now.