Does Rust really need higher-kinded types?

I fundamentally disagree with this characterization, so it's hard for me to follow your motivation. My suspicion is that this puts us at an impasse, and that's OK. It happens. (I note that "disagree" does not mean "Rust is a panacea.")

I think I understand your proposal as much as I can at this level.

Are you claiming there is no tsuris at all compared to simply calling a function? Are you claiming that if we don't encourage encapsulation of resource lifetime management, we are not likely to infect all the libraries and user-land code with this increased tsuris?

Did you see the following edit:

Also, why not provide the opportunity for increased performance? Seems like a no-brainer decision to me.

To the first question: I can't ever recall experiencing tsuris while using iterators in Rust.

To the second question: I'm not claiming that, but that doesn't mean I think your claim is true. Where's the evidence? Is this happening today? What code is impacted? Any crates in particular?

I have no counter-argument except the woeful one that I expect we will regret it in retrospect if we don't encapsulate resource lifetime management as much as possible. And that is my experience versus your experience. We will see... :slight_smile:

Btw, I offer the evidence of all the problematic questions on this forum about lifetime management. And that is with the early-adopter super-experts. Just wait until the junior programmers come in.

And not encapsulating means lower performance in some cases. I thought Rust's claim was maximum performance and safety, not just safety.

Right. Impasse. It's cool. :slight_smile:

In the section The Basics of Explicit Lifetimes and Lifetime Parameters, if we don't define a lifetime annotation for the fn get_x, then we get a compile-time error because x references a resource that has gone out of scope and thus is no longer guaranteed to be allocated. No code is marked unsafe.
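A minimal sketch of the situation described there (the names `Foo` and the body of `get_x` are my assumptions, not the tutorial's exact code; the rejected variant can't be shown as runnable code, so the annotated, accepted version is shown with the failure mode noted in comments):

```rust
// Hypothetical struct/field names standing in for the tutorial's example.
struct Foo {
    x: i32,
}

// The explicit lifetime 'a ties the returned reference to the borrow of
// `foo`. (Lifetime elision would actually infer this signature here; it is
// spelled out to match the section being discussed.)
fn get_x<'a>(foo: &'a Foo) -> &'a i32 {
    &foo.x
}

fn main() {
    let foo = Foo { x: 42 };
    let x = get_x(&foo);
    assert_eq!(*x, 42);
    // If `foo` were dropped while `x` was still in use, the borrow checker
    // would reject the program at compile time -- no `unsafe` involved.
}
```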

My point is that either Rust catches all such possibilities (which would mean there is no unsafe code in the entire program, which is unlikely), but this can cause several forms of coding gridlock, such as chasing down and fixing the source of an error when it is nested or non-local, or the gridlock you admit to, for example getting caught in a circular reference where we need two mutable references.

Or alternatively, we can't entirely trust Rust's resource lifetime checking, because unsafe code will forever be a reality of programming; thus my point that encapsulation is best when possible, so this problem doesn't infect everything.

If by "remember to enforce" you mean "deal with", then sure, having to think about lifetimes is a fundamental negative consequence of Rust's borrow checking system. I interpreted the phrase as meaning that you could forget to enforce it and accidentally write unsound code.

Can you elaborate what specific cases of superior performance with enumerables you think apply to Rust? As I said upthread, the particular claim of his I read regarding remote databases made little sense at all, and most of the rest of the paper was fairly specific to immutable GCed languages or Scheme in particular.

Afaics, you are oversimplifying and/or maybe misunderstanding the issue, which I understand to be that the imperative algorithm may need to know in advance whether there is another iteration in the queue before actually consuming it. For example, let's say the collection is an unbuffered stream (for performance reasons) over a remote high-latency connection.
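A toy sketch of that concern in Rust (all names here are made up; `from_fn` plus a counter simulates an unbuffered remote stream where every `next()` is one expensive round trip). Merely asking "is there another element?" via `peek()` already forces a fetch:

```rust
use std::cell::Cell;

// Returns (fetches after peeking, fetches after consuming): how many
// simulated round trips have occurred at each point.
fn demo() -> (usize, usize) {
    let fetches = Cell::new(0usize);
    // Stand-in for an unbuffered, high-latency remote stream of 2 elements;
    // each call to next() is one simulated round trip.
    let mut stream = std::iter::from_fn(|| {
        let n = fetches.get();
        if n < 2 {
            fetches.set(n + 1);
            Some(n)
        } else {
            None
        }
    })
    .peekable();

    // Asking whether another element exists already pulls it over the wire.
    assert!(stream.peek().is_some());
    let after_peek = fetches.get();
    // Consuming the peeked element requires no additional fetch.
    assert_eq!(stream.next(), Some(0));
    let after_next = fetches.get();
    (after_peek, after_next)
}

fn main() {
    assert_eq!(demo(), (1, 1));
}
```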

We often err when we prematurely specialize our analysis.

I'm pretty sure Oleg wasn't referring to anything of the sort - at least in the paragraph I quoted. Also, while what you said may be a concern in some cases, it seems to me that it broadly argues for iterators providing a 'pull' API (i.e. cursors) over 'push', not the other way around.

I've heard that Oleg is a very smart guy. I'd be careful about assuming that Oleg's mental model isn't more complex than what he writes down in an informal document that isn't a published research paper. Witness the numerous examples of people misinterpreting my posts because the mental model backing my comments contains more information than I can lay down in one bang on the keyboard.

Anyway, what Oleg intended is irrelevant. We only need to analyze the subsequent elucidations that have been stated.

I am claiming that even enumerables and iterators together won't model every interface needed.

It's impossible for LLVM to optimize away all cases of bounds checks. LLVM can't decipher imperative spaghetti to determine that a bounds check is redundant. You are fantasizing about super-optimizers, which can never exist.

I believe you entirely misrepresented his point. I believe he is pointing out that a suitably sophisticated DSL could convert that functional expression into the appropriate SQL query, entirely eliminating the enumeration. It is a special case, but I think he is making a general abstract point that enumerables expose more information to the data provider's (aka collection's) algorithm than iterators do.

"Super-optimizer" is a term for programs that use brute force to find equivalent code that's faster, and those programs do exist.

Someone needs to write an RFC for HKTs. I don't have the experience to do it, but I imagine others do.

There will never be such a thing as a 100% complete super-optimizer, because the entropy of a Turing machine is unbounded.

[Moderator note: One comment removed. Please everyone try to avoid personal remarks, and let's also try to keep this thread from spiraling off into arguments about tangential topics.]

Just for the record, experience appears to confirm this – LLVM can optimize simple loops (which are the majority of hot loops out there, so the loss isn't as high as one would think), but at some point capitulates in the face of complexity.
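As an informal illustration of why simple loops fare better: indexed access carries a bounds check that LLVM must prove away, while iterator traversal has no such check to begin with. A sketch, not a benchmark:

```rust
// Each v[i] carries a bounds check; in a simple loop like this, LLVM can
// usually eliminate it, but only after proving i < v.len() on every path.
fn sum_indexed(v: &[i32]) -> i32 {
    let mut s = 0;
    for i in 0..v.len() {
        s += v[i];
    }
    s
}

// The iterator encapsulates the traversal, so there is no bounds check
// for the optimizer to eliminate in the first place.
fn sum_iterated(v: &[i32]) -> i32 {
    v.iter().sum()
}

fn main() {
    let v = [1, 2, 3, 4];
    assert_eq!(sum_indexed(&v), 10);
    assert_eq!(sum_iterated(&v), 10);
}
```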

This should not be surprising, as optimisation is equivalent to a proof search. You have the space of all possible algorithms, and the optimiser can apply code-to-code transformations that maintain the meaning of the code. An example of this kind of transformation is De Morgan's law on Boolean variables: ~(a /\ b) == (~a) \/ (~b). So given any expression, we have a library of abstract-syntax-tree transforms, of which we can apply any transform, and multiple transforms in any order. We cannot predict in advance which transforms may lead to better code, because just like in a mathematical proof, you may need to expand the terms before you can simplify. So although it may be theoretically possible for an optimiser to convert a bubble sort into a quicksort step by step, effectively proving the equivalence of the algorithms by applying a sequence of AST-to-AST transformations, each of which preserves the meaning of the algorithm, it will never happen in reality because of the combinatorial explosion.
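The kind of meaning-preserving AST transform described above can be sketched concretely. Here is a toy expression tree with a single De Morgan rewrite (the `Expr` type and `de_morgan` function are illustrative inventions, not anything from a real optimiser):

```rust
// A toy Boolean-expression AST.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(char),
    Not(Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

use Expr::*;

// One meaning-preserving rewrite: ~(a /\ b) becomes (~a) \/ (~b),
// applied recursively wherever the pattern matches.
fn de_morgan(e: Expr) -> Expr {
    match e {
        Not(inner) => match *inner {
            And(a, b) => Or(Box::new(Not(a)), Box::new(Not(b))),
            other => Not(Box::new(de_morgan(other))),
        },
        And(a, b) => And(Box::new(de_morgan(*a)), Box::new(de_morgan(*b))),
        Or(a, b) => Or(Box::new(de_morgan(*a)), Box::new(de_morgan(*b))),
        v => v,
    }
}

fn main() {
    // ~(a /\ b)
    let e = Not(Box::new(And(Box::new(Var('a')), Box::new(Var('b')))));
    // (~a) \/ (~b)
    let expected = Or(
        Box::new(Not(Box::new(Var('a')))),
        Box::new(Not(Box::new(Var('b')))),
    );
    assert_eq!(de_morgan(e), expected);
}
```

A real optimiser holds a large library of such rewrites; the combinatorial explosion arises from choosing which to apply, where, and in what order.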

This leads me to the conclusion that libraries of alternative algorithms that can be selected from are the correct approach. The libraries can be written by domain experts who know the optimisations for a given domain (say, matrix multiplication), and they should allow the experts to overrule the compiler's safety checks. Ideally there should be some obligation on the library writers to prove their code safe at the API level, and I would like to see the type system expanded to make these proofs possible, so that impure code inside the library could be proved to behave like pure code (have value semantics) to the user of the module. This means code inside the libraries would be imperative by nature and deal with detailed resource management, but outside it would behave like a simple stack-memory-only functional language (so no need for lifetime management in user code).

Returning to the topic: I'm fairly new to Rust, but I have worked with type systems for a while, and I am coming to the conclusion that HKTs are not going to be a problem, although how to deal with partial application of types may be. (Note that currying specifically means representing a multi-argument function as multiple single-argument applications, each of which returns a further function, whereas not applying all the arguments to any (possibly curried) function is partial application.) I think that if you have type inference, adding complexity to the type system to make the value-level code simpler is okay, as you should not have to think about the type system to write simple programs. I still think higher-ranked types are a problem and will break monomorphisation, but I don't see the same issues for HKTs.
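A value-level analogue of that terminology distinction, just to pin it down (the function `add_curried` is a made-up example; the thread's actual concern is the type-level version of partial application):

```rust
// Currying: a two-argument function represented as a single-argument
// function that returns another single-argument function.
fn add_curried(a: i32) -> impl Fn(i32) -> i32 {
    move |b| a + b
}

fn main() {
    // Partial application: supplying only the first argument...
    let add2 = add_curried(2);
    // ...yields a function still awaiting the rest.
    assert_eq!(add2(3), 5);
    assert_eq!(add_curried(10)(20), 30);
}
```

The type-level question is analogous: given a type constructor like `Result<T, E>`, can you supply only `E` and pass the remaining one-parameter constructor to an abstraction expecting a `* -> *` kind?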


A user of iterator has to work with Option and handle two cases: Some(x) and None. This is not as bad as EOF, but it's still something special that the user has to do. In contrast, a user of enumerator just works with x without any additional entities involved.
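In Rust terms, the contrast might be sketched like this, using `for_each` as a stand-in for internal (push/enumerator-style) iteration:

```rust
// Pull (iterator): the caller drives the loop and must handle the
// Some(x)/None protocol explicitly.
fn sum_pull(v: &[i32]) -> i32 {
    let mut it = v.iter();
    let mut sum = 0;
    loop {
        match it.next() {
            Some(x) => sum += x,
            None => break,
        }
    }
    sum
}

// Push (enumerator-style, internal iteration): the collection drives the
// loop; the callback only ever sees an element x, never an Option.
fn sum_push(v: &[i32]) -> i32 {
    let mut sum = 0;
    v.iter().for_each(|x| sum += x);
    sum
}

fn main() {
    assert_eq!(sum_pull(&[1, 2, 3]), 6);
    assert_eq!(sum_push(&[1, 2, 3]), 6);
}
```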

Overall, what Oleg wants to achieve - and the point that you seem to be missing - is a unified interface that works well in most situations. He argues that iterators are more difficult to implement, introduce more boilerplate on both the caller's and the implementation's side, are less efficient in common usage cases, and are more error-prone. Not all of these points are applicable to modern collections APIs (as they are improving) and to Rust in particular, but some still are:

  1. Rust lacks a common interface equally usable over any kind of collection, irrespective of its implementation and storage medium. Iterator is a bad interface to a collection stored in a remote database. You proceed to argue that a good enumerator interface for this case is impractical and that no statically typed language implements such an interface. While I don't have first-hand experience, I hear this is exactly what LINQ is (or rather what it could be if C# had compatible paradigms in its standard library; still, it's pretty close). Note that Oleg's article predates LINQ by 5 years (IIRC).
  2. Rust's collections are quite complicated and present a formidable barrier to entry. I had a really hard time understanding the basic usage patterns, as is evident from my SO question. Note the mess of types that I have to deal with in that post and how I fail to trace how types propagate through the system. I cannot quantify how much of the complexity stems from the lifetime/borrow semantics and how much could be gained by using enumerators and/or HKTs, but Rust's collections definitely have a big problem in my view.

Actually, I've been wanting to discuss the complexity of Rust's collections for a long time. Let us examine the number of types involved in the process of performing stream operations on a Vec:

  • Vec::iter() gives a std::slice::Iter

  • Vec::iter_mut() gives a std::slice::IterMut

  • Vec::into_iter() gives a std::vec::IntoIter (note it's in a different module than the other two)

  • Iterator<Item=&T> is implemented for Iter

  • Iterator<Item=&mut T> is implemented for IterMut

  • Iterator<Item=T> is implemented for IntoIter

  • Iter/IterMut/IntoIter::map() give std::iter::Map with Iter's Item type

  • Map::filter gives a Filter

  • etc.
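The chain of distinct adaptor types listed above can be observed directly. A small sketch using `std::any::type_name` to inspect the concrete types at each step (the substring checks are illustrative, since the exact paths are implementation details):

```rust
// Helper to inspect the concrete type of a value at runtime.
fn type_of<T>(_: &T) -> &'static str {
    std::any::type_name::<T>()
}

fn adaptor_chain() -> Vec<i32> {
    let v = vec![1, 2, 3];
    let iter = v.iter(); // std::slice::Iter<'_, i32>
    assert!(type_of(&iter).contains("Iter"));
    let mapped = iter.map(|x| x * 2); // std::iter::Map<Iter, {closure}>
    assert!(type_of(&mapped).contains("Map"));
    let filtered = mapped.filter(|x| *x > 2); // Filter<Map<Iter, _>, _>
    assert!(type_of(&filtered).contains("Filter"));
    filtered.collect()
}

fn main() {
    assert_eq!(adaptor_chain(), vec![4, 6]);
}
```

Every step in the chain nests another struct into the type, which is exactly the proliferation being discussed.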

For each enumerator function there is a struct implementing the enormous Iterator interface. While in the source code the implementations have a modest size, only implementing (overriding?) a couple of functions, the documentation does not expose this fact: when looking at the API docs, it looks like every struct has a full-blown Iterator impl.

By the way, most of the source code (but not all of it) seems to be contained in src/libcore/, which is almost 5k lines long and has more structs and traits defined than I can count. This makes navigation and orientation in that source code quite hard (it would be easier with a decent IDE, but I have yet to encounter one).

Recalling the name of the topic: can the number of types and general complexity be decreased by using HKTs?