Does Rust really need higher-kinded types?

"Super optimizer" is a term for programs that use brute force to find equivalent code that's faster, and those programs do exist.

Someone needs to write an RFC for HKTs. I don't have the experience to do it, but I imagine others do.

There will never be any such thing as a 100% complete super optimizer, because the entropy of a Turing machine is unbounded.

[Moderator note: One comment removed. Please everyone try to avoid personal remarks, and let's also try to keep this thread from spiraling off into arguments about tangential topics.]

Just for the record, experience appears to confirm this – LLVM can optimize simple loops (which are the majority of hot loops out there, so the loss isn't as high as one would think), but at some point capitulates in the face of complexity.

This should not be surprising, as optimisation is equivalent to a proof search. You have the space of all possible algorithms, and the optimiser can apply code-to-code transformations that preserve the meaning of the code. An example of this kind of transformation is De Morgan's theorem on Boolean variables: ~(a /\ b) == (~a) \/ (~b). So given any expression, we have a library of abstract-syntax-tree transforms, of which we can apply any transform, and multiple transforms in any order. We cannot predict in advance which transforms may lead to better code, because just like in a mathematical proof, you may need to expand the terms before you can simplify. So although it may be theoretically possible for an optimiser to convert a bubble sort into a quicksort step by step, effectively proving the equivalence of the algorithms by applying a sequence of AST-to-AST transformations, each of which preserves the meaning of the algorithm, it will never happen in reality because of the combinatorial explosion.
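To make this concrete, here is a toy sketch of such a meaning-preserving AST transform (all names illustrative, not taken from any real optimiser), applying De Morgan's law to a tiny Boolean expression type:

#[derive(Debug)]
enum Expr {
    Var(String),
    Not(Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Rewrite ~(a /\ b) into (~a) \/ (~b); leave everything else unchanged.
fn de_morgan(e: Expr) -> Expr {
    match e {
        Expr::Not(inner) => match *inner {
            Expr::And(a, b) => Expr::Or(
                Box::new(Expr::Not(a)),
                Box::new(Expr::Not(b)),
            ),
            other => Expr::Not(Box::new(other)),
        },
        other => other,
    }
}

fn main() {
    let e = Expr::Not(Box::new(Expr::And(
        Box::new(Expr::Var("a".into())),
        Box::new(Expr::Var("b".into())),
    )));
    println!("{:?}", de_morgan(e)); // Or(Not(Var("a")), Not(Var("b")))
}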

This leads me to the conclusion that libraries of selectable alternative algorithms are the correct approach. The libraries can be written by domain experts who know the optimisations for a given domain (say matrix multiplication), and they should allow the experts to overrule the compiler's safety checks. Ideally there should be some obligation on the library writers to prove their code safe at the API level, and I would like to see the type system expanded to make these proofs possible, so that impure code inside the library could be proved to behave like pure code (to have value semantics) to the user of the module. This means code inside the libraries would be imperative by nature and deal with detailed resource management, but from the outside it would behave like a simple stack-memory-only functional language (so no need for lifetime management in user code).
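As a minimal sketch of that layering (the function name and algorithm are purely illustrative): the inside is imperative, with preallocation and in-place mutation, while the API behaves like a pure function with value semantics:

pub fn row_sums(matrix: &[Vec<f64>]) -> Vec<f64> {
    // Imperative inside: preallocate and mutate in place.
    let mut sums = Vec::with_capacity(matrix.len());
    for row in matrix {
        let mut acc = 0.0;
        for &x in row {
            acc += x;
        }
        sums.push(acc);
    }
    // Pure outside: same input, same output, and the caller never
    // deals with lifetimes or aliasing.
    sums
}

fn main() {
    let m = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    assert_eq!(row_sums(&m), vec![3.0, 7.0]);
}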

Returning to the topic: I'm fairly new to Rust, but I have worked with type systems for a while, and I am coming to the conclusion that HKT are not going to be a problem, although how to deal with partial application of types may be. (Note that currying specifically means representing a multi-argument function as multiple single-argument applications, each of which returns a further function, whereas not applying all the arguments to any function, curried or not, is partial application; see the sketch below.) I think that if you have type inference, adding complexity to the type system to make the value-level code simpler is okay, as you should not have to think about the type system to write simple programs. I still think higher-ranked types are a problem and will break monomorphisation, but I don't see the same issues for HKT.
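To illustrate the currying/partial-application distinction at the value level (a contrived sketch; Rust has no built-in currying, so the curried form is encoded with a boxed closure):

fn add3(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}

// Currying: a two-argument addition as a single-argument function
// returning another single-argument function.
fn add2_curried(a: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |b| a + b)
}

fn main() {
    let inc = add2_curried(1); // apply one argument at a time
    assert_eq!(inc(41), 42);

    // Partial application: fix only some arguments of the uncurried
    // three-argument function; no chain of unary functions involved.
    let add5 = |b, c| add3(5, b, c);
    assert_eq!(add5(1, 2), 8);
}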


A user of an iterator has to work with Option and handle two cases: Some(x) and None. This is not as bad as EOF, but it's still something special the user has to do. In contrast, a user of an enumerator just works with x, without any additional entities involved.
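For example, driving an iterator by hand makes the two cases explicit:

fn main() {
    let v = vec![1, 2, 3];
    let mut it = v.iter();
    loop {
        match it.next() {
            Some(x) => println!("{}", x), // the interesting case
            None => break, // the end-of-stream case the user must handle
        }
    }
}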

Overall, what Oleg wants to achieve - and the point that you seem to be missing - is a unified interface that works well in most situations. He argues that iterators are more difficult to implement, introduce more boilerplate on both the caller's and the implementation's side, are less efficient in common usage cases, and are more error-prone. Not all of these points apply to modern collections APIs (as they are improving) or to Rust in particular, but some still do:

  1. Rust lacks a common interface equally usable over any kind of collection, irrespective of its implementation and storage medium. Iterator is a bad interface to a collection stored in a remote database. You proceed to argue that a good enumerator interface for this case is impractical and that no statically typed language implements such an interface. While I don't have first-hand experience, I hear this is exactly what LINQ is (or rather what it could be if C# had compatible paradigms in its standard library; still, it's pretty close). Note that Oleg's article predates LINQ by 5 years (IIRC).
  2. Rust's collections are quite complicated and present a formidable barrier to entry. I had a really hard time understanding the basic usage patterns, as is evident from my SO question. Note the mess of types that I have to deal with in that post and how I fail to trace how types propagate through the system. I cannot quantify how much of the complexity stems from the lifetime/borrow semantics and how much could be gained by using enumerators and/or HKTs, but Rust's collections definitely have a big problem in my view.

Actually, I've been wanting to discuss the complexity of Rust's collections for a long time. Let us examine the number of types involved in performing stream operations on a Vec (a sketch spelling these types out follows the list):

  • Vec::iter() gives a std::slice::Iter

  • Vec::iter_mut() gives a std::slice::IterMut

  • Vec::into_iter() gives a std::vec::IntoIter (note it's in a different module than the other two)

  • Iterator<Item=&T> is implemented for Iter

  • Iterator<Item=&mut T> is implemented for IterMut

  • Iterator<Item=T> is implemented for IntoIter

  • Iter/IterMut/IntoIter::map() give std::iter::Map with Iter's Item type

  • Map::filter gives a Filter

  • etc.
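To make the point concrete, here is what those normally-inferred intermediate types look like when spelled out (the _ in the annotation stands in for the unnameable closure type):

fn main() {
    let v = vec![1, 2, 3];
    let it: std::slice::Iter<i32> = v.iter();
    let mapped: std::iter::Map<std::slice::Iter<i32>, _> = it.map(|x| x * 2);
    let total: i32 = mapped.sum();
    assert_eq!(total, 12);
}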

For each enumerator function there is a struct implementing the enormous Iterator interface. While in the source code the implementations have a modest size, only implementing (overriding?) a couple of functions, the documentation does not expose this fact - when looking at the API docs, it looks like every struct has a full-blown Iterator impl.

By the way, most of the source code (though not all of it) seems to be contained in src/libcore/iter.rs, which is almost 5k lines long and defines more structs and traits than I can count, which makes navigation and orientation in that source quite hard (it would be easier with a decent IDE, but I have yet to encounter one).

Recalling the name of the topic: can the number of types and general complexity be decreased by using HKTs?

What do you think about the kind of iterators derived from Stepanov's "Elements of Programming" (elements_in_rust/lib.rs at master · keean/elements_in_rust · GitHub)?

It separates readability from iterability and defines the iterator in a few lines. Of course it's not safe, as there is no bounds checking, but I am thinking about a way to check that an iterator pair forms a bounded_range without having to check each access to the iterator. All we need to do is check that the start and end iterators are in the same collection, that start is between the beginning of the collection and end, and that end is between start and the end of the collection. For most of these algorithms that check happens once, outside the loop, so it would impose very little performance cost, which is more efficient than returning and checking an Option on each access.
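A minimal sketch of that idea (bounded_range and sum_range are my own illustrative names, not from the linked repository), using plain indices rather than Stepanov-style iterator objects: validate the range once, then access elements without per-iteration checks:

// Check once that [start, end) is a valid range of the collection.
fn bounded_range(len: usize, start: usize, end: usize) -> bool {
    start <= end && end <= len
}

fn sum_range(v: &[i32], start: usize, end: usize) -> Option<i32> {
    if !bounded_range(v.len(), start, end) {
        return None; // the single check, outside the loop
    }
    let mut total = 0;
    for i in start..end {
        // Sound because the whole range was validated above.
        total += unsafe { *v.get_unchecked(i) };
    }
    Some(total)
}

fn main() {
    let v = [1, 2, 3, 4];
    assert_eq!(sum_range(&v, 1, 3), Some(5));
    assert_eq!(sum_range(&v, 2, 9), None);
}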

And so much the better, because you don't have to look at each of those structs to find out which methods it supports: they just all support Iterator. Some might additionally be DoubleEndedIterator, but that's basically it.

You don't really have to concern yourself with all these iterator implementors -- except if you implement your own iterators, of course. Otherwise they're just opaque types that implement the relevant trait interfaces. In the future, with impl Trait return values, they won't even show up explicitly in type signatures anymore.

Well, you can't really skip learning about the implementors, as right now map() returns a Map, not an Iterator. You have to get past that part when learning Rust to understand what's important and what's not.

That's true. The book should cover this topic, especially where it concerns iterators, since they are so commonplace. Right now I don't know if it does.

In my experience, the only time you really have to use the iterator's next() method directly (and thus deal with the Option type) is when implementing iterator adapters.
Otherwise you just apply existing adapters and then use a for loop or something like fold, sum or collect to drive the iteration. That way, the Option is never visible and you only have to deal with the actual element types.
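For instance, a minimal (hypothetical) adapter that doubles every element: implementing it is exactly where next() and its Option surface, while users of the adapter never see either:

struct Double<I> {
    inner: I,
}

impl<I: Iterator<Item = i32>> Iterator for Double<I> {
    type Item = i32;
    fn next(&mut self) -> Option<i32> {
        // The only place the Option is handled explicitly.
        self.inner.next().map(|x| x * 2)
    }
}

fn main() {
    let doubled: Vec<i32> = Double { inner: vec![1, 2, 3].into_iter() }.collect();
    assert_eq!(doubled, vec![2, 4, 6]);
}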

We were discussing naked iterators, not foreach loops or other sugar. But since you mentioned it: what Rust's and other collections APIs end up doing is dressing the iterators up with enumerators, and using those is considered good style.

@kirillkh The facts about Iterator you are raising as problems are all implementation details that in practice I do not have to care about. What you call sugar is very important to how iterators are used; I never deal directly with the intermediate types created by iterator processing, whether they are the iter structs like slice::IterMut or the adapter structs like iter::Map. I just do something like this:

// In-place mutation with a for loop
for x in &mut vec {
    *x *= 3;
}

// Creating a new collection with iterator adapters
let vec2 = vec.iter().map(|x| x * 3).collect::<Vec<_>>();

It's true that the plethora of types is no longer a problem for me now that I have acquired an intuition for how to use them. But it was a problem at one point, and that has to count for something. Also, the mere fact that Rust is not able to encapsulate those implementors, which no one is interested in, and present a single Iterator instead speaks against the language.


Then let's get an implementation of impl Trait. We definitely want that, anyway, to be able to efficiently return closures.

I agree that it can be a problem for learning, but I don't think it inherently speaks against the language as much as it shows what Rust has prioritized: providing zero-cost abstractions and keeping the language as minimal as possible. Other languages provide internal iteration mechanisms which are both more expensive at runtime and push the complexity you're talking about into the definition of the language, instead of into the definition of the standard library. A good analogy is Rust's handling of concurrency: the language doesn't have any notion of concurrency, all of that is defined through standard library types and traits.

You're completely right that this has a trade-off; I'm just saying it does have advantages. I think the downside can be mitigated by impl Trait, though unfortunately the standard library wouldn't be able to use it to make these types private without breaking backward compatibility.
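A sketch of what that would buy us, assuming impl Trait in return position as proposed: the concrete Map adapter type disappears from the signature, leaving only the trait interface:

// The caller sees "some Iterator of i32", not std::iter::Map<...>.
fn tripled<'a>(v: &'a [i32]) -> impl Iterator<Item = i32> + 'a {
    v.iter().map(|x| x * 3)
}

fn main() {
    let v = vec![1, 2, 3];
    let out: Vec<i32> = tripled(&v).collect();
    assert_eq!(out, vec![3, 6, 9]);
}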

That clearly shows you have experience writing a compiler, because it is an excellent explanation, in compiler terms, of my terse abstract (generative essence) statement:


That seems to be in agreement with my trait abstraction proposal:


If you meant proving that imperative code marked unsafe is actually safe, then that is not possible:

However, if you meant trying to encapsulate Rust's "safe" lifetime tracking within the libraries, then it seems you are in agreement with my trait abstraction proposal.


I presume you mean local type inference of "expressions" (i.e. not having to explicitly specify, for example, the types of a lambda function) and not global inference, since I assume we both agree that global inference elides the type annotations necessary to see the polymorphic type class (aka trait) bounds on a function.

You seem to have come to a conclusion similar to the one I argued for upthread, namely that HKT are fundamental and we need them:


This thought occurred to me yesterday when I was designing a parser grammar and realized I needed to make a syntax decision about curried functions and partial function application. It then occurred to me that a partial application does not necessarily determine all of the function's type parameters simultaneously; thus the partially applied function has a potentially more complex polymorphic type than the uncurried function.
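A contrived Rust sketch of the complication (encoding partial application as a closure, since that is all Rust offers): the original function is polymorphic in both parameters, but the partially applied form cannot remain polymorphic in the leftover one:

fn pair<A, B>(a: A, b: B) -> (A, B) {
    (a, b)
}

fn main() {
    // Partially applying `a` fixes A = i32, but the closure must still
    // resolve B to a single concrete type from its use sites:
    let with_one = |b| pair(1i32, b);
    let p = with_one("x"); // B = &str here...
    // let q = with_one(2.0); // ...so this would not compile: a closure,
    //                        // unlike `pair` itself, is not polymorphic in B.
    println!("{:?}", p);
}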

I haven't thought through the details of this complication much yet. How about disallowing HKT on curried functions?

Note that I am not depending on monomorphisation in my researchy (unproven) Expression Problem solution proposal.

Since HRT can apparently be simulated with a polymorphic function in a trait, as I pointed out upthread, it doesn't break monomorphisation, at least when constructed that way.
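For reference, a minimal sketch of that simulation (illustrative names only): the higher-ranked "function" becomes a trait with a generic method, an ordinary monomorphisable construct:

use std::fmt::Debug;

// Stands in for a rank-2 function `for<T: Debug> fn(T)`, which cannot
// be passed around as a value directly.
trait Apply {
    fn apply<T: Debug>(&self, x: T);
}

struct Printer;

impl Apply for Printer {
    fn apply<T: Debug>(&self, x: T) {
        println!("{:?}", x);
    }
}

// The consumer uses the polymorphic method at several types; every
// call site monomorphises as usual.
fn use_twice<F: Apply>(f: F) {
    f.apply(1);     // T = i32
    f.apply("two"); // T = &str
}

fn main() {
    use_twice(Printer);
}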

Ah, I see you probably also meant this idea of yours: