Does Rust really need higher-kinded types?

I presume you know that subtyping requires that per-class type parameters (e.g. class Foo<A>) respect some variance rules dictated by the Liskov Substitution Principle (LSP). This doesn't apply to per-method type parameters (e.g. fn foo<A>). The class type parameters must be declared invariant, covariant, or contravariant (or the variance can be inferred from unambiguous usage), which then controls what subtyping is allowed on the class and in which input/output positions the class type parameters may appear.
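
For a concrete taste of covariance in an output position, consider how a longer-lived reference coerces to a shorter-lived one. This is a Rust lifetime sketch, not the class-parameter variance being described, but it is the closest analogue Rust itself offers:

    // `'static` is a subtype of any shorter lifetime `'a`, and `&'a T` is
    // covariant in `'a`, so a longer-lived reference coerces to a
    // shorter-lived one in an output (covariant) position.
    fn shorten<'a>(s: &'static str) -> &'a str {
        s // accepted: &'static str is a subtype of &'a str
    }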

Rust doesn't support any subtyping, but I believe this is problematic for composability: without subtyping it is impossible to populate a Collection<T> with elements of type T which are first-class unions (aka disjunctions); thus afaik it becomes impossible to add new types to the union via the Collection method that appends/inserts an element, short of making an entirely new copy of the Collection. I am very curious to hear how you can handle this in Rust and Haskell without DRY-violating boilerplate, because afaik it is impossible. And I suspect this problem may leak into everything. I am awaiting feedback on how this is already handled.

Afaics, the higher kinds in Haskell are the same as in a language with subtyping, the only difference being that with subtyping the per-class type parameters have a variance declaration and thus can't be used in both input and output positions of a method (with no such limitation on functions which are not methods, nor on per-method type parameters). Afaik, in Haskell type parameters are always invariant.

As you know, and in response to some of the other comments I've read in this thread, afaik the term "higher-kinded type" (HKT) always has the same meaning: the parametrization of a parametrized type constructor. So the parametrization of a named (non-generic) type is not a HKT; the parametrization of a generic type is.

Afaik, higher-ranked types (HRT) are the multifarious binding of a type parameter separately for each use within a function, instead of one subsumed type binding for all uses.

Edit: afaik it is possible to have subtyping without subclassing. The latter is an anti-pattern because it enables/encourages breaking the invariants of LSP; whereas, afaics, the former can be sound and, as explained above, is necessary for DRY composability.

An example of subtyping which is not subclassing is the one I mentioned above, where adding an element of type E to, for example, Array<T> has the result type Array<T \/ E>. Assuming the element type of the Array is covariant, we can see that the result type is covariant in the output position. Subclassing would be the coercion of MyArray<T> to Array<T>. The coercion of Array<T \/ E> to Array<T> is subtyping only, not subclassing.

Why are you using iterators and not Enumerable?

Although I initially thought @keean's point about using iterators with coroutines overruled my objection that they are too low-level, on further analysis it seems my first intuition may be correct.

I disagree. I have done a lot of Haskell programming, and the kind of collection interfaces you propose are less generic and less useful than the C++ STL, which is really the most well-thought-through collection library in any language. There are of course a few design decisions that could be improved in retrospect, but most of those are dealt with in "Elements of Programming".

Your analysis of Collection<T> seems wrong, as you can have a collection of a trait object; for example, Collection<&Readable> would allow any readable object to be inserted. You can even implement Readable for built-in types or types from other modules.
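
A minimal sketch of that point, using std's Display as a stand-in for a trait like Readable (the names in this example are illustrative only):

    use std::fmt::Display;

    // A Vec of boxed trait objects accepts any type implementing the
    // trait, including built-ins and types from other modules.
    fn demo() {
        let mut items: Vec<Box<dyn Display>> = Vec::new();
        items.push(Box::new(42));       // i32 implements Display
        items.push(Box::new("circle")); // &str implements Display
        for item in &items {
            println!("{}", item);
        }
    }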

Variance rules are a bad thing to require, and another reason not to like subtyping. Haskell's approach with invariant types (statically determined at compile time) is what allows Rust to monomorphise code and achieve 'C' levels of performance. Traits (type classes) avoid the need for variance in runtime polymorphism by employing the same existential quantification as Haskell does in trait objects.
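
A small sketch of those two mechanisms side by side (the trait and function names here are made up for illustration):

    trait Area {
        fn area(&self) -> f64;
    }

    // Static dispatch: monomorphised per concrete T, no variance involved.
    fn total_area_static<T: Area>(shapes: &[T]) -> f64 {
        shapes.iter().map(Area::area).sum()
    }

    // Dynamic dispatch: the trait object existentially quantifies over the
    // concrete type, again without any variance rules on a type parameter.
    fn total_area_dynamic(shapes: &[Box<dyn Area>]) -> f64 {
        shapes.iter().map(|s| s.area()).sum()
    }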

As for HKT, parameterization of a parameterized type seems an odd way to think about it, as it is simply a type constructor (or partially applied type constructor). For example, if Vec<A> is a parameterised type, Vec<i32> is an ordinary type, and Vec is a "constructor function" taking one argument. We can give this the kind "type -> type". Traditionally we write type as "*" in kind signatures, so that becomes "* -> *". If types are the only kinds supported by our system then "*" is a simple kind, and any kind signature with an arrow in it (like the one for the Vec type constructor) is higher kinded. In Rust, lifetime variables are a different kind of type-level object, so a reference type constructor has kind "lifetime -> * -> *", but the lifetime is optional, so it has an overloaded kind signature with either 1 or 2 parameters (which I find a bit ugly considering Rust restricts value-level function overloading with traits).
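
To make the kinds concrete in Rust terms (the Compose example is hypothetical, since Rust has no HKT syntax):

    // Wrapper has kind * -> *: `Wrapper` alone is not a type, `Wrapper<i32>` is.
    struct Wrapper<T> {
        value: T, // T has kind *
    }

    // What HKT would allow (NOT valid Rust today, shown only to illustrate):
    // struct Compose<F<_>, G<_>, T> {
    //     value: F<G<T>>, // F and G each have kind * -> *
    // }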

I don't understand your question. Perhaps you could show some Rust code.

Note: the reply to @keean draws on reading the other thread.

It is more of an abstract consideration, as explained in a post in the other thread.

An iterator is a point sampler. An enumeration operates on a range, is more high-level, and thus has more information to optimize with; apparently you can then invert the control, so that the resource lifetime is encapsulated in the enumerable's methods, avoiding the low-level tsuris of Rust's lifetime typing. I may be mistaken, but that is the intuition I have thus far and the direction I will initially be pursuing.
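
To make the distinction concrete, here is a hedged sketch of the enumerable (internal-iteration) style being described, with hypothetical names. The collection drives the loop itself, so the borrow of each element never escapes its methods:

    // External iteration: the caller pulls items and holds an iterator,
    // exposing its borrow of the collection to lifetime checking.
    // Internal iteration: the collection applies the closure itself.
    trait Enumerable {
        type Item;
        fn each(&self, f: &mut dyn FnMut(&Self::Item));
    }

    impl<T> Enumerable for Vec<T> {
        type Item = T;
        fn each(&self, f: &mut dyn FnMut(&T)) {
            for x in self {
                f(x); // the borrow of each element ends inside this call
            }
        }
    }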

@keean postulates that iterators are more composable, but I am thinking that no algorithms really operate on point samples, so I can't imagine an iterator is ever the most efficient abstraction. I may be wrong about that; I haven't studied it yet.

Rust will apparently need HKT if it implements my proposal:

This is still going to be problematic. Any time you have a loop that modifies a container, the program will not type check without subtyping on union types, and even then it will require finding the fixed point of the collection type. If there are multiple branches through the program, it will have to merge the types from each possible branch into the fixed-point calculation. Whether such a type system would be sound is uncertain, so it would require proof.
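
For reference, in today's Rust the analogous loop body is simply rejected rather than widened, which is the behaviour the union-typing proposal would change:

    fn grow() {
        let mut xs = vec![1i32];
        xs.push(2); // fine: same element type
        // xs.push(2.0f64); // error[E0308]: mismatched types; the element
        //                  // type is fixed at i32, there is no union to
        //                  // widen to a hypothetical Vec<i32 \/ f64>.
    }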

I am still not sure what problem you are trying to fix, as in I don't really see how this union of all the types in the collection is any better than a boxed trait. Yes, you save having to define all the types that are a member of that "big" trait, and you can check the specific trait memberships on entry to functions, but I am not sure this is a big saving in boilerplate. I don't think this is a DRY violation, because you are not repeating the list of traits, but I will admit it is a form of boilerplate.

However, the Rust approach is to explicitly write types and traits (it only has local type inference). Haskell can infer function types and traits/type-classes automatically: in Haskell, if you write a function that uses "get_area", it knows to add the HasArea requirement to the inferred type signature of the function. Rust could do this too if it wanted (inferring trait bounds is a lot simpler than full type inference) and only require annotation in the case of a trait function name ambiguity. It seems to me Rust's decision to require function type signatures with traits results in far more boilerplate than listing the traits for a runtime-polymorphic collection, so if you wanted to reduce boilerplate, that would be the first thing to change.
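
A small sketch of the contrast (trait and function names assumed for illustration):

    trait HasArea {
        fn get_area(&self) -> f64;
    }

    // Rust: the `T: HasArea` bound must be written by hand. Haskell would
    // infer the analogous `HasArea t =>` constraint from the call alone.
    fn double_area<T: HasArea>(shape: &T) -> f64 {
        2.0 * shape.get_area()
    }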

I think in fact there is a simpler solution, and that is to allow new trait bounds to be added to existing traits; that would solve the expression "problem" (I don't really think it's much of a problem) in the same way without changing the type system:

    extend Shape : HasHeight

This would have to be scope-limited (say, to the current module) but would force all types in Shape to also implement HasHeight, so all functions in this module could use the HasHeight trait functions on members of a collection of boxed shapes. This would achieve the same thing you want without introducing union types, a global hash, or anything like that.
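
Roughly what that would buy, approximated in today's Rust with a subtrait (names are hypothetical; the `extend` proposal would avoid having to declare ShapeWithHeight and re-box existing values at the new type):

    trait Shape {
        fn area(&self) -> f64;
    }

    trait HasHeight {
        fn height(&self) -> f64;
    }

    // Today: a new combined trait plus a blanket impl is needed, and
    // existing Box<dyn Shape> values must be re-boxed as the new type.
    trait ShapeWithHeight: Shape + HasHeight {}
    impl<T: Shape + HasHeight> ShapeWithHeight for T {}

    fn tallest(shapes: &[Box<dyn ShapeWithHeight>]) -> f64 {
        shapes.iter().map(|s| s.height()).fold(0.0, f64::max)
    }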

The remaining problem is that we don't know all the types that implement Shape, so we cannot check that all the members of Shape implement HasHeight. Even though we could provide a local impl of HasHeight for each type here in this module, the compiler cannot know we have got them all, because some other module could put types into Shape whose source is not available to the compiler. Maybe this could be checked at link time, depending on what header information is stored during separate compilation; it might be worth finding that out as a next step. Finally, the worst case is a runtime check, which I think would put most people off the idea.

What is missing in all of this are some good motivating examples. There need to be some real programs that you or other people are trying to write in Rust that have hit this problem and would benefit from a solution. I don't see lots of questions in the forum about how the limitations of boxed traits are causing problems for developers. I think you might want to search to see if you can find evidence that this is a practical problem. If you can find that evidence, then I would suggest looking into the link-time information Rust keeps, to see if a runtime check could be avoided. If both those conditions are met, then it might be worth writing an RFC to see how the Rust implementation community feels about this feature.

I don't know if Rust needs HKT, but HKT seems to be a cookie-licking feature that has stopped a lot of other features from being implemented, because people are waiting for the One True Abstraction that will deliver us from whatever we have now.

Examples:

I am not sure about try/catch; do you have a link? Also, the "Traits for Collections" link points to the 'contains' method for ranges, which doesn't seem to need HKT because it is already merged? What do you mean by traits for collections?

On "Traits for Collections" I had a deep link to a comment by @BurntSushi. I had said:

The trait Contains from the alternatives section is a good idea. Range should implement a Contains trait and maybe get containers to implement it too. This would be part of work to get more traits into std so programs can rely on those without worrying about where the implementation is coming from.

And he replied:

@ehiggs Ever since collections reform, there's been a sort-of moratorium on traits for collections. The primary reasoning, if I remember correctly, was to hold off until HKT. If this RFC were to move forward, I would expect contains to be an inherent method just like it is on the various containers.

Regarding try/catch: here is a link. A considerable amount of the conversation is about how it shouldn't be implemented because there may be a do notation in the future which would elegantly handle the semantics of catch.

So I don't think algorithms (which contains is) belong in collections, as you end up repeating yourself. I think algorithms are best implemented as free-standing generic functions. I might be wrong, as it's not clear what contains is supposed to do on a collection, but it sounds like this generic algorithm from Elements of Programming:

    pub fn find<I>(mut f: I, l: &I, x: I::ValueType) -> I
    where
        // NB: `Iterator` and `Readable` here are EoP-style traits
        // (with `successor` and `source`), not std::iter::Iterator.
        I: Iterator + Readable + PartialEq,
        I::ValueType: PartialEq,
    {
        // Precondition: readable_bounded_range(f, l)
        while (f != *l) && (*(f.source()) != x) {
            f = f.successor();
        }
        f
    }

This single definition is all that is necessary, as it is generic over all containers (by merit of being implemented on iterators) and all content types.
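
For the snippet above to compile, the EoP-style traits it assumes need definitions. Here is a minimal hypothetical sketch, using raw pointers as the simplest model of a forward iterator:

    pub trait Readable {
        type ValueType;
        fn source(&self) -> &Self::ValueType;
    }

    // Note: this shadows std::iter::Iterator; it is the EoP notion instead.
    pub trait Iterator {
        fn successor(self) -> Self;
    }

    impl<T> Readable for *const T {
        type ValueType = T;
        fn source(&self) -> &T {
            unsafe { &**self } // caller must keep the pointer in bounds
        }
    }

    impl<T> Iterator for *const T {
        fn successor(self) -> Self {
            unsafe { self.add(1) }
        }
    }

    fn main() {
        let v = [1, 2, 3, 9];
        let f = v.as_ptr();
        let l = unsafe { f.add(v.len()) };
        let found = find(f, &l, 9);
        assert!(found != l); // 9 is present, so we stop before the end
    }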

That algorithm might 'work', but it is completely against the whole point of any tree-like structure. The Contains trait would be useful for ranges, polygons, higher-dimensional spaces, and points in space plus distances (hyperballs?).

Agreed, that's why it is defined on 'Iterators'. Note you would need a different algorithm for each of those other things. A binary tree, for example, should provide a BifurcatingIterator, and the algorithm for finding whether a value is 'below' a node would be different. A higher-dimensional space would provide a coordinate structure, which is essentially a higher-dimensional iterator.

The whole idea of generic programming is to start with the algorithms, and then classify them by their data access patterns (iterators). Then containers can provide the correct iterators.

You can then use trait overloading (where the overload depends on the traits implemented) to get the compiler to choose the correct algorithm for the type of iterators supported by your container.

Edit: this is how generic programming solves the inheritance problem. If you implement find/contains on the collection itself then you will repeat yourself a lot, implementing very similar functions on similar containers. We know that some containers should share the definition and some should not. By defining the algorithm on iterators, we can share the algorithm between all containers that provide the right type of iterator (in the above case a single-pass forward iterator). We can then provide other implementations of find/contains that operate on other kinds of iterator, like the BifurcatingIterator I mentioned, or N-dimensional coordinate iterators.

If you do this, you will be able to call the same function on any sub-range or sub-space of the container for free, and pass the results of one find into the next: for example, find element "A" in a vector, then find element "B", and then find element "C" in between them (sketched below). The correct algorithm for the iterator type gets selected by the type of the iterator, and multiple different containers can share the same iterator.

Even better, different algorithms can abstract over different iterators, so the implementations can be shared between different containers: one algorithm shared between collection types A and B, and another algorithm shared between collection types B and C. Try doing that with inheritance (you would need multiple inheritance and each algorithm in a separate superclass).
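
Reusing `find` and the `*const T` impls from the earlier sketch, that find-then-find composition reads like this (illustrative only):

    fn demo(v: &[char]) {
        let first = v.as_ptr();
        let last = unsafe { first.add(v.len()) };
        let a = find(first, &last, 'A');
        let b = find(a, &last, 'B');
        let _c = find(a, &b, 'C'); // search only between the earlier hits
    }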

The rebuttal is here.

I am still arguing that iterators are an anti-pattern.

I realize most here probably don't agree with me (yet). But I am confident eventually I will be proven correct on that. Or I will learn I am not, but my abstract intuition usually leads me to the correct answer, especially on an abstraction issue. The devil is in the details, of course.

A Contains trait doesn't bind the collection type to the Contains methodology. The choice of which traits a collection implements is open to extension in the Expression Problem. And this is why we need HKT in general (not just for my proposal): because we need to not throw away data types, subsuming them to trait types, as I explained in the thread for my proposal.
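
A minimal sketch of such an independently adoptable Contains trait (the trait and impls here are assumptions for illustration, not the RFC's definition; the method is named contains_value to avoid clashing with the inherent contains methods):

    trait Contains<T> {
        fn contains_value(&self, x: &T) -> bool;
    }

    // Ranges and collections can adopt the trait independently; nothing
    // binds a collection type to this methodology unless it opts in.
    impl Contains<i32> for std::ops::Range<i32> {
        fn contains_value(&self, x: &i32) -> bool {
            self.start <= *x && *x < self.end
        }
    }

    impl<T: PartialEq> Contains<T> for Vec<T> {
        fn contains_value(&self, x: &T) -> bool {
            self.iter().any(|y| y == x)
        }
    }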

HKT are fundamental. We must have them.

An iterator is a point sampler. An enumeration operates on a range, is more high-level, and thus has more information to optimize with; apparently you can then invert the control, so that the resource lifetime is encapsulated in the enumerable's methods, avoiding the low-level tsuris of Rust's lifetime typing. I may be mistaken, but that is the intuition I have thus far and the direction I will initially be pursuing.

It is the old "we just need a smarter compiler" argument that functional enthusiasts have been making for years, and yet where is the mythical, magical compiler that can optimise bad code to be fast? Where are all the functional languages beating 'C' on the Benchmarks Game? The only reason Rust is competitive with C/C++ is that it abandons that argument and gives you the low-level control to write efficient code. On top of that it gives you traits and monomorphisation, which give you zero-cost abstraction. Great! So looking at actual languages and implementations, the enumerator approach is going in the wrong direction, towards the fantasy land of the "smarter compiler". Deforestation does not make Haskell faster than 'C'. Look at the empirical evidence, not the theoretical speculation.

I read your post. I'm not smart enough to talk about this problem in the abstract. I need real Rust code.

That is a category error in your conceptualization of what I wrote.

I am saying that an iterator doesn't provide enough information for the algorithms the programmer writes on top of it to be fully optimized by the programmer. It has nothing to do with the compiler's optimization.