Does Rust really need higher kinded types?

What about abstraction over Rc<T>, Arc<T>, Box<T>, etc.? Another of Rust's unique points is the diversity of polymorphic types used for memory handling. I know little about the workarounds, but I'd imagine HKT would be a boon for writing functions that are polymorphic over the pointer types.

2 Likes

Does Rust need another mechanism to do this when it already has traits? For example, the Deref operator '*' is a trait function that abstracts over all reference types, and it can be extended to dereference any container that might be written in the future.
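To make that concrete, here is a minimal sketch of the trait-based workaround (the `total` function is made up for illustration): any function that only reads through the pointer can accept `P: Deref<Target = i32>`, so Box, Rc, and Arc all work.

    use std::ops::Deref;
    use std::rc::Rc;
    use std::sync::Arc;

    // Sums values read through any pointer-like type.
    fn total<P: Deref<Target = i32>>(values: &[P]) -> i32 {
        values.iter().map(|p| **p).sum()
    }

    fn main() {
        assert_eq!(total(&[Box::new(1), Box::new(2)]), 3);
        assert_eq!(total(&[Rc::new(1), Rc::new(2)]), 3);
        assert_eq!(total(&[Arc::new(1), Arc::new(2)]), 3);
    }

The caveat, and the reason HKT still comes up, is that this abstracts over already-applied pointer types; it cannot construct or return "the same kind of pointer" generically.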

Ah, yeah, this was basically a brainfart; of course traits (and generic impls) help with simple cases. I was trying to bring up something like this, from Gankro's impressive(ly long) blog post: The Many Kinds of Code Reuse in Rust

3 Likes

You're right. Providing better abstractions over different ownership shapes is a use case for higher-kinded polymorphism that is specific to Rust and not that hard to 'grok' for most people. I think they're a less pressing need than StreamingIterator-style interfaces, but they are still very useful.
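For reference, the StreamingIterator-style interface mentioned here looks roughly like the sketch below; one modern way to spell it uses generic associated types (stabilized in later Rust), and the trait is illustrative rather than any particular crate's API.

    trait StreamingIterator {
        // The item type borrows from the iterator itself, which is exactly
        // what the plain Iterator trait cannot express.
        type Item<'a> where Self: 'a;
        fn next(&mut self) -> Option<Self::Item<'_>>;
    }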

2 Likes

Thanks for sharing this! I was trying to do precisely this kind of thing, abstracting over mutable/immutable references to resolve my initial question of DRYing nearly identical implementations for &T and &mut T, and gave up. Gankro's trick of separating out a LifetimeToType<'a> trait (effectively, separating parametrization by T and by 'a into orthogonal entities) was the trick I failed to figure out (why that's needed is best seen by trying to implement it oneself). Nice!

Edit: Except, disappointingly, he still had three implementations, one for each Iter type, whereas I was trying to have exactly one for &T and &mut T that was parametrized over the reference type. This requires additional abstraction over, say, as_ref for the former and as_mut for the latter. But Gankro is demonstrating how to allow other kinds of reference types with the same iterator trait, which is still very neat!
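For readers who haven't seen it, here is a rough sketch of the pattern (names adapted for illustration, not copied from the post): marker types stand in for the reference "families", and the trait applies a lifetime to a family to produce a concrete reference type, keeping parametrization by T and by 'a separate.

    use std::marker::PhantomData;

    // Marker types standing in for "the &T family" and "the &mut T family".
    pub struct Ref_<T>(PhantomData<T>);
    pub struct RefMut_<T>(PhantomData<T>);

    // Applying a lifetime to a family yields a concrete reference type.
    pub trait LifetimeToType<'a> {
        type Out: 'a;
    }

    impl<'a, T: 'a> LifetimeToType<'a> for Ref_<T> {
        type Out = &'a T;
    }
    impl<'a, T: 'a> LifetimeToType<'a> for RefMut_<T> {
        type Out = &'a mut T;
    }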

I presume you know that subtyping requires that per-class type parameters (e.g. class Foo<A>) respect some variance rules dictated by the Liskov Substitution Principle (LSP). This doesn't apply to per-method type parameters (e.g. fn foo<A>). The class type parameters must be declared invariant, covariant, or contravariant (or this can be inferred from their unambiguous usage), which then controls what subtyping is allowed on the class and in which positions (input/output) the class type parameters are allowed to appear.

Rust doesn't support any subtyping, but I believe this is problematic for composability, because without subtyping it is impossible to populate a Collection<T> with elements of type T which are first-class unions (aka disjunctions); thus afaik it becomes impossible to add new types to the union by employing the method of the Collection which appends/inserts an element, short of making an entirely new copy of the Collection. I am very curious to hear how you can handle this in Rust and Haskell without violating DRY boilerplate, because afaik it is impossible. And I suspect this problem may leak into everything. I am awaiting feedback on how this is already handled.

Afaics, the higher kinds in Haskell are the same as in a language with subtyping, with the only difference being that for subtyping the per-class type parameters have a variance declaration and thus can't be used in both input and output locations on a method (but there are no such limitations on functions which are not methods, nor on per-method type parameters). Afaik, in Haskell type parameters are always invariant.

As you know, and in response to some of the other comments I've read in this thread, afaik the term "higher-kinded type" (HKT) always has the same meaning: the parametrization of a parametrized type constructor. So the parametrization of a named (non-generic) type is not an HKT; the parametrization of a generic type is an HKT.

Afaik, higher-ranked types (HRT) bind the type parameter separately for each use within a function, instead of one subsumed type binding for all uses.
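Rust already has a limited form of this for lifetimes in its higher-ranked trait bounds. A minimal illustration (the function is hypothetical): the `for<'a>` bound binds the lifetime afresh at each use inside the body.

    // The `for<'a>` bound means `f` must work for every lifetime, so it can
    // be applied to borrows with different, unrelated lifetimes in the body.
    fn apply_to_both<F>(f: F) -> (usize, usize)
    where
        F: for<'a> Fn(&'a str) -> usize,
    {
        let s1 = String::from("hi");
        let r1 = f(&s1);
        let s2 = String::from("world");
        let r2 = f(&s2);
        (r1, r2)
    }

    fn main() {
        assert_eq!(apply_to_both(|s: &str| s.len()), (2, 5));
    }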

Edit: afaik it is possible to have subtyping without subclassing. The latter is an anti-pattern because it enables/encourages breaking the invariants of LSP; whereas, afaics, the former can be sound and, as explained above, is necessary for DRY composability.

An example of subtyping which is not subclassing is the one I mentioned above, where adding an element of type E to, for example, Array<T> will have a result type of Array<T \/ E>. Assuming the element type of the Array is covariant, we can see that the result type is covariant in the output position. Subclassing would be the coercion of MyArray<T> to Array<T>. The coercion of Array<T> to Array<T \/ E> is not subclassing, only subtyping.

Why are you using iterators and not Enumerable?

Although I initially thought @keean's point about using iterators with coroutines overruled my objection that they are too low-level, it seems on further analysis that my first intuition may be correct.

I disagree. I have done a lot of Haskell programming, and the kind of collection interfaces you propose are less generic and less useful than the C++ STL, which is really the most well-thought-through collection library in any language. There are of course a few design decisions that could be improved in retrospect, but most of those are dealt with in "Elements of Programming".

Your analysis of Collection<T> seems wrong, as you can have a collection of a trait object; for example, Collection<&Readable> would allow any readable object to be inserted. You can even implement Readable for built-in types or types from other modules.
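A minimal sketch of that approach (the `Readable` trait here is illustrative, not std::io::Read): because the element type is the fixed trait-object type, new concrete types can be inserted without changing the collection's type.

    trait Readable {
        fn read_all(&self) -> String;
    }

    struct A;
    struct B;

    impl Readable for A {
        fn read_all(&self) -> String { "a".into() }
    }
    impl Readable for B {
        fn read_all(&self) -> String { "b".into() }
    }

    fn main() {
        let mut xs: Vec<Box<dyn Readable>> = Vec::new();
        xs.push(Box::new(A));
        xs.push(Box::new(B)); // a "new" type; the Vec's type is unchanged
        for x in &xs {
            println!("{}", x.read_all());
        }
    }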

Variance rules are a bad thing to require, and another reason not to like subtyping. The invariant-types approach taken by Haskell (everything is statically determined at compile time) is what allows Rust to monomorphise code and achieve 'C' levels of performance. Traits (type classes) avoid the need for variance in runtime polymorphism by employing the same existential quantification method as Haskell: trait objects.

As for HKT, "parameterization of a parameterized type" seems an odd way to think about it, as it is simply a type constructor (or partially applied type constructor). For example, if Vec<A> is a parameterised type, then Vec<A> is an ordinary type, and Vec is a "constructor function" taking one argument. We can give this the kind "type -> type". Traditionally we write type "*" in kind signatures, so that becomes "* -> *". If types are the only kinds supported by our system then "*" is a simple kind, and any kind signature with an arrow in it (like the one for the Vec type constructor) is higher-kinded. In Rust, lifetime variables are a different kind of type-level object, so a reference type constructor has kind "lifetime -> * -> *", but the lifetime is optional, so it has an overloaded kind signature with either 1 or 2 parameters (which I find a bit ugly considering Rust restricts value-level function overloading to traits).

I don't understand your question. Perhaps you could show some Rust code.

2 Likes

Note the reply to @keean is obtained by reading the other thread.

It is more of an abstract consideration, as explained in a post in the other thread.

An iterator is a point sampler. An enumeration operates on a range, is more high-level, and thus has more information to optimize with; apparently you can then invert the control, so that the resource lifetime is encapsulated in the enumerable's methods, avoiding the low-level tsuris of Rust's lifetime typing. I may be mistaken, but that is the intuition I have thus far and the direction I will initially be pursuing.

@keean postulates that iterators are more composable, but I am thinking that no algorithms really operate on point samples, so I can't imagine an iterator is ever the most efficient abstraction. I may be wrong about that; I haven't studied it yet.
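For what it's worth, the inversion of control being described can be sketched in Rust as internal iteration, where the collection drives the loop and borrows never escape the method (the `Enumerable` trait below is hypothetical):

    pub trait Enumerable {
        type Item;
        fn enumerate_with<F: FnMut(&Self::Item)>(&self, f: F);
    }

    impl<T> Enumerable for Vec<T> {
        type Item = T;
        fn enumerate_with<F: FnMut(&T)>(&self, mut f: F) {
            for item in self {
                f(item); // the &T borrow never escapes this method
            }
        }
    }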

Rust will apparently need HKT if it implements my proposal:

This is still going to be problematic. Any time you have a loop that modifies a container, the program will not type check without subtyping on union types, and even then it will require finding the fixed point of the collection type. If there are multiple branches through the program it will have to merge the types from each possible branch into the fixed point calculation. Whether this type system would be sound is uncertain, so it would require proof.

I am still not sure what problem you are trying to fix, as in I don't really see how this union of all types in the collection is any better than a boxed trait. Yes, you save having to define all the types that are a member of that "big" trait, and you can check the specific trait memberships on entry to functions, but I am not sure this is a big saving in boilerplate. I don't think this is a DRY violation, because you are not repeating the list of traits, but I will admit it is a form of boilerplate.

However, the Rust approach is to explicitly write types and traits (it only has local type inference). Haskell can infer function types and traits/type-classes automatically. In Haskell, if you write a function that uses "get_area", it knows to add the HasArea requirement to the inferred type signature of the function. Rust could do this too if it wanted (inferring trait bounds is a lot simpler than type inference) and only require annotation in the case of a trait function name ambiguity. It seems to me that Rust's decision to require function type signatures with traits results in far more boilerplate than listing the traits for a runtime-polymorphic collection, so if you wanted to reduce boilerplate then that would be the first thing to change.
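To illustrate the difference (with a hypothetical HasArea trait): in Rust the bound must be spelled out, whereas Haskell would infer the corresponding constraint from the body.

    trait HasArea {
        fn get_area(&self) -> f64;
    }

    // The `T: HasArea` bound must be written explicitly; Haskell would infer
    // the analogous `HasArea t =>` constraint from the call to get_area.
    fn double_area<T: HasArea>(x: &T) -> f64 {
        2.0 * x.get_area()
    }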

I think in fact there is a simpler solution, and that is to allow new trait bounds to be added to traits; that would solve the expression "problem" (I don't really think it's much of a problem) in the same way without changing the type system.

extend Shape : HasHeight

This would have to be scope-limited (say, to the current module) but would force all types implementing Shape to also implement HasHeight, so all functions in this module could use the HasHeight trait functions on members of a collection of boxed shapes. This would achieve the same thing you want without introducing union types, or a global hash, or anything like that.
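For contrast with the proposed `extend` syntax, roughly the closest thing expressible today is an explicit combined trait with a blanket impl (all names hypothetical):

    trait Shape { fn area(&self) -> f64; }
    trait HasHeight { fn height(&self) -> f64; }

    // Any type implementing both traits gets TallShape for free.
    trait TallShape: Shape + HasHeight {}
    impl<T: Shape + HasHeight> TallShape for T {}

The gap the proposal would close is that an existing Box<dyn Shape> cannot be upgraded to a Box<dyn TallShape>; the combined trait only helps where you control the original trait-object type.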

The remaining problem is that we don't know all the types that implement Shape, so we cannot check that all the members of Shape implement HasHeight. Even though we could provide a local impl of HasHeight for each type here in this module, the compiler cannot know we have got them all, because some other module could put types into Shape the source of which is not available to the compiler. Maybe this could be checked at link time, depending on what header information is stored during separate compilation. It might be worth finding this out as a next step. Finally, the worst case is a runtime check, which I think would put most people off the idea.

What is missing in all of this are some good motivating examples. There need to be some real programs you or other people are trying to write with Rust that have hit this problem and would benefit from a solution. I don't see lots of questions in the forum about how the limitations of boxed traits are causing problems for developers. I think you might want to search to see if you can find evidence that this is a practical problem. If you can find that evidence, then I would suggest looking into the link-time information Rust uses to see if a runtime check could be avoided. If both those conditions are met, then it might be worth writing an RFC to see how the Rust implementation community feels about this feature.

I don't know if Rust needs HKT, but HKT seems like a cookie-licking feature that has stopped a lot of other features from being implemented, because people are waiting for the One True Abstraction that will deliver us from whatever we have now.

Examples:

1 Like

I am not sure about try/catch; do you have a link? Also, the "Traits for Collections" link points to the 'contains' method for ranges, which doesn't need HKT because it is already merged? What do you mean by traits for collections?

On "Traits for Collections" I had a deep link to a comment by @BurntSushi. I had said:

The trait Contains from the alternatives section is a good idea. Range should implement a Contains trait and maybe get containers to implement it too. This would be part of work to get more traits into std so programs can rely on those without worrying about where the implementation is coming from.

And he replied:

@ehiggs Ever since collections reform, there's been a sort-of moratorium on traits for collections. The primary reasoning, if I remember correctly, was to hold off until HKT. If this RFC were to move forward, I would expect contains to be an inherent method just like it is on the various containers.

Regarding try/catch: here is a link. A considerable amount of the conversation argues that it shouldn't be implemented because there may be a do notation in the future which would elegantly handle the semantics of catch.

So I don't think algorithms (which contains is) belong in collections, as you end up repeating yourself. I think algorithms are best implemented as free-standing generic functions. I might be wrong, as it's not clear what contains is supposed to do on a collection, but it sounds like this generic algorithm from Elements of Programming:

    // Precondition: readable_bounded_range(f, l)
    // Note: `Iterator` and `Readable` here are Elements-of-Programming-style
    // traits (with `successor` and `source`), not std::iter::Iterator.
    pub fn find<I>(mut f: I, l: &I, x: I::Value) -> I
    where
        I: Iterator + Readable + PartialEq,
        I::Value: PartialEq,
    {
        while f != *l && *f.source() != x {
            f = f.successor();
        }
        f
    }

This single definition is all that is necessary, as it is generic over all containers (by virtue of being implemented on iterators) and all content types.
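Since those traits are not in std, here is a minimal sketch of the machinery the algorithm assumes; everything below (the trait shapes, SlicePos) is hypothetical and exists only to make the definition above compile and run:

    pub trait Readable {
        type Value;
        fn source(&self) -> &Self::Value;
    }

    // EoP-style iterator: a position with a successor (this local trait
    // shadows the prelude's std::iter::Iterator within the module).
    pub trait Iterator: Sized {
        fn successor(self) -> Self;
    }

    // A position in a slice.
    #[derive(PartialEq)]
    pub struct SlicePos<'a, T> {
        slice: &'a [T],
        index: usize,
    }

    impl<'a, T> Readable for SlicePos<'a, T> {
        type Value = T;
        fn source(&self) -> &T {
            &self.slice[self.index]
        }
    }

    impl<'a, T> Iterator for SlicePos<'a, T> {
        fn successor(self) -> Self {
            SlicePos { index: self.index + 1, ..self }
        }
    }

    fn main() {
        let data = [1, 2, 3, 4];
        let first = SlicePos { slice: &data, index: 0 };
        let limit = SlicePos { slice: &data, index: data.len() };
        assert_eq!(find(first, &limit, 3).index, 2);
    }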

That algorithm might 'work', but it goes completely against the whole point of tree-like structures. The Contains trait would be useful for ranges, polygons, higher-dimensional spaces, points in space + distances (hyperballs?).

Agreed, that's why it is defined on 'Iterators'. Note you would need a different algorithm for each of those other things. A binary tree, for example, should provide a BifurcatingIterator, and the algorithm for finding whether a value is 'below' a node would be different. A higher-dimensional space would provide a coordinate structure, which is essentially a higher-dimensional iterator.
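A rough sketch of that, reusing the hypothetical Readable trait from the earlier snippet (the trait and algorithm are illustrative, not from any library): for an ordered binary tree, find descends left or right rather than stepping linearly.

    pub trait BifurcatingIterator: Readable + Sized {
        fn left(self) -> Option<Self>;
        fn right(self) -> Option<Self>;
    }

    // For an ordered binary tree, search descends rather than iterating.
    pub fn find_in_tree<I>(mut node: I, x: I::Value) -> Option<I>
    where
        I: BifurcatingIterator,
        I::Value: Ord,
    {
        loop {
            node = match x.cmp(node.source()) {
                std::cmp::Ordering::Equal => return Some(node),
                std::cmp::Ordering::Less => node.left()?,
                std::cmp::Ordering::Greater => node.right()?,
            };
        }
    }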

The whole idea of generic programming is to start with the algorithms, and then classify them by their data access patterns (iterators). Then containers can provide the correct iterators.

You can then use trait overloading (where the overload depends on the traits implemented) to get the compiler to choose the correct algorithm for the type of iterators supported by your container.

Edit: this is how generic programming solves the inheritance problem. If you implement find/contains on the collection itself then you will repeat yourself a lot, implementing very similar functions on similar containers. We know that some containers should share the definition and some should not. By defining the algorithm on iterators, we can share the algorithm between all containers that provide the right type of iterator (in the above case a single-pass forward iterator).

We can then provide other implementations of find/contains that operate on other kinds of iterator, like the BifurcatingIterator I mentioned or N-dimensional coordinate iterators. If you do this you will be able to call the same function on any sub-range or sub-space of the container for free, and pass the results of one find into the next; for example, find element "A" in a vector, then find element "B", and then find element "C" in between them.

The correct algorithm for the iterator type gets selected by the type of the iterator, and multiple different containers can share the same iterator. Even better, different algorithms can abstract over different iterators, so the implementations can be shared between different containers: one algorithm shared between collection types A and B, and another algorithm shared between collection types B and C. Try doing that with inheritance (you would need multiple inheritance and each algorithm in a separate superclass).