High Order Function with Type Parameter

The unification explosion seems to be with higher-ranked types, not higher-kinded?

Agreed, the problem appears to arise from type inference. The Conclusion section of the paper you linked summarizes the problem as one of expansion to principal typings in the context of type inference. Principal typings are those where both the environment A and the type are inferred (1), unlike the Hindley–Milner system, where principal types are deduced from the annotated environment.

I don't have a problem with Rust's and Scala's requirement for type annotations on the arguments of public functions (with the result type optionally inferred), but I would definitely not want to have to annotate the type of every let variable and anonymous function, nor have to manually coerce (cast) every type on every application of a function (including operators). Do you (or any reader) know what level of annotation is required to avoid this explosion of expansion for unification? I haven't studied this issue and these research papers.

(1) Trevor Jim (1995), "Principal typings" MIT memorandum, page 2, 5th paragraph.


In response to your numbered comments:

4a) Just because one typing feature overlaps another, it doesn't necessarily follow that the two features are entirely redundant. If we can show cases where they are not substitutable, then we can possibly justify the need for both, as long as we don't introduce undecidability, confusion, and corner cases;

4b) I agree on Scala's mistake, but afaics this is primarily due (<-- links to my comments at Scala discussion group) to the attempt to integrate the anti-pattern of subclassing (and virtual inheritance) with functional programming. The other unsoundness in DOT was due to the late (i.e. path-dependent) quantification of abstract types, which they've now proven to be sound with some paradigm shift in representation, but I still think abstract and path-dependent types are a can-of-worms to be avoided. Your thoughts?

  1. I will explain below that tagged traits are not redundant with first-class (aka untagged) unions and intersections.

  2. The runtime object can be reified (encoded) so that it knows which type it is, and thus which trait implementation to invoke when applying a trait method to an untagged (aka first-class) union (aka disjunction) type. I will be starting a new thread on this, because Rust appears to be doing it wrong by putting the pointer to the implementation dictionary in the runtime object. I will edit this post with a link once that thread exists, so we can discuss the details there.

  3. Disagree. See #6.

  4. This is not composable, for one of the reasons that subclassing is an anti-pattern. I can't put an i32 and a Cursor in the same collection without running into the horrendous problems of subsumption to Any (the Top type), which infects everything including the Eq trait. I have been sporadically working for some years on ideas about how to completely solve the Expression Problem of extension and composability, so those links above are some of my past research on the matter.

  1. The runtime object can be reified (encoded) to know which type it is and thus it can know which trait implementation to invoke, when applying a trait method to an untagged (aka first-class) union (aka disjunction) type. I will be starting a new thread on this, because Rust appears to be doing it wrong by putting the pointer to the implementation dictionary in the runtime object. I will edit this post and link to the new thread after I create it, so we can discuss this in more detail in that new thread I will create.

If the runtime object can be encoded to know which type it is, then it is a tagged union. Think of the bits (regarding Stepanov: "We like bits"). You have some bits which are your data, let's say 32 bits of data; they could be an int or a float, but you cannot discover which by looking at the bits themselves, so you need some metadata. This metadata is called a 'tag', hence a tagged union. Rust has tagged unions; they are called enums.
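The int-or-float example above can be written directly as a Rust enum; the discriminant is the tag, and pattern matching inspects it (the names here are illustrative):

```rust
// A tagged union in Rust: the enum discriminant is the metadata ("tag")
// that tells us whether the payload bits are an int or a float.
enum IntOrFloat {
    I(i32),
    F(f32),
}

fn describe(v: &IntOrFloat) -> String {
    // The match dispatches on the tag, recovering the payload safely.
    match v {
        IntOrFloat::I(n) => format!("int: {}", n),
        IntOrFloat::F(x) => format!("float: {}", x),
    }
}

fn main() {
    assert_eq!(describe(&IntOrFloat::I(42)), "int: 42");
    assert_eq!(describe(&IntOrFloat::F(1.5)), "float: 1.5");
}
```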

Dealing with the other points which I have less of a problem with:

The advantage of passing the dictionary instead of having a tag is that you can call the methods on any type in the collection without having to know what the type is. It can be a type that some user of your library has created, and provided they implement the correct traits you can use it in your library, even after the library is compiled. This is a solution to dependency injection.

Overlap is generally a symptom of model confusion. There are several competing models for doing things with type systems, and most of the problems come from mixing the models. Intersection and union types come from a different model than universally quantified types with type constraints (type-classes / traits). I don't like mixing models, and I prefer the universal quantification model. I find ad-hoc overloading with no control (like intersection types) too permissive, and type inference will come up with generally useless types; for example \f . (f "abc", f 123) will type fine with intersection types. Also the types get big, and unification is much more complex (and a complex type system is more likely to have bugs).

I am not a fan of subtyping, and prefer parametric polymorphism, and type class constraints.

Finally, you can easily put an i32 and a Cursor in the same collection. There is the "open" way, where you can add new types to the list using traits. You can create a MyCollection trait and implement it for i32 and Cursor; then other people can implement MyCollection for their own types and put them in your list, and as long as you only access the elements using the methods of the MyCollection trait, you can deal with types you don't even know about in your list. The other way is the closed way with an enum, where you create an enum with tags for i32 and Cursor; you can then pattern match on the value to unpack it back to either an i32 or a Cursor, but adding a new type requires modifying the enum (hence this is the "closed" way).
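A minimal sketch of the "open" way, using the MyCollection trait described above (the Cursor struct is a stand-in, since the real type isn't shown; downstream users could add further impls without touching this code):

```rust
// The "open" way: a trait implemented for i32 and Cursor; other crates
// can implement MyCollection for their own types and insert them too.
trait MyCollection {
    fn describe(&self) -> String;
}

struct Cursor { line: u32 } // illustrative stand-in for the Cursor type

impl MyCollection for i32 {
    fn describe(&self) -> String { format!("i32: {}", self) }
}

impl MyCollection for Cursor {
    fn describe(&self) -> String { format!("cursor at line {}", self.line) }
}

fn main() {
    // Heterogeneous collection: access only via the trait's methods.
    let items: Vec<Box<dyn MyCollection>> =
        vec![Box::new(7i32), Box::new(Cursor { line: 3 })];
    let out: Vec<String> = items.iter().map(|i| i.describe()).collect();
    assert_eq!(out, vec!["i32: 7", "cursor at line 3"]);
}
```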

I see I conflated 'untagged' with 'first-class'. I was thinking the tag was the statically resolved typing of the nominal type. I see now that the untagged and tagged terminology refers to the runtime encoding of the type. Thanks.

My points remain that first-class unions and intersections (with tagged runtime encodings) have non-redundant use cases as compared to enums and traits. I will read and respond to your points about first-class unions and intersections.

Something to read: Intersection, Universally Quantified, and Reference Types. The take-away point is that combining all three leads to problems, but any two is okay. As we have mutability and reference types, we have to choose intersection or universal quantification. Whilst there are solutions, I find them all too complex, and each has some drawback. It is this complexity that puts me off, even though it may be possible to solve. The type system needs to be simple enough to reason with, and really the simpler the better.


Yes of course (I was going to write that for #6 but figured it was obvious), but that is not extensible and composable as a first-class union, because you can't inject new trait views of the elements of the collection with the inversion-of-control (of typing) variant of the 'D' in the SOLID principle. I will explain this in more detail in the new thread I am promising to create shortly.

Agreed, but I am not yet convinced that avoiding first-class unions and intersections is wise. I will read your reference, then respond more.

Yes, this is a good point. There are two models supported here:

  1. Traits allow the extension of the types that may be put into the collection, but limit the actions that may be taken on them to those in the trait. Runtime 'boxed' traits are essentially existential types from Haskell. The reason you cannot add new functions is that you cannot know which type you are accessing (so no casting), so you are limited to the functions of the trait, which are implemented on every type that wants to be inserted into the collection. Note: the functions available are not extensible, because all the implementors of the trait have to implement them all; extending the trait would make all the implementations fail to compile (unless you provide a default implementation, but that's missing the point).
  2. Enums allow new functions to be written for accessing the data, but do not allow the types in the collection to be easily extended. They are limited in the types that can be used, because every function that uses the enum has to exhaustively match every type tag in the case statement to access the types. Adding a new type requires changing the enum, so that the compiler can warn you about every match that does not completely cover the available options. To fail to do this would be unsound, as the program could crash when a match does not cover the type it encounters.

If you think about these two options you can see the restrictions make sense, and that there is in fact no possible solution that can allow both extension of functions and extension of data types and be sound.

You know this, but for completeness I will state...

With a subclassing model (e.g. Scala), we could not because we can't change the declarations for i32 and Cursor. With typeclass/ad-hoc polymorphism, we can because trait implementation is orthogonal to struct declarations.

You could say subclassing gives you the worst of both worlds: no extensibility of functions, because you need to change the class definitions and then all the subclasses (unless you use the visitor pattern, which is what I use a lot in C++ to write algorithms over runtime-polymorphic data structures, but as soon as you want to dispatch on two arguments it gets bad, as you need nested visitors and N * M virtual functions); and no extensibility of data types, because superclasses need to be named in the class definitions. So you cannot easily write a new function that has different behaviour depending on the type passed without touching the class definition code, and you cannot easily add a type that you did not write yourself to a collection without touching its class definition.

Correction: neither intersection nor open universally quantified typing will work with reference types in the presence of computation effects.

So it is not any two of the three, but rather that reference types don't work with the subsumption elimination of the prior two; and this restriction only applies in the presence of computation effects.

This can be more readily gleaned from the example on page 2 of reference #3 cited in the paper you linked. The reference is subsumed from nat /\ pos to its assumed supertype nat, which can accept the assignment of the value 0; but the reference can also be subsumed to its other assumed supertype pos, and when it is then read as the value 0, the type system has failed and is thus unsound.
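Rust has no subsumption of intersections, so the paper's type-level unsoundness can't be reproduced directly; but the underlying invariant violation can be mimicked with shared mutability. In this sketch the Nat/Pos invariants (n >= 0 and n > 0) are purely illustrative assumptions, and the two clones play the role of the two subsumed supertype views:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical invariants: a "Nat" view promises n >= 0, a "Pos" view
// promises n > 0. A value satisfying both plays the role of nat /\ pos.
fn main() {
    // One shared, mutable cell, subsequently viewed through two aliases.
    let nat_and_pos = Rc::new(RefCell::new(5i64)); // both invariants hold
    let as_nat = Rc::clone(&nat_and_pos); // "subsumed" to Nat: 0 is fine
    let as_pos = Rc::clone(&nat_and_pos); // "subsumed" to Pos: assumes > 0

    *as_nat.borrow_mut() = 0; // perfectly legal under the Nat invariant

    // Reading through the Pos alias now yields 0: the Pos invariant
    // (n > 0) has been silently broken via the other alias.
    let read = *as_pos.borrow();
    assert_eq!(read, 0);
}
```

The point of the sketch is that mutation through one view invalidates the invariant assumed by the other; in the paper this happens at the type level via subsumption elimination rather than via runtime aliasing.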

But the authors don't identify the generative essence of the problem: the subsumption algorithm erroneously assumes that nat /\ pos is an empty set, i.e. it assumes that all nominal types are disjoint. But clearly nat and pos are not disjoint, and thus no nominal subsumption should be allowed. The problem is that we have no way to declare the invariants of nominal types such that we know which nominal types are disjoint.

But we do have a way to declare that all nominal types are disjoint, and that is to require they are always referenced as immutable types. If the methods of our traits always declare self (aka this) to be immutable, then the entire problem disappears.

So we can combine intersection and universal quantification with reference types, if the reference types are always immutable. Which is a much more robust solution than the restrictions of interaction of reference types and subsumption elimination and/or referenced construction of open polymorphic types proposed by the two aforementioned papers.

This doesn't mean we need to require immutability everywhere.

Does that mean higher-ranked types are off the table with mutable references? I guess the only sound solution is the one I posted using traits, as I think restricting mutability would be the wrong direction for Rust (and personally I think the wrong direction in general. Turing's introduction of memory to mathematics was a great advance and we shouldn't back away from it because our analysis tools are not good enough, rather the challenge is to improve our analysis).

As you know, you've stated the Expression Problem, posed by Philip Wadler, one of the creators of Haskell.

Afaics, the problem can be solved by inverting the control over the binding of type to data and employing first-class unions. This will be the subject of a thread I will create as alluded to already upthread, so I can get some peer review on my idea which was only vaguely laid out in my prior comments on the Scala discussion group in 2015. The idea was first explored even less fully formed with some public (and private email) discussion with one of the creators of Kotlin in 2011. I will try to explain it more concretely. You seem to be aware of all the relevant research papers, so you can probably help identify any flaws in my idea.

Afaics, that doesn't create a new problem. The problem is more fundamental. We can't have inheritance of nominal type and mutability with references (instead of copies) if we can't prove that the Liskov Substitution Principle is not violated (in general, and especially under subsumption elimination) by the invariants of the inheritance (including intersection) hierarchy of nominal types. This is a major new insight for me. I suppose this is really what it means to say "subclassing is an anti-pattern". The paper you shared was apparently published in 2009, so perhaps this understanding is fairly new and not widely promulgated?

If we assume that subtypes never violate the invariants of supertypes, then the problem is exclusive to subsumption of intersections (either first-class unions or the intersections of inherited supertypes, e.g. multiple inherited traits such as trait PosNat : Pos, Nat); perhaps a variation of Rust's borrowing semantics could be applied to ensure that only the reference to the supertype (of a subsumption-eliminated reference) can be mutable. Readers should be aware that by intersections we mean conjunctions, and by unions we mean disjunctions.

It seems that mutable references are a resource lifetimes problem. Hmm.

Edit: the complete statement of the problem w.r.t. inherited intersections (e.g. the aforementioned Rust trait example) is a bit more convoluted. The problem is if any of those types that are inherited as an intersection are open to implementations, since these implementations may not respect the invariants of the implementation of the subtype of the said intersection.

Edit#2: my intuitive leaning (opinion follows) is to forsake mutable references in safe code. Copy the instance from safe to unsafe code regions, and don't bother to model the resource conflicts with typing in unsafe code. Mutable code will always be unsafe. Trying to track all this explosion in complexity and type it with borrowing seems to be a PITA. Use mutability only for performance optimization. Even otherwise-safe immutable code can still have memory and other resource allocation conflicts, so we still need some safe paradigm for that, but hopefully we can do it at a higher level on an immutable foundation.

For me, losing mutability loses too many interesting algorithms. There are many great algorithms that require mutability, like the union-find algorithm, which is essential to doing many things efficiently. The performance difference is staggering: your mobile phone can out-perform a super-computer if the phone is allowed to use union-find with path compression and weighted union while the super-computer is limited to a naive read-only version.
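A minimal sketch of the union-find structure mentioned above, with path compression (here via path halving) and union by size; its near-constant amortised cost depends precisely on mutating the parent array in place:

```rust
// Union-find (disjoint-set) with path halving and union by size.
struct UnionFind {
    parent: Vec<usize>,
    size: Vec<usize>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        UnionFind { parent: (0..n).collect(), size: vec![1; n] }
    }

    // Find the root of x, halving the path as we walk (in-place mutation).
    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            self.parent[x] = self.parent[self.parent[x]]; // path halving
            x = self.parent[x];
        }
        x
    }

    // Merge the sets containing a and b, attaching the smaller tree
    // under the larger one (weighted union).
    fn union(&mut self, a: usize, b: usize) {
        let (mut ra, mut rb) = (self.find(a), self.find(b));
        if ra == rb { return; }
        if self.size[ra] < self.size[rb] {
            std::mem::swap(&mut ra, &mut rb);
        }
        self.parent[rb] = ra;
        self.size[ra] += self.size[rb];
    }
}

fn main() {
    let mut uf = UnionFind::new(10);
    uf.union(1, 2);
    uf.union(2, 3);
    assert_eq!(uf.find(1), uf.find(3)); // 1 and 3 are now connected
    assert_ne!(uf.find(1), uf.find(4)); // 4 is still in its own set
}
```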

Note I did not intend to suggest disallowing mutability. I wanted to suggest that it is not worth trying to guarantee type safety of (some future feature of first-class intersections, or) the currently implemented open traits with mutable references by bolting on some very complex lifetimes scheme for those two, if it can even be done (and I am not referring to the resource lifetimes scheme currently in Rust). Do you realize I am asserting, from my insight from the paper you provided, that Rust's trait system is unsafe now (unless the restrictive fix from that paper has been implemented)?

Haskell does not have that problem with ad hoc polymorphism because it is immutable everywhere. Rust has ostensibly opened a can of worms already. Now I understand why ad hoc polymorphism was not added to mutable languages; I was wondering why it hadn't been done!

Edit: note mutability with Copy types is safe. I think there is still a way to encode mutable algorithms safely.

Collections are one of the most useful tools in a programming language, like Sets, Maps, Lists, Vectors. To write efficient programs these need to be mutable, and collections need to be passed by reference (otherwise inserts and deletes remain local). If you copy the collection for every insert or delete then you are back to immutable data, and inefficient implementations. Immutability is an illusion anyway, as even in functional languages, the language itself is mutating data in memory (the stack is a classic example where even though stack variables may not be mutated, the actual stack memory is as functions push and pop values).

C++ has ad-hoc polymorphism and mutability, but the ad-hoc composition is in template specialisation. This is really the same trick that Haskell uses, because you are composing read-only program fragments which themselves mutate references, which is what happens in Haskell monads like State, ST and IO. If you look at C++ "Concepts" (as in Stepanov's concepts in Elements of Programming) you can see that they make the templates type-safe. It turns out that Concepts are effectively type-classes by another name. So we have a mechanism where immutable things (program fragments) can be composed safely, but those immutable things can be programs that manipulate state. So Rust's traits are actually a way of safely getting something like higher-ranked types and mutability in the same language. Further, by using parametric types (like C++ templates) rather than universally quantified types (like Haskell), we can get sound monomorphisation. Note that parametric types do not allow polymorphic recursion (which universally quantified types do), as that would break monomorphisation.
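A small sketch of this point in Rust: the generic function below is parametric (monomorphised separately per concrete type), the trait plays the role of a Concept/type-class, and yet every implementation freely mutates its receiver. The trait and names are illustrative, not from the discussion:

```rust
// A trait constraint (cf. a C++ Concept or Haskell type class) over
// code that mutates state; each use of the generic function below is
// monomorphised to a separate, statically dispatched instance.
trait Doubler {
    fn double_in_place(&mut self);
}

impl Doubler for i32 {
    fn double_in_place(&mut self) { *self *= 2; }
}

impl Doubler for Vec<i32> {
    fn double_in_place(&mut self) {
        for x in self.iter_mut() { *x *= 2; }
    }
}

// Parametric, statically composed code that performs in-place mutation.
fn double_all<T: Doubler>(items: &mut [T]) {
    for item in items {
        item.double_in_place();
    }
}

fn main() {
    let mut ns = [1i32, 2, 3];
    double_all(&mut ns); // monomorphised at T = i32
    assert_eq!(ns, [2, 4, 6]);
}
```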

As an aside, this is the reason why Haskell uses functors and monads to encapsulate state and IO in a pure functional language. As Rust is not pure, it does not really need functors or monads in the way Haskell does. Objects can be mutable and are the correct idiom for a language like Rust. If you look at the OOHaskell paper I linked to above, you can see that an object is in fact the fixed-point of a record in a monad (hence the object constructor is "mfix").

The conclusion is that traits/type-classes and parametric types are a sweet spot for a type system that can compose imperative code in a functional-like way whilst maintaining the efficiency of low-level code like C/C++. I suspect that adding functional type-system features like higher-ranked or higher-kinded types will break monomorphisation and lead to unsoundness (needing a lot of special cases and complexity to recover soundness, as in Scala).


Note I hope we agree that this issue is very important; thus the need to retain some skepticism until we are convinced we have correctly and completely understood and articulated this issue in sufficient detail. We must be very careful, when we claim analogies between language features and typing models, that we haven't missed some crucial details. Handwaving can be counterproductive if it is perceived to be fact.

Afaics, the currently relevant unsafety problem that I have gleaned from the Intersection, Universally Quantified, and Reference Types paper you cited only applies where mutable references are combined with intersection inheritance (e.g. trait NatPos : Nat, Pos), wherein the intersected supertypes can have implementations which do not have disjoint semantics, e.g. Nat and Pos are not actually disjoint sets.

This unsafety arises because the compiler assumes all intersected nominal types (i.e. which are not super/subtypes) are disjoint, even when they are not. Thus the compiler will erroneously allow subsumption (assignment) of an instance of NatPos to either Nat and/or Pos, and thus assignment (i.e. writing) of a 0 value to the instance referenced with type Nat (natural numbers). The same instance can also be read via a reference of type Pos, which is an unsafe type error at runtime, because 0 is not a valid value under the semantics of the Pos (positive integers) type.

Afaics, this unsafety exists in Rust now, but... the unsafety apparently also exists in any programming language that allows subsuming a subtype separately to each of its intersected supertypes, while also allowing read/write access to instances of those supertypes via references that allow mutable (i.e. write) access to the referenced instance.

Is there any disagreement about this claim, and if so, can anyone explain any error in my aforementioned conceptualization?

Thus I was apparently mistaken (due to being awake too long when I should have slept) when I implied this problem applies only to ad hoc polymorphism. This unsafety is due to the fact that subtyping is unable to check that the semantics of nominal types are disjoint (in the presence of mutable access via references), yet the type system implicitly assumes that all intersected nominal types are disjoint. Thus only linearized inheritance would absolve this unsafety (because super/subtypes aren't assumed to be disjoint) in the presence of mutable references (though, tangentially, even that wouldn't absolve the anti-pattern of subclassing, for other unrelated reasons). So many popular programming languages are implicated, including C++, Java, etc.

I think the reason this problem has been ignored in the past is that most of the implicated languages also provide the anti-pattern of subclassing, where virtual methods already invite violations of the Liskov Substitution Principle; the industry apparently punted, decided it was not possible to check the invariants, and left the programmer to manually check for inconsistency of inheritance semantics. In that sense, the above linked research paper offers the red herring of an old unsafety problem that has been pervasively, de facto ignored.

Note also that subtyping subsumption with mutable access via reference (even without the intersected inheritance), is equivalent to virtual method access; thus the virtual method aspect of the subclassing anti-pattern "sneaks in through the back door" even if the programming language doesn't offer virtual method dispatch in classes.

I don't think the unsafety exists in Rust, because Rust does not have universally quantified or intersection types; Rust has parametric types. The difference is that in Haskell "forall a . a -> a" is a valid type for the identity function, whereas the Rust equivalent "<A>(a : A) -> A" is not actually a type (note you cannot pass generic definitions to functions). In Rust the 'generic' only becomes a type when you substitute a concrete type. The difference shows up in the lack of polymorphic recursion in Rust (and that is a good thing, because polymorphic recursion would break monomorphisation).
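A sketch of that distinction, using the standard identity example: in Rust the un-instantiated generic has no type of its own; only its monomorphised instances do.

```rust
// `id` is a generic definition, not a value of a polymorphic type.
// It only gains a (function) type per instantiation, and each
// instantiation is monomorphised to separate machine code.
fn id<A>(a: A) -> A { a }

fn main() {
    let n = id::<i32>(5);     // one monomorphised instance
    let s = id::<&str>("hi"); // another, separate instance
    assert_eq!(n, 5);
    assert_eq!(s, "hi");
    // By contrast, a Haskell value of type `forall a . a -> a` can be
    // passed un-instantiated to a function expecting that polymorphic
    // type (a rank-2 position); there is no way to write the type of
    // the un-instantiated `id` as an ordinary Rust function parameter.
}
```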

Can you please explain how my example is erroneous?

I think that research paper is incorrect to state that the problem only applies to universally quantified types. Afaics, the unsafety is a more fundamental problem with intersected inheritance and mutable access via references. Please explain what is my error in the above quoted example?

I am not sure I understand your example, but all types are instantiated to concrete types by the compiler (this is what monomorphisation is). If any type variable cannot be grounded to a concrete type at compile time, this is an error and compilation will fail (and a concrete type is a single type). The exception to this is trait objects, which all implement a common interface, so we can call the dictionary functions without ever knowing the actual type.

So what happens if you assign a Nat where you expect a Pos? Well, first we have to have a reference, and that reference has to be to a trait object (otherwise we must statically know whether it is a Nat or a Pos, as there is no variance). All types put into the reference must implement the trait, and you can only ever access the contents through the trait methods. You can never get the type out again, and you can never find out what type it is. This is the same as an existential type in Haskell.
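A sketch of that trait-object-as-existential point (the Describe trait and Nat/Pos structs are illustrative): once a value is boxed behind the trait, only the trait's methods are reachable, and safe code has no way to recover or even name the concrete type.

```rust
// Trait objects hide the concrete type behind a common interface,
// like an existential type in Haskell: only trait methods remain.
trait Describe {
    fn describe(&self) -> String;
}

struct Nat(u64);
struct Pos(u64);

impl Describe for Nat {
    fn describe(&self) -> String { format!("nat {}", self.0) }
}

impl Describe for Pos {
    fn describe(&self) -> String { format!("pos {}", self.0) }
}

fn main() {
    // Both types live in one collection; their identities are erased.
    let xs: Vec<Box<dyn Describe>> = vec![Box::new(Nat(0)), Box::new(Pos(1))];
    // There is no safe cast back to Nat or Pos; we can only call describe().
    let out: Vec<String> = xs.iter().map(|x| x.describe()).collect();
    assert_eq!(out, vec!["nat 0", "pos 1"]);
}
```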