High Order Function with Type Parameter

I think what is wanted is a intersection type like this:

fn f(g : Fn(&File) -> () ∧ Fn(&Cursor) -> ())

The intersection of two functions is a single function that can accept either signature (it is like overloading) as opposed to a union type which would be either. The reason to prefer function intersections can be seen by considering the difference between the following:

fn f(g : Fn(File) -> File ∧ Fn(Cursor) -> Cursor)
fn f(g : Fn(File ∨ Cursor) -> Fn(File ∨ Cursor))

The first guarantees to return the same type of object that you pass, the second does not. Note: they are intersections on types, not traits / type-classes, so 'Fn' would need to be a type not a trait. Intersection types do not work well with traits (type-classes) as they are both trying to control overload resolution. As Rust has taken the type-class route to control overloading, adding intersection types would not make sense to me. Higher-kinded-types work nicely with type-classes, so that is the way to go. Union types cause all sorts of problems and are generally unsafe, and Rust has disjoint-union types already (enums) which provide a save sort of union type.

HKT will provide monomorphisation, which you wanted, as does the solution I posted above where you pass the same function twice. Intersection of function types provides an alternative to HKT to do this, but I prefer HKT with traits/type-classes. Union types require (unsafe) runtime polymorphism which is the same as the runtime polymorphism provided by enums.

Afaics yes Vec<&File> is a subtype of Vec<&File ∨ &Cursor>, so thus you then need to enforce a choice of invariant, covariant, and contravariant conditions for type parameters.

f is enforcing a lower bound of Read on A. And the input is a trait object, so both File and Cursor implement the Read trait.

No that shouldn't compile because [u8] implements Read thus is a subtype of Read, and since g is a covariant input, then A is contravariant when employed in g thus only allowing you to input to g those functions with an A that is a supertype of (or a) Read. You violated the Liskov Substitution Principle. Does Rust allow that?

Note my original version was correct and I should not have changed it from a disjunction to a conjunction. I had forgotten that you were violating LSP.

I can't find any documentation of that syntax. Only this:

That is, Rust has no notion of type abstraction: there are no higher-ranked (or "forall") types
abstracted over other types, though higher-ranked types do exist for lifetimes.

Will it work as expected if not being employed for lifetimes?

Can you please elaborate or provide a resource that can explain this in more detail? Is this only applicable when global type inference is required? Are you referring to this?

Could you please show an example? I think you mean only in the case of an of an unbounded generic type?

Well you can use inheritance in a trait to define an tagged intersection, but not an untagged (first-class) intersection. So why does adding untagged traits create an incongruency? Afaics, there is no diamond multiple inheritance problem because Rust doesn't put implementation in traits.

What problems? They would break Haskell's global inference. Rust's enum is a tagged union, not a first-class untagged union.

Not unsafe. The compiler has enforced the invariants of the what runtime types can be. The runtime dispatch enables adhering to those invariants without the boilerplate of having to specialize each parametrized type for each type of (first-class/untagged or tagged Enum) union of types of that can populate the type parameter. Analogously first-class functions enable us to use functions as values and types without having to write boilerplate such as Java's method pattern for a callback before Java8.

Neither of those with enable you to populate a collection with an extensible union of element types. You seem to be comparing apples to tractors, i.e. even they can both be used to supply food, they don't solve the same need for food in all the same scenarios.

Dealing with those points one at a time:

  1. The example of HKT does not work in Rust today, as Rust does not have HKT, it is an example of what the syntax could look like, based on the current use of For for universally quantifying lifetimes. The solution where there is a function argument for each use with a different type in the generic function body works with rust as it is now.
  2. I think (1) answered that as well.
  3. HKT and Intersection types are well studied, see: http://www.macs.hw.ac.uk/cs/techreps/docs/files/HW-MACS-TR-0029.pdf but maybe I am focusing too much on type inference, you may like your code littered with type annotations?
  4. I mean that you provide two mechanisms to do the same things, for example a function to accept all readable types is:
fn read<A>(A : a) where A : Read

if File and Cursor implement Read, you could also write

fn read(File \/ Cursor)

And using intersection types you could also write:

fn read (File) -> () /\ (Cursor) -> ()

In the intersection case we care clearly overloading "read" with two implementations which we might write with Traits:

trait MyRead : Read {
    read(&Self); 
}

impl MyRead File {
    read(&Self) { ... }
}

impl MyRead Cursor {
    read(&Self) { ... }
}

So I would rather have a type system with one sound way to do things, rather than multiple alternative ways to do things. I would not want to end up like Scala with an unsound type system full of bugs. Scaling DOT to Scala - Soundness | The Scala Programming Language I find the last sentence surprising, type theory isn't really what I would call surprising, but yes, if you just grab a bunch of shiny stuff and try and bolt it together it won't be sound, I don't find that surprising.
5. I refer you to the answer (4), multiple ways to do things is bad, so unless you are suggesting getting rid of traits, Rust doesn't need another way to do the same thing. see Scala's problems.
6. As the union is untagged once you have put a value of either type into the variable you cannot ever determine which it was. You are limited to only running functions that can run successfully on either value, without knowing which value is actually there. Think about this for a while, and tell me you have a use for it? As soon as you try and work out what type the value in the union might be you are unsound (because remember its untagged). What functions can you run on a "String / Int"? None right? So maybe I overstated my case, instead I should say: to do anything useful requires writing unsound programs.
7. Sort of, if the compiler can statically prove that the union type has a String in it, then it can simplify it to simply a String, however any generic functions using the union type cannot do anything with it still, apart from pass it to the identity function :slight_smile:
8. To populate a collection with an "extensible union" of element types you want to use a trait object. This is the perfect solution for the your scenario. If you have a "FarmItem" trait and make Apples and Tractors implementations of it, you can then put them both in an array of FarmItem trait objects.

As a bonus, here's how to do what you wanted originally using traits:

    use std::io::{Read, Cursor};
    use std::fs::{File};
    trait ReadFn {
        fn read<T : Read>(&mut self, T) -> ();
    }
    fn read_test<G>(mut g : G, pred : bool) where G : ReadFn {
        if pred {
            g.read(File::open("foo").unwrap());
        } else {
            g.read(Cursor::new(b"foo"));
        }
    }
    fn main() {
        struct MyFn<'a> {
            buf : &'a mut String
        }
        impl<'a> ReadFn for MyFn<'a> {
            fn read<T : Read>(&mut self, mut a : T) -> () {
                println!("{}", a.read_to_string(self.buf).ok().unwrap());
            }
        }
        let mut s = String::from("");
        read_test(MyFn{buf : &mut s}, false); // passing mutable 'closure' variable
        println!("{}", s);
    }

We use the unit struct 'MyFn' as an argument to the generic function to select the trait implementation to use. Instead of a closure you implement ReadFn for as many different unit-structs as you like, so the unit-struct is effectively a type-level name for the function you want to pass. Now you just pass 'MyFn' to the generic function and it can apply it to all the different readable types you like in that generic function.

Edit: I actually find this solution quite elegant, and the use of traits to pass polymorphic functions makes me wonder it adding higher-kinded-types is complicating the type system unnecessarily. I would be interested to see any motivating examples for HKT, and see how they look using a trait encoding (but let's do that in a different topic).

I think you mean higher-ranked, not higher-kinded (refer to the quoted linked Scala example):

I think you are right, I moved from typing higher-ranked-types to higher-kinded-types without noticing. I don't think this is limited to rank-2 types, and can handle rank-n polymorphism. I also think you can use the same technique where you might think you need higher-kinded-types, have a look at this implementation of a lazy streaming filter. Does rust really need higher kinded types? if adapter functions are a motivation for HKT you can see we can use the same techniques to do that too.

As far as I can see this satisfies all your original requirements, you can pass closure variables, and it applies polymorphically to the arguments inside the function 'f'. I also find the Rust and Haskell implementations of this much more readable than the Scala.

The unification explosion seems to be with higher-ranked types, not higher-kinded?

Agreed the problem appears to arise due to type inference. The Conclusion section of the paper you linked, summarizes the problem as one of expansion to principled typings in the context of type inference. Principled typings are where the environment A and type are inferred(1), not the H.Milner system where principled types are deduced from the annotated environment.

I don't have a problem with Rust's and Scala's requirement for type annotations on the arguments of public functions (with the result type optionally inferred), but I would definitely not want to have to annotate the type of every let variable and anonymous function, nor have to manually coerce (cast) every type on every application of a function (including operators). Do you (or any reader) know what level of annotation is required to avoid this explosion of expansion for unification? I haven't studied this issue and these research papers.

(1) Trevor Jim (1995), "Principal typings" MIT memorandum, page 2, 5th paragraph.

@keean

In response to your numbered comments:

4a) Just because we show how a typing feature overlaps another typing feature, it doesn't necessarily follow that the two features are entirely redudant. If we can show cases where they are not substitutable features, then we can possibly justify the need for both, as long as we don't introduce undecidability, confusion, and corner cases;

4b) I agree on Scala's mistake, but afaics this is primarily due (<-- links to my comments at Scala discussion group) to attempt to integrate the anti-pattern subclassing (and virtual inheritance) with functional programming. The other unsoundness in DOT was due to the late (i.e. path dependent) quantification of abstract types, which they've now proven to be sound with some paradigm shift in representation, but I still think abstract and path dependent types are a can-of-worms to be avoided. Your thoughts?

  1. I will explain below that tagged traits are not redundant with first-class (aka untagged) unions and intersections.

  2. The runtime object can be reified (encoded) to know which type it is and thus it can know which trait implementation to invoke, when applying a trait method to an untagged (aka first-class) union (aka disjunction) type. I will be starting a new thread on this, because Rust appears to be doing it wrong by putting the pointer to the implementation dictionary in the runtime object. I will edit this post and link to the new thread after I create it, so we can discuss this in more detail in that new thread I will create.

  3. Disagree. See #6.

  4. This is not composable for one of the reasons that subclassing is an anti-pattern. I can't put an i32 and a Cursor in the same collection without running into the horrendous problems of subsumption to Any (the Top type) which infects everything include the Eq trait. I have been sporadically working for a some years on the ideas about how to completely solve the Expression Problem of extension and composability, so those links above are some of my past research on the matter.

  1. The runtime object can be reified (encoded) to know which type it is and thus it can know which trait implementation to invoke, when applying a trait method to an untagged (aka first-class) union (aka disjunction) type. I will be starting a new thread on this, because Rust appears to be doing it wrong by putting the pointer to the implementation dictionary in the runtime object. I will edit this post and link to the new thread after I create it, so we can discuss this in more detail in that new thread I will create.

If the runtime object can be encoded to know which type it is, then it is a tagged union. Think of the bits (regarding Stepanov "We like bits") You have some bits which are your data, lets say 32 bits of data, they could be an int or a float, you cannot discover which it is by looking at the bits themselves, so you need some meta-data. This metadata is called a 'tag' hence a tagged-union. Rust has tagged unions, they are called enums.

Dealing with the other points which I have less of a problem with:

The advantage of passing the dictionary instead of having a tag is that you can call the methods on any type in the collection without having to know what the type is. It can be a type that some user of you library has created, and providing they implement the correct traits you can use it in your library, even after the library is compiled. This is a solution to dependency-injection.

Overlap is general a symptom of model confusion. There are several competing models for doing things with type-systems, most of the problems come from mixing the models. Intersection and union types come from a different model to universally quantified types with type constraints (type-classes / traits). I don't like mixing models, and I prefer the universal quantification model. I find ad-hoc overloading with no control (like intersection types) to be too permissive and type inference will come up with generally useless types for example \f . (f "abc", f 123) will type fine with intersection types. Also the types get big, and unification is much more complex (and a complex type system is more likely to have bugs).

I am not a fan of subtyping, and prefer parametric polymorphism, and type class constraints.

Finally you can easily put an i32 and a Cursor in the same collection. There is the "open" way where you can add new types to the list using traits. You can create a MyCollection trait, and then implement it for i32 and Cursor, and then other people can implement MyCollection for their own types and put them in your list, and as long as you only access them using the methods of the MyCollection trait you can deal with types you don't even know about in your list. The other way is the closed way with an enum, where you create an enum with tags for i32 and Cursor, and you can then pattern match on the value to unpack it back to either and i32 or a Cursor, but to add a new type requires modifying the enum (hence this is the "closed" way).

I see I conflated 'untagged' with 'first-class'. I was thinking the tag is the static resolved typing of the nominal type. I see now that untagged and tagged terminology refers to the runtime encoding of the type. Thanks.

My points remain that first-class unions and intersections (with tagged runtime encodings) have non-redundant use cases as compared to enums and traits. I will read and respond to your points about first-class unions and intersections.

Something to read: Intersection, Universally Quantified, and Reference Types The take-away point is that combining all three leads to problems, but any two is okay. As we have mutability and reference types, then we have to choose intersection or universal quantification. Whilst there are solutions, I find them all to complex and each has some drawback. It is this complexity that puts me off, even though it may be possible to solve. The type system needs to be simple enough to reason with, and really the simpler the better.

1 Like

Yes of course (I was going to write that for #6 but figured it was obvious), but that is not extensible and composable as a first-class union, because you can't inject new trait views of the elements of the collection with the inversion-of-control (of typing) variant of the 'D' in the SOLID principle. I will explain this in more detail in the new thread I am promising to create shortly.

Agreed, but I am not yet convinced that avoiding first-class unions and intersections is wise. I will read your reference, then respond more.

Yes, this is a good point. There are two models supported here:

  1. Traits allow the extension of types that may be put into the collection, but limit the actions that may be taken on them to those in the trait. Runtime 'boxed' traits are essentially existential types from Haskell. The reason you cannot add new functions is that you cannot know which type you are accessing (so no casting), so you are limited to the functions in the trait which are implemented on every type that wants to be inserted into the collection. Note: the functions available are not extensible because all the implementors of the trait have to implement them all, extending the trait would make all the implementations not compile (unless you provide a default implementation, but thats missing the point).
  2. Enums allow new functions to be written for accessing the data, but do not allow they types in the collection to be easily extended. They are limited in the types that can be used because every function that uses the enum has to exhaustively match every type tag in the case statement to access the types. Adding a new type requires changing the enum so that the compiler can warn you about every match that does not completely cover the available options. To fail to do this would be unsound, as the program could crash when a match does not cover the type it encounters.

If you think about these two options you can see the restrictions make sense, and that there is in fact no possible solution that can allow both extension of functions and extension of data types and be sound.

You know this, but for completeness I will state...

With a subclassing model (e.g. Scala), we could not because we can't change the declarations for i32 and Cursor. With typeclass/ad-hoc polymorphism, we can because trait implementation is orthogonal to struct declarations.

You could say subclassing gives you the worst of both worlds, no extensibility of functions because you need to change the class definitions and then all the subclasses (unless you use the visitor pattern which is what I use a lot in C++ to write algorithms over runtime polymorphic data-structures, but as soon as you want to dispatch on two arguments it gets bad as you need nested visitors and N * M virtual functions), and no extensibility of datatypes because superclasses need to be defined in the class definitions. So you cannot easily write a new function that has different behaviour depending on the type passed without touching the class definition code, and you cannot easily add a type that you did not write yourself to a collection without touching its class definition.

Correction, neither intersection nor open universally quantified typing with work with reference types in the presence of computation effects.

So it is not any two of the three, but rather that reference types don't work with the subsumption elimination of the prior two; and this restriction only applies in the presence of computation effects.

This can be more readily gleemed from the example on page 2 of the reference #3 cited in the paper you linked. The reference is subsumed from nat /\ pos to its assumed supertype nat which can accept the assignment of the value 0, but the reference can also be subsumed to its other assumed supertype pos where when it is read as the value 0 then the type system has failed and is thus unsound.

But the authors don't identify the generative essence of the problem. That is the subsumption algorithm erroneously assumes that nat /\ pos is an empty set, i.e. it is assumed that all the nominal types are disjoint. But clearly nat and pos are not disjoint and thus no nominal subsumption should be allowed. But the problem is we have no way to declare the invariants of nominal types such that we know which nominal types are disjoint.

But we do have a way to declare that all nominal types are disjoint, and that is to require they are always referenced as immutable types. If the methods of our traits always declare self (aka this) to be immutable, then the entire problem disappears.

So we can combine intersection and universal quantification with reference types, if the reference types are always immutable. Which is a much more robust solution than the restrictions of interaction of reference types and subsumption elimination and/or referenced construction of open polymorphic types proposed by the two aforementioned papers.

This doesn't mean we need to require immutability everywhere.

Does that mean higher-ranked types are off the table with mutable references? I guess the only sound solution is the one I posted using traits, as I think restricting mutability would be the wrong direction for Rust (and personally I think the wrong direction in general. Turing's introduction of memory to mathematics was a great advance and we shouldn't back away from it because our analysis tools are not good enough, rather the challenge is to improve our analysis).

As you know, you've stated the Expression Problem by Philip Wadler, one of the creators of Haskell.

Afaics, the problem can be solved by inverting the control over the binding of type to data and employing first-class unions. This will be the subject of a thread I will create as alluded to already upthread, so I can get some peer review on my idea which was only vaguely laid out in my prior comments on the Scala discussion group in 2015. The idea was first explored even less fully formed with some public (and private email) discussion with one of the creators of Kotlin in 2011. I will try to explain it more concretely. You seem to be aware of all the relevant research papers, so you can probably help identify any flaws in my idea.

Afaics, that doesn't create a new problem. The problem is more fundamental. We can't have inheritance of nominal type and mutability with references (instead of copies) if we can't prove that the Liskov Substitution Principle is not violated (in general, and especially under subsumption elimination) by the invariants of the inheritance (including intersection) hierarchy of nominal types. This is a major new insight for me. I suppose this is really what it means to say "subclassing is an anti-pattern". The paper you shared was apparently published in 2009, so perhaps this understanding is fairly new and not widely promulgated?

If we assume that subtypes never violate the invariants of supertypes, then the problem is exclusive to subsumption of intersections (either first-class unions or the itersections of inherited supertypes, e.g. multiple inherited traits such as trait PosNat : Pos, Nat), perhaps a variation of Rust's borrowing semantics could be applied to insure that only the reference to the supertype (of a subsumption eliminated reference) can be mutable. Readers should be aware by intersections we mean conjunctions, and unions mean disjunctions.

It seems that mutable references are a resource lifetimes problem. Hmm.

Edit: the complete statement of the problem w.r.t. inherited intersections (e.g. the aforementioned Rust trait example) is a bit more convoluted. The problem is if any of those types that are inherited as an intersection are open to implementations, since these implementations may not respect the invariants of the implementation of the subtype of the said intersection.

Edit#2: my intuitive leaning (opinion follows) is to forsake mutable references in safe code. Copy the instance from safe to unsafe code regions and don't bother to model the resource conflicts with typing in unsafe code. Mutable code will always be unsafe. Trying to track all this explosion in complexity and type it with borrowing seems to be pita. Use mutability only for performance optimization. Even otherwise safe immutable code can still have memory and other resource allocation conflicts so we still need some safe paradigm for that, but hopefully we can do it more higher-level with an immutable foundation.