High Order Function with Type Parameter

Note that the following:

trait NatPos: Nat + Pos {

means that any type that wants to implement NatPos has to implement both Nat and Pos first.
This means that the hypothetical instance 0 whose type only implements Nat but not Pos cannot be turned into a trait object of type, e.g. Box<NatPos> because its type doesn't implement NatPos. I cannot see any type unsafety going on here.

Apology I code in my head a lot being lazy (or too rushed to slow down) to write it down. I thought my description was clear, but I guess code is more comprehensible.

trait Nat {
   fn set(&self, u32) -> (); // accepts any unsigned value
   fn get(&self) -> u32;     // returns any unsigned value

trait Pos {
   fn set(&self, u32) -> (); // does not accept the 0 value input
   fn get(&self) -> u32;     // must only return positive integers > 0

trait NatPos: Nat + Pos;

fn main() {
   let mut x : NatPos = 1; // constructors were elided because my knowledge of Rust syntax is spotty and I am lazy to make a new method, but you get the pseudocode idea
   let z : Pos = &x;
      let mut y : Nat = &x;
   fprint("Positive integer: {}", z.get()); // error! prints "Positive integer: 0"

Hope from my code example you can now see that the problem of how subsumption with mutable access via sharing reference violates your assumption of safety.

Your example is flawed as it mixes up the concepts of types and traits.
Traits are comparable to Haskell's type-classes and can only be implemented on types, not on specific instances.

The question whether 0 is Nat or Pos doesn't make sense because 0 is a value and not a type. What makes sense is a question like "Does u32 implement the trait Nat or the trait Pos?"

This discussion would greatly benefit from an example that actually compiles and clearly shows type unsafety. Otherwise, we're just chasing our own tails.

I elided the implementation of instances, because (at a high level of abstract understanding) it isn't necessarily to visualize the problem.

Afaics, that is irrelevant to the general problem at-hand. The issue is that we end up with a shared mutable reference to two incompatible (i.e. not disjoint) implementations operating on the same data (because the invariants stated in the comments of the code can't be checked by the compiler and thus the compiler doesn't know that implementations of Nat and Pos are not disjoint and thus the compiler doesn't know that it shouldn't allow subsumption of NatPos to Nat and Pos separately).

It is clear to me already. But if you need more explicitness of code in order to visualize it, I understand. I will endeavor to produce a compilable example. Please wait.

Although your syntax is a bit off, it is this operation that is impossible. You cannot 'cast out' of a trait-object. Once you put a type in, you can never get it out again. All you can do is call the functions of the traits it implements on it. Read up on "existential types" in Haskell :slight_smile:

I will verify that shortly with a compiled code example.

If you are correct that subsumption is not possible in Rust (which may be true because I read some where that the only subtyping supported is for lifetimes), then can you explain why my example is erroneous for Java, Scala, and C++? Are you claiming they do not allow subsumed assignment?

I wasn't implicating Haskell. I know it doesn't have intersection inheritance subtyping, which is obvious since it puts Bottom at the top of the hierarchy thus it has Filinski sums but not products (or maybe it is vice versa, I forget the terminology at the moment).

Edit: and afaics if Rust doesn't have subsumption subtyping that is going to cause a cascade of issues with composability. But those are details we will have to sort out, as I think we are by analyzing use cases and how to code them in Rust. If I am not mistaken, you seem to be implying that we can model the necessary composability using concepts from OOHaskell.

There are two ways to safely implement Pos for a type backed by a u32. One is to use a nominal type with a private field, and enforce the invariant in the set method. In this case, it would be incorrect to implement Nat for the same type, or otherwise allow less-restricted access to the type's private data:

struct Pos32(u32);

impl Pos for Pos32 {
   fn set(&mut self, val: u32) { assert!(val > 0); self.0 = val; }
   fn get(&self) -> u32 { self.0 }

If the type's data is public, on the other hand, it would need to enforce the Pos invariant in Pos::get:

impl Pos for u32 {
   fn set(&mut self, val: u32) { *self = val; }
   fn get(&self) -> u32 { if *self > 0 { *self } else { 1 } } // or this could simply panic

Here is a starting point. I believe that keeans analysis is right. Note that Nat has to be a struct (a concrete type). Otherwise you cannot initialize x.


EDIT: Or at least you haver to define a concrete type that implements Nat (as proposed by mbrubeck)

EDIT 2: Perhaps the following is closer to what you are looking for

@keean is correct that Rust doesn't allow subsumption.

Playground URL: Rust Playground
Gist URL: https://gist.github.com/637ad297456f9d85b7e4c46fa99bf3f5

But other popular languages ostensibly do, so afaics there are two remaining issues to sort out:

  1. Do the other languages which offer subsumption (for references to mutable instances) exhibit the unsafety I have alleged?

  2. What composability and extension does Rust lose by not allowing any subtyping? How will we express the various use cases? Must we employ OOHaskell idioms to get the functionality we need for use cases?

Afaics, that is not compile-time safety. That is compile-time unsafety and a runtime bug. You can't fix all cases even after N occurrences of the bug, because you don't know where the bugs are in the code at compile-time. And thus this wouldn't help us for C++, Java, Scala, etc.. Afaics, the cited research paper doesn't allow/admit that runtime check as a solution to the problem to the compile-time unsafety.

Now we know the issue doesn't arise in Rust, because Rust's typing system does not allow any subtyping and thus no subsumption. But we need to sort out what are the (positive and) negative implications of the lack of first-class subtyping, if any. @keean appears to be suggesting we can model subtyping with OOHaskell paradigms.

This reminds me a lot of this old thread: Rust type system and the Circle-ellipse problem - language design - Rust Internals

One comment there noted that the subtyping is essentially reversed for the mutable versus immutable methods. That is, if you split each of these traits in two (GetPos and SetPos, plus GetNat and SetNat) then GetPos can be a subtype of GetNat (i.e. it is safe to coerce a GetPos to a GetNat), while SetNat can be a subtype of SetPos (it is safe to coerce a SetNat to a SetPos).

Coming back to this now, I realize there is no way that we could have subsumption with Rust's trait objects because NatPos does not implement Nat and Pos separately, because a trait can't implement a trait (which is essentially equivalent to @troiganto's comment).

I was rushing when I wrote down that code, and on reflection obviously a trait object NatPos can't be cast to an implementaion of struct Unsigned for traits Nat or Pos, because the Rust compiler does not know which implementation of NatPos the reference points to (this is only known at runtime for the trait objects dictionary).

Such can only be possible if Nat and Pos are classes in an OOP language (or traits in Scala) which can contain their own implementation. Perhaps it is my familiarity with Scala and its apparent conflation of trait with classes (in a subclassing model that linearizes trait implementations) that may have caused my mental mode to be incorrect about expectation for traits in Rust. I do note that Rust has default methods for traits though, so in theory Nat and Pos could have their own implementations (which may have been another factor that caused me to conflate them with Scala traits semantics), but nevertheless Rust doesn't allow casting a trait object from NatPos to either Nat or Pos.

So I think I can now answer my own question #1 by concluding that subclassing languages (C++, Java, Scala, etc) appear to have this problem that I outlined and it is a well known as the Circle-Ellipse (aka Square-Rectangle) violation of the Liskov Substitution Principle (LSP) due to mutable references and subsumption elimination. Subsumption elimination occurs when casting or assigning (i.e. coercing) NatPos to one of its base types in a subclassing language. The Circle-Ellipse problem is due to the ability to mutate a Circle with the Ellipse's interface.

What do we do when we have a NatPos trait object and a function requires a Pos or a Nat trait object input?

If we implement our data type on both Nat and Pos, then if we have a reference to one, we can't cast it to a reference to the other. That is good because it avoids the bug where Nat has a value of 0 as I explained upthread. But consider Read and ReadWrite traits instead of Nat and Pos. If we have a ReadWrite trait object, we can't call a function that expects a Read trait object. This is the motivation for intersections, so that Read and ReadWrite are subtypes of Read /\ Write.

Robert Harper wrote:

There are two fundamental problems with type classes. The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference. As a consequence, using type classes is, in Greg Morrisett’s term, like steering the Queen Mary: you have to get this hulking mass pointed in the right direction so that the inference mechanism resolves things the way you want it to.

Reactions to Robert Harper's points.

Thus subtyping with intersections appear to be necessary for functional composition, but they open the Pandora's box to the problem I have explained. I am curious to hear if anyone has a solution that provides both composition and avoids the problem of violations of LSP that I have explained?

The only solution I currently envision is immutable references (Copy types could still be mutable) combined with subtyping and intersections. Then I want to consider making the dictionaries for the trait objects orthogonal to the objects, so that it is possible to track data types at compile-time and supply the dictionaries as a separate inputs to the function, which I think may be a complete solution to the Expression Problem (but I may also be mistaken). This is the new thread I was contemplating to write, but I am also trying to better understand other strategies that have been mentioned such as OOHaskell.

I am incorporating what I want to respond on that thread into my statements above.

Although I didn't really digest all that discussion, that appears to be the reversal of the relationship between covariance and contravariance (given by the LSP) given by relative direction of inheritance for the nominal type and the methods of the nominal type. This reversal occurs again for each parameter of a method and again for each parameter of a parameter, etc..

Haskell can "solve" this with structural typing, e.g. use a type parameter without a Read lower bound for the function input argument, so that Read or ReadWrite will match the methods applicable to a Read which are employed within the function.

Although Rust has unbounded generic function type parameters, it apparently doesn't support structural typing on unbounded function type parameters.

I don't see the problem. You have a Read trait, a Write trait, and a ReadWrite trait (which is simply defined : Read + Write). Note ReadWrite defines no new methods.

Functions specify the traits they require, and objects implement the ones they provide. So a function for reading data requires Read. A function that writes requires Write, and a function that does both can require ReadWrite.

Types are implemented for the functionality they provide, so a read only file provides Read, a write only file provides Write, and a mutable file provides both Read and Write (note you cannot provide ReadWrite without providing Read and Write that is what adding bounds to a trait means the same as it means for a function, and ReadWrite itself provides no new functionality).

So you can read from a read only file and not write, as expected. Personally I blame object-oriented languages for making this seem hard, when it should be easy :slight_smile:

We don't really care about whether typing is nominal or structural, as for a generic a type is simply a parameter, so:

f<A>(a : A) -> A where A  : Read

Is a function on all A that implement the Read trait, and it actually makes no difference if A is typed nominally or structurally as all we care about is that it implements Read. This can be checked eagerly (like Rust) or lazily (like C++) but that does not affect the meaning, only when errors get reported. I am sure you are aware Rust's strict type checking produces much better error messages than C++'s templates do with their duck-typing.

Afaics, I already stated why that isn't a solution:

Let me reword that first sentence to make it more clear that afaics it refutes your assumption: If we implement our data type on both Read and Write, then if we have a trait object reference to one, we can't cast it to a trait object reference to the other.

In other words, your conceptualization only works when the caller is in possession of the struct but not when the caller only has a reference to the trait object. Since Rust can't do subsumption elimination, it is unable to coerce a trait object ReadWrite to a Read or a Write.

The problem is that if we've only got a trait object Write or ReadWrite, then we can't coerce to a Read. Thus the only way to match the function is structurally by methods. This is ostensibly how Haskell "works" around the problem, but you are more expert in Haskell than me so please correct me if I'm mistaken.

Here's a proof that, as @keean claims, a ReadWrite trait object can be coerced to both Read and Write (Playground link):

// The traits in question.

trait Read {
    fn read(&self);

trait Write {
    fn write(&mut self);

// Only those types may implement ReadWrite
// that implement both Read and Write.
trait ReadWrite: Read + Write {}

// Our data type.
struct SomeData;

// Trait implementations.

impl Read for SomeData {
    fn read(&self) {
        println!("read called");

impl Write for SomeData {
    fn write(&mut self) {
        println!("write called");

impl ReadWrite for SomeData {}

// And now using all that.

fn main() {
    // Get trait object.
    let mut trait_object: Box<ReadWrite> = Box::new(SomeData);
    // Implicit cast to Read.
    // Implicit cast to Write.

Note that, while ReadWrite trait objects can be coerced to both Read and Write, it is obvious that Read trait objects cannot be coerced to Write nor vice versa.

I hope this is what you're talking about, @shelby3. If not, a code sample of what situation you're visualizing would be really helpful for me.

That is proof that Rust can do structural typed coercion (matching the methods of a Read or Write), as I said is possible in Haskell. But that is not a coercion from ReadWrite trait object to Read or Write trait object as shown in the following code example, which afaics has significant deleterious implications on composability and extensibilty.

Playground URL: Rust Playground
Gist URL: Shared via Rust Playground · GitHub

P.S. I am also assuming Haskell can do the following structural typing which Rust apparently can't do (because Haskell has global type inference and Rust doesn't):

Playground URL: Rust Playground
Gist URL: https://gist.github.com/ac062158ecb93798285455a6e889ebb4

I'm having a hard time understanding exactly what is meant by the various examples (and haven't read the papers yet), but a few points:

  • Don't forget the borrow checking rules! You can only have one mutable reference to an object at a time, and no immutable references at the same time as a mutable reference, although this can be worked around with e.g. RefCell. I'm not sure whether this actually helps address the purported unsafety, but various examples here trivially violate it.
  • A reference to a trait object can never change its true underlying type (this is impossible as it could require changing the amount of storage required) - however, it is possible to get back a (mutable or immutable) reference to the original type using Any or a custom method, and casting to a supertrait trait object can also be implemented with a custom method (e.g. putting fn cast_to_pos(&self) -> &Pos in NatPos). You can also of course do things like change the variant of a sum type, or change the referent of a pointer to a trait object.
  • edit: This also means that an object cannot be mutated in a way that changes the set of traits it implements, as that would require changing the underlying type.
  • This example code fundamentally doesn't make sense because impl Pos for Unsigned is a claim that all Unsigned objects support the Pos interface, when in fact they can be zero. Nat and NatPos have nothing to do with it.

To fix this example you basically just need to add trait bounds to the type parameters of read and write:

fn read<A: Read + ?Sized>(x:Box<A>) -> () { x.read(); }
fn write<A: Write + ?Sized>(mut x:Box<A>) -> () { x.write(); }

There's another minor issue which is that you can't call both read and write on the same object, as Box<T> is a linear (well, affine) type and, as written, those functions consume the object. You probably want to use references (i.e. borrowing) instead.

1 Like

As @comex points out, you need to apply the trait bounds, and also avoid the issue with using a moved value. I've put the full example up in a Gist on playpen so you can see it: Rust Playground

Of course, if you want to reuse the trait object in two calls, you would use a borrowed trait object rather than an owned trait object. This happens to be a case in which you can't use deref coercions, so you have to write out the somewhat ugly &* to get a reference to the value contained in the Box: Rust Playground