High Order Function with Type Parameter

There are two ways to safely implement Pos for a type backed by a u32. One is to use a nominal type with a private field, and enforce the invariant in the set method. In this case, it would be incorrect to implement Nat for the same type, or otherwise allow less-restricted access to the type's private data:

struct Pos32(u32);

impl Pos for Pos32 {
   fn set(&mut self, val: u32) { assert!(val > 0); self.0 = val; }
   fn get(&self) -> u32 { self.0 }
}

If the type's data is public, on the other hand, it would need to enforce the Pos invariant in Pos::get:

impl Pos for u32 {
   fn set(&mut self, val: u32) { *self = val; }
   fn get(&self) -> u32 { if *self > 0 { *self } else { 1 } } // or this could simply panic
}

Here is a starting point. I believe that keeans analysis is right. Note that Nat has to be a struct (a concrete type). Otherwise you cannot initialize x.

https://gist.github.com/b1354afa1220d057176ad3639cf12498

EDIT: Or at least you haver to define a concrete type that implements Nat (as proposed by mbrubeck)

EDIT 2: Perhaps the following is closer to what you are looking for

@keean is correct that Rust doesn't allow subsumption.

Playground URL: Rust Playground
Gist URL: Shared via Rust Playground · GitHub

But other popular languages ostensibly do, so afaics there are two remaining issues to sort out:

  1. Do the other languages which offer subsumption (for references to mutable instances) exhibit the unsafety I have alleged?

  2. What composability and extension does Rust lose by not allowing any subtyping? How will we express the various use cases? Must we employ OOHaskell idioms to get the functionality we need for use cases?

Afaics, that is not compile-time safety. That is compile-time unsafety and a runtime bug. You can't fix all cases even after N occurrences of the bug, because you don't know where the bugs are in the code at compile-time. And thus this wouldn't help us for C++, Java, Scala, etc.. Afaics, the cited research paper doesn't allow/admit that runtime check as a solution to the problem to the compile-time unsafety.

Now we know the issue doesn't arise in Rust, because Rust's typing system does not allow any subtyping and thus no subsumption. But we need to sort out what are the (positive and) negative implications of the lack of first-class subtyping, if any. @keean appears to be suggesting we can model subtyping with OOHaskell paradigms.

This reminds me a lot of this old thread: Rust type system and the Circle-ellipse problem - language design - Rust Internals

One comment there noted that the subtyping is essentially reversed for the mutable versus immutable methods. That is, if you split each of these traits in two (GetPos and SetPos, plus GetNat and SetNat) then GetPos can be a subtype of GetNat (i.e. it is safe to coerce a GetPos to a GetNat), while SetNat can be a subtype of SetPos (it is safe to coerce a SetNat to a SetPos).

Coming back to this now, I realize there is no way that we could have subsumption with Rust's trait objects because NatPos does not implement Nat and Pos separately, because a trait can't implement a trait (which is essentially equivalent to @troiganto's comment).

I was rushing when I wrote down that code, and on reflection obviously a trait object NatPos can't be cast to an implementaion of struct Unsigned for traits Nat or Pos, because the Rust compiler does not know which implementation of NatPos the reference points to (this is only known at runtime for the trait objects dictionary).

Such can only be possible if Nat and Pos are classes in an OOP language (or traits in Scala) which can contain their own implementation. Perhaps it is my familiarity with Scala and its apparent conflation of trait with classes (in a subclassing model that linearizes trait implementations) that may have caused my mental mode to be incorrect about expectation for traits in Rust. I do note that Rust has default methods for traits though, so in theory Nat and Pos could have their own implementations (which may have been another factor that caused me to conflate them with Scala traits semantics), but nevertheless Rust doesn't allow casting a trait object from NatPos to either Nat or Pos.

So I think I can now answer my own question #1 by concluding that subclassing languages (C++, Java, Scala, etc) appear to have this problem that I outlined and it is a well known as the Circle-Ellipse (aka Square-Rectangle) violation of the Liskov Substitution Principle (LSP) due to mutable references and subsumption elimination. Subsumption elimination occurs when casting or assigning (i.e. coercing) NatPos to one of its base types in a subclassing language. The Circle-Ellipse problem is due to the ability to mutate a Circle with the Ellipse's interface.

What do we do when we have a NatPos trait object and a function requires a Pos or a Nat trait object input?

If we implement our data type on both Nat and Pos, then if we have a reference to one, we can't cast it to a reference to the other. That is good because it avoids the bug where Nat has a value of 0 as I explained upthread. But consider Read and ReadWrite traits instead of Nat and Pos. If we have a ReadWrite trait object, we can't call a function that expects a Read trait object. This is the motivation for intersections, so that Read and ReadWrite are subtypes of Read /\ Write.

Robert Harper wrote:

There are two fundamental problems with type classes. The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference. As a consequence, using type classes is, in Greg Morrisett’s term, like steering the Queen Mary: you have to get this hulking mass pointed in the right direction so that the inference mechanism resolves things the way you want it to.

Reactions to Robert Harper's points.

Thus subtyping with intersections appear to be necessary for functional composition, but they open the Pandora's box to the problem I have explained. I am curious to hear if anyone has a solution that provides both composition and avoids the problem of violations of LSP that I have explained?

The only solution I currently envision is immutable references (Copy types could still be mutable) combined with subtyping and intersections. Then I want to consider making the dictionaries for the trait objects orthogonal to the objects, so that it is possible to track data types at compile-time and supply the dictionaries as a separate inputs to the function, which I think may be a complete solution to the Expression Problem (but I may also be mistaken). This is the new thread I was contemplating to write, but I am also trying to better understand other strategies that have been mentioned such as OOHaskell.


I am incorporating what I want to respond on that thread into my statements above.

Although I didn't really digest all that discussion, that appears to be the reversal of the relationship between covariance and contravariance (given by the LSP) given by relative direction of inheritance for the nominal type and the methods of the nominal type. This reversal occurs again for each parameter of a method and again for each parameter of a parameter, etc..

Haskell can "solve" this with structural typing, e.g. use a type parameter without a Read lower bound for the function input argument, so that Read or ReadWrite will match the methods applicable to a Read which are employed within the function.

Although Rust has unbounded generic function type parameters, it apparently doesn't support structural typing on unbounded function type parameters.

I don't see the problem. You have a Read trait, a Write trait, and a ReadWrite trait (which is simply defined : Read + Write). Note ReadWrite defines no new methods.

Functions specify the traits they require, and objects implement the ones they provide. So a function for reading data requires Read. A function that writes requires Write, and a function that does both can require ReadWrite.

Types are implemented for the functionality they provide, so a read only file provides Read, a write only file provides Write, and a mutable file provides both Read and Write (note you cannot provide ReadWrite without providing Read and Write that is what adding bounds to a trait means the same as it means for a function, and ReadWrite itself provides no new functionality).

So you can read from a read only file and not write, as expected. Personally I blame object-oriented languages for making this seem hard, when it should be easy :slight_smile:

We don't really care about whether typing is nominal or structural, as for a generic a type is simply a parameter, so:

f<A>(a : A) -> A where A  : Read

Is a function on all A that implement the Read trait, and it actually makes no difference if A is typed nominally or structurally as all we care about is that it implements Read. This can be checked eagerly (like Rust) or lazily (like C++) but that does not affect the meaning, only when errors get reported. I am sure you are aware Rust's strict type checking produces much better error messages than C++'s templates do with their duck-typing.

Afaics, I already stated why that isn't a solution:

Let me reword that first sentence to make it more clear that afaics it refutes your assumption: If we implement our data type on both Read and Write, then if we have a trait object reference to one, we can't cast it to a trait object reference to the other.

In other words, your conceptualization only works when the caller is in possession of the struct but not when the caller only has a reference to the trait object. Since Rust can't do subsumption elimination, it is unable to coerce a trait object ReadWrite to a Read or a Write.

The problem is that if we've only got a trait object Write or ReadWrite, then we can't coerce to a Read. Thus the only way to match the function is structurally by methods. This is ostensibly how Haskell "works" around the problem, but you are more expert in Haskell than me so please correct me if I'm mistaken.

Here's a proof that, as @keean claims, a ReadWrite trait object can be coerced to both Read and Write (Playground link):

// The traits in question.

trait Read {
    fn read(&self);
}

trait Write {
    fn write(&mut self);
}

// Only those types may implement ReadWrite
// that implement both Read and Write.
trait ReadWrite: Read + Write {}

// Our data type.
struct SomeData;

// Trait implementations.

impl Read for SomeData {
    fn read(&self) {
        println!("read called");
    }
}

impl Write for SomeData {
    fn write(&mut self) {
        println!("write called");
    }
}

impl ReadWrite for SomeData {}

// And now using all that.

fn main() {
    // Get trait object.
    let mut trait_object: Box<ReadWrite> = Box::new(SomeData);
    // Implicit cast to Read.
    trait_object.read();
    // Implicit cast to Write.
    trait_object.write();
}

Note that, while ReadWrite trait objects can be coerced to both Read and Write, it is obvious that Read trait objects cannot be coerced to Write nor vice versa.

I hope this is what you're talking about, @shelby3. If not, a code sample of what situation you're visualizing would be really helpful for me.

That is proof that Rust can do structural typed coercion (matching the methods of a Read or Write), as I said is possible in Haskell. But that is not a coercion from ReadWrite trait object to Read or Write trait object as shown in the following code example, which afaics has significant deleterious implications on composability and extensibilty.

Playground URL: Rust Playground
Gist URL: Shared via Rust Playground · GitHub

P.S. I am also assuming Haskell can do the following structural typing which Rust apparently can't do (because Haskell has global type inference and Rust doesn't):

Playground URL: Rust Playground
Gist URL: Shared via Rust Playground · GitHub

I'm having a hard time understanding exactly what is meant by the various examples (and haven't read the papers yet), but a few points:

  • Don't forget the borrow checking rules! You can only have one mutable reference to an object at a time, and no immutable references at the same time as a mutable reference, although this can be worked around with e.g. RefCell. I'm not sure whether this actually helps address the purported unsafety, but various examples here trivially violate it.
  • A reference to a trait object can never change its true underlying type (this is impossible as it could require changing the amount of storage required) - however, it is possible to get back a (mutable or immutable) reference to the original type using Any or a custom method, and casting to a supertrait trait object can also be implemented with a custom method (e.g. putting fn cast_to_pos(&self) -> &Pos in NatPos). You can also of course do things like change the variant of a sum type, or change the referent of a pointer to a trait object.
  • edit: This also means that an object cannot be mutated in a way that changes the set of traits it implements, as that would require changing the underlying type.
  • This example code fundamentally doesn't make sense because impl Pos for Unsigned is a claim that all Unsigned objects support the Pos interface, when in fact they can be zero. Nat and NatPos have nothing to do with it.

To fix this example you basically just need to add trait bounds to the type parameters of read and write:

fn read<A: Read + ?Sized>(x:Box<A>) -> () { x.read(); }
fn write<A: Write + ?Sized>(mut x:Box<A>) -> () { x.write(); }

There's another minor issue which is that you can't call both read and write on the same object, as Box<T> is a linear (well, affine) type and, as written, those functions consume the object. You probably want to use references (i.e. borrowing) instead.

1 Like

As @comex points out, you need to apply the trait bounds, and also avoid the issue with using a moved value. I've put the full example up in a Gist on playpen so you can see it: Rust Playground

Of course, if you want to reuse the trait object in two calls, you would use a borrowed trait object rather than an owned trait object. This happens to be a case in which you can't use deref coercions, so you have to write out the somewhat ugly &* to get a reference to the value contained in the Box: Rust Playground

Afaics, the issue I raised after reading the research paper @keean introduced, has nothing to do with resource lifetimes. The issue is about intersection of types and subsumption elimination.

Agree, but afaics the issue here is whether we can subsume (coerce) a mutable reference with the type Read /\ Write (aka in Rust trait ReadWrite : Read + Write) to a mutable reference with the type Read or Write.

Okay great! I did:

Playground URL: Rust Playground

So this tells me Rust can do subsumption elimination contrary to what I thought @keean had shown based on my prior example code. So now I need to return to my prior argument that mutable referenced subsumption elimination with intersection is unsafe. So I need to compose a new code example to show that, now that I understand how to make it compile.

That is the entire point. That the compiler allows to coerce a mutable reference for a trait object with type NatPos to type Nat to set a 0 value on a Nat, then coerce the original NatPos it to a Pos and read a value of 0 out of the Pos reference. I need to construct a code example to show that unsafety.

Are you claiming an unsigned is not a postive integer. I understand you are complaining (as if it is my fault) about the infamous Circle-Ellipse and Square-Rectangle dilemma of mutably referenced subtyping as I explained upthread. I am confident you will now agree it is just a well known dilemma and not my mistake. As I explained upthread, the unsafety can only be avoided by enforcing immutability. I am contemplating that is why Haskell doesn't have this unsafety.


As I've indicated previously, resource lifetime annotation is a feature I am currently not interested in and am avoiding learning and including in my code examples. Whether I will be later convinced of its utility is still an open question for me. I am more focused on the typing system orthogonal to the lifetimes at least for now.

I don't think you have shown any unsafety. I think you are explaining something about types and trying to apply it to constraints on types (traits), which seems like a category error to me.

If I have a Box I can only put things in that implement both Read and Write. The type of the thing in the box never changes, in fact it cannot change. All you are doing is looking at the same type through different 'lenses'.

Afaics, this is proof that Rust has the same unsafety as shown in the research paper @keean cited.

Playground URL: Rust Playground

The invariants are violated same as in the example shown in the research paper you cited. What is the distinction you see?

There are no type coercions happening here.

Okay I have looked at the code and there is nothing surprising, you are trying to enforce a constraint on writing to a type that provides two interfaces, what do you expect? You can obviously write using either interface, and you don't need anything that complex to do it.

Subsumption elimination has been applied to the intersection type Nat + Pos.

Because Nat and Pos are orthogonal interfaces, which is entirely the means of composability. I expect that the compiler can't check that my invariants are disjoint, thus can't insure safety of my invariants.

You can't without a coercion to subsume from the intersection type Nat + Pos to the orthogonal types Nat and Pos of the intersection. The problem is as I stated from the start of the discussion upthread: the assumption of orthogonality is erroneous because the compiler has no way to check that the invariants of Nat and Pos are disjoint.

There is a type SomeData that is u32, and you wrote a 0 to it legitimately using write from the Nat interface, and then read it using read it using read from the Pos interface. You implemented both interfaces on the type, so its working as expected? The question is can a single thing be both Pos and Nat? I don't see unsoundness here, I just see a bad categorisation.