Design patterns for composability with traits (i.e. typeclasses)?

Thanks. I will also.

The only solution I can see is immutability. But perhaps we can do it without copying the entire vector, at least for simple operations such as append. For more complex operations where the self type needs to be mutable, T can't be changed by the operation. My proposal is more amenable to List.

Well that is good, so maybe we don't need subtyping.

I remember now that long ago I had reasoned out that my idea required immutability when adding data types to the union of the collection ValueType. And I had also realized it eliminated the need for subtyping. Actually it requires there be no subtyping. I had forgotten those details, but was quickly reminded once @keean helped rediscover the issue upthread. I may have written those points publicly somewhere in the past.

Essentially afaics my proposal is combining first-class type unions, first-class trait intersections, a global (or scoped?) hash dynamic dispatch, and optional immutability to allow for first-class heterogeneous collections which are open in both dimensions of Wadler's Expression Problem, i.e. we can add new data types and new interfaces orthogonally, without recompiling or adding boilerplate to what one already has at any location in the source code. And this apparently eliminates subtyping entirely (which typeclass languages such as Haskell, Rust, and PureScript didn't have anyway).

But I wrote that only immutability can be allowed with my proposal for first-class union types. And subtyping can't be allowed and won't be required if immutability is required. So your stated problem doesn't exist in my proposal.

A boxed trait is open to extension w.r.t. the traits that can be applied to the collection without rebuilding another copy of the collection. That is, a boxed trait is open in both directions of the Expression Problem. I have already explained this several times.

You are conflating the orthogonal issues of scoping, inference, and the Expression Problem. My proposal is to enable extension in both dimensions of the Expression Problem. It is orthogonal to those other two issues. There is nothing you can propose on the other two issues which can solve the Expression Problem, because they are orthogonal issues (an abstract analysis convinces me of this orthogonality but you may need to convince yourself by trying and rejecting numerous proposals).

That betrays a misunderstanding of why the Expression Problem is fundamental. We can't solve the Expression Problem with scoping, just as we can't substitute scoping for first-class functions and retain the same composability.

The scoping you proposed only constrains when adding to the collection, but it doesn't allow (nor adequately substitute for) the degree-of-freedom of later binding the matching of data types to implementations when we want to apply the collection, or one of its element items, as the input to a function which requires a trait. You are requiring that the caller of that stated function is also the creator of that collection, or has required that trait on his caller. But that defies dependency injection, which is a fundamental golden rule of composability. Even your Sort example upthread is employing dependency injection (yet you can only vary along one dimension of the Expression Problem because you don't have my proposal implemented in any language on earth yet).

If you still retain that same stance after this post, please provide a rationale for why solving the Expression Problem is not an important problem?

The motivating example is as I had stated upthread: any time one needs a heterogeneous collection and wants composability that is open in both directions of the Expression Problem, i.e. one can call any function whose required trait the union of data types implements. Whereas, without my proposal there can't be dependency injection in terms of which traits a collection's element type (aka ValueType upthread) can implement. It is really elementary and simple. And I am quite perplexed why no one in computer science has entertained such an idea and solution.

But I wrote that only immutability can be allowed with my proposal for first-class union types. And subtyping can't be allowed and won't be required if immutability is required. So your stated problem doesn't exist in my proposal.

I don't think you understand the problem. Try typing this (what is the type of the container):

let container = Container::new();
loop {
    appendItem(container, "A");
}
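For comparison, in today's Rust that loop only typechecks once the element type is pinned down in advance; a minimal sketch of the homogeneous case:

```rust
fn main() {
    // Homogeneous case: the element type is fixed, so the loop typechecks.
    let mut container: Vec<&str> = Vec::new();
    for _ in 0..3 {
        container.push("A"); // `container` stays Vec<&str> on every iteration
    }
    assert_eq!(container, vec!["A", "A", "A"]);
    // If each append could widen the element type (e.g. to &str \/ i32),
    // the loop body would change the type of `container` on every pass,
    // which a conventional type system cannot express for an unbounded loop.
}
```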

That betrays a misunderstanding of why the Expression Problem is fundamental. We can't solve the Expression Problem with scoping, just as we can't substitute scoping for first-class functions and retain the same composability.

I think you misunderstand my proposal; the scoping limitation is to keep things reasonable. You can ignore it for now if you want to convince yourself of the equivalence of the two methods. Let's say the following:

You can extend Shape with a new trait "HasHeight" anywhere you like in the program. If a function that requires HasHeight is called on the collection, then all types that are a member of Shape have to prove they implement HasHeight (as viewed from the call site of the function).

If you agree this is equivalent I will explain why the restrictions are desirable.

The motivating example is as I had stated upthread: any time one needs a heterogeneous collection and wants composability that is open in both directions of the Expression Problem, i.e. one can call any function whose required trait the union of data types implements. Whereas, without my proposal there can't be dependency injection in terms of which traits a collection's element type (aka ValueType upthread) can implement. It is really elementary and simple. And I am quite perplexed why no one in computer science has entertained such an idea and solution.

This is an abstract motivation, not a concrete motivating example. You need to provide examples of real world programs that people are implementing and that are running into problems with this. For example pick a real problem like implementing a chess player, or a painting program, or something, anything, and start implementing it. When you find you cannot do something, or find that something is really ugly because of this problem then you have a motivating example. Either you will have lots of examples from your own programming history where this has caused you actual problems, or you can search the internet and forums and find other developers developing real programs that have found this to be a problem.

I said that appending, when it changes the union, is immutable. Thus your example code above won't typecheck, as it requires a mutable reference, but the method signature is:

Let me clarify that based on the discussion that followed that quoted example:

trait Vec<T> {
   fn appendItem(&mut self, T) -> Self<T>;
   fn appendItem<A>(&self, A) -> Self<T \/ A>;
}
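The second, widening signature can be approximated in today's Rust by mapping into a wider enum: appending an item of a new type cannot mutate the old collection in place, it must construct a new collection at the wider element type. A sketch (the `Either` type and names are illustrative, not from the proposal):

```rust
// A two-case "union" standing in for T \/ A.
#[derive(Debug, PartialEq)]
enum Either<T, A> {
    Old(T),
    New(A),
}

// Appending an item of a *new* type cannot mutate Vec<T> in place;
// it builds a fresh Vec<Either<T, A>>: the immutable, widening append.
fn append_widening<T, A>(v: Vec<T>, item: A) -> Vec<Either<T, A>> {
    let mut out: Vec<Either<T, A>> = v.into_iter().map(Either::Old).collect();
    out.push(Either::New(item));
    out
}

fn main() {
    let ints = vec![1, 2];
    let widened = append_widening(ints, "three"); // Vec<Either<i32, &str>>
    assert_eq!(widened.len(), 3);
    assert_eq!(widened[2], Either::New("three"));
}
```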

Afaics that isn't sound, because this requires the linker to be a compiler. In short, you need to compile all the linked modules at the same time, so there can't be any separate compilation of modules any more. I could write some new code that breaks code that used to compile in another module.

Even though afaics it can't be sound, I am still interested to read your thoughts about advantages.

I said that appending, when it changes the union, is immutable. Thus your example code above won't typecheck, as it requires a mutable reference, but the method signature is:

Isn't a collection you cannot change a bit useless? How can I do anything with it?

Afaics that isn't sound, because this requires the linker to be a compiler. In short, you need to compile all the linked modules at the same time, so there can't be any separate compilation of modules any more. I could write some new code that breaks code that used to compile in another module.

No, it's fine; it just needs the header files to list the types that implement each trait. The compiler automatically generates a header/interface for each file it compiles. This contains encoded versions of the type signatures of the public functions in a module (otherwise it could not type check when you import a module without re-parsing the source code). If we extend the generated interface with trait information, i.e. which types implement which traits, and which traits are bounds on which other traits, then the compiler can complete these checks in the link phase.

Edit: If you think about it, some, if not all, of this information must be there, so that when you use a module it knows which types inside that module implement which traits (whether defined in that module or another). This is done without recompiling the used modules (unless it has changed since the last compile).

In any case this is no less sound than your own suggestion which would be required to build the 'union' of types across all source modules. In effect the trait represents the union of types that implement that trait.

Remember, soundness is the property of a type system that prevents programs that go wrong at runtime from passing the type checker. So with the unions, if you have to check at runtime whether a type in the collection implements the trait, your static type system is unsound.
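The runtime-check concern can be illustrated in today's Rust with `Any` downcasts, which defer the membership check to runtime and can therefore fail there:

```rust
use std::any::Any;

fn main() {
    // A heterogeneous collection with the element types fully erased.
    let items: Vec<Box<dyn Any>> = vec![Box::new(1i32), Box::new("two")];

    // Whether an element is an i32 is only discoverable at runtime:
    assert!(items[0].downcast_ref::<i32>().is_some()); // succeeds at runtime
    assert!(items[1].downcast_ref::<i32>().is_none()); // fails at runtime;
    // the static type checker never saw either outcome coming.
}
```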

I don't know why you repeat this incorrect assumption, when I have explained upthread that this is not the case, presuming the type system does not allow subtyping, which means, per my example, that the type system must enforce immutability when adding an item that has a new data type to the first-class union of the heterogeneous collection. Perhaps you meant the case where immutability is not enforced by the type system, but as we showed upthread that wouldn't be sound, so I wouldn't create a type system that allows it.

Note that even when, instead of a first-class union of data types, we employ the conjunction of traits as the ValueType (which btw is not open in both dimensions of the Expression Problem, as we explained upthread), we still can't employ subtyping, because Self&lt;T /\ A&gt; is not a subtype of Self&lt;T&gt;:

trait Vec<T> {
   fn appendItem(&mut self, T) -> Self<T>;
   fn appendItem<A>(&self, A) -> Self<T \/ A>;
   fn appendItem<A>(&self, A) -> Self<T /\ A>;  // Note this can never be called anyway, because a conjunction of traits can't be extended: we've erased the data types, so the compiler can't check whether implementations exist in scope for the added traits
}

So subtyping will never be used in the proposed programming language. Subtyping doesn't exist in Rust now, except for the corner case of lifetimes (which my abstract intuition leads me to believe I will eventually show to be an anti-pattern model).

As I showed in my prior example, operations (including appending) which are not adding a new type to the first-class union ValueType (that is, the element type of the heterogeneous collection) will type check with a mutable reference to the collection. And as I also showed in my prior example, operations (including appending) which are adding a new type to the first-class union ValueType will not type check with a mutable reference to the collection.

Additionally, when appending an item which adds a new type to the first-class union ValueType, it is not always necessary to copy the entire collection, e.g. for a List and other types of collections which are inherently encapsulated after append w.r.t. the ValueType. It is even possible to make a DiffArray function like an array for random access, yet function like a List for immutability of the ValueType. I had posted in 2011 a rough conceptual idea which illustrates the strategies which can be employed to model mutability of arrays in an immutable context. The Clojure collections exhibit other strategies. There does appear to be at least a logarithmic performance penalty for immutability, but again note the distinction here: I am only proposing to incur that penalty when adding a new data type to the union which is the ValueType (aka element type) of the collection. So with my proposal we get extension in both directions (dimensions) of the Expression Problem, yet we don't prematurely erase the union of data types and we pay no penalty for not erasing them, until (if ever) we need to do dependency-injection extension along the data type direction of the Expression Problem by actually adding a new data type to the preexisting element type of a preexisting collection, which we couldn't otherwise do in the current typeclass languages (Rust, Haskell, and PureScript). Of course, subtyping+subclassing languages such as C++, C#, Scala, OCaml, etc. are hopeless anti-patterns.
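The List case can be sketched in today's Rust with a minimal persistent cons list: prepending shares the old spine via `Rc` instead of copying it (this sketch keeps the element type fixed; widening the union would additionally need the enum trick shown earlier):

```rust
use std::rc::Rc;

// A minimal persistent (immutable) singly-linked list. Prepending shares
// the old tail instead of copying it, which is why List-like collections
// can append without an O(n) copy of the spine.
#[derive(Debug)]
enum List<T> {
    Nil,
    Cons(T, Rc<List<T>>),
}

fn prepend<T>(tail: Rc<List<T>>, head: T) -> Rc<List<T>> {
    Rc::new(List::Cons(head, tail))
}

fn len<T>(list: &List<T>) -> usize {
    match list {
        List::Nil => 0,
        List::Cons(_, rest) => 1 + len(rest),
    }
}

fn main() {
    let base = Rc::new(List::Cons(2, Rc::new(List::Cons(3, Rc::new(List::Nil)))));
    let extended = prepend(Rc::clone(&base), 1);
    assert_eq!(len(&base), 2);     // the original list is untouched
    assert_eq!(len(&extended), 3); // the new list shares `base` as its tail
}
```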

It is difficult to explain where it is unsound because afaics you have not specified all the details such a design will require, e.g. how can the compiler convey to the linker that adding a data type to a collection was dependent on a change to the collection's element type at a particular scope. Essentially you have just converted the linker into a compiler, which was my point before. Afaics there simply isn't any way to remove the invariant that the data types that were already in the collection have been erased from the compiler's AST by the time you need to add the new trait to the conjunction which is the element type. I am 99.9% confident that when you work through the details, you will discover it is unsound and can't be fixed.

How many do you want because the list of use case examples is unbounded. :wink:

For example, the web browser's DOM has collections, and they are typed according to the known element types. Suppose we want to make a typed interface with a typeclass language (i.e. not the subtyping+subclassing anti-pattern): if we want to extend on top of the DOM with our own data types, yet be interoperable with preexisting code, we could not do it without my proposal. Of course, currently the DOM is not modeled with a typeclass language structure, so there is no such preexisting code, but I am speaking about the way to do it once I'm done replacing the web browser (my current project goal, which is why I need a new programming language).

Note again that modeling the DOM with the subtyping+subclassing anti-pattern also does not allow for extension in both directions (dimensions) of the Expression Problem.

Can we talk about some code? How about this: Rust Playground

We have 3 modules: one defines circle, one square, and one a shapes collection and processing function. We can put the circle in the shapes collection because it implements HasArea. We can easily extend with a new shape type that is defined in a separate module by defining an implementation of HasArea, all without changing any of the code in modules a, b & c. Good so far.

To extend in the other direction, let's say we want to get the colour of every shape in the collection. We can obviously define new implementations of HasColour for the shapes that need it (Circle) in the same way we did for HasArea; the problem is we are required to modify module c to change the collection trait.
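Since the Playground link's content isn't reproduced here, a single-file approximation of the three-module setup (module and type names assumed) makes the asymmetry concrete:

```rust
// Single-file approximation of the three-module example (names assumed).
mod a {
    pub struct Circle { pub radius: f64 }
}
mod b {
    pub struct Square { pub side: f64 }
}
mod c {
    pub trait HasArea { fn area(&self) -> f64; }
    // The collection's element trait is fixed here, in module c:
    pub fn total_area(shapes: &[Box<dyn HasArea>]) -> f64 {
        shapes.iter().map(|s| s.area()).sum()
    }
}

// Extending with new *types* needs no change to a, b, or c:
impl c::HasArea for a::Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
}
impl c::HasArea for b::Square {
    fn area(&self) -> f64 { self.side * self.side }
}

fn main() {
    let shapes: Vec<Box<dyn c::HasArea>> = vec![
        Box::new(a::Circle { radius: 1.0 }),
        Box::new(b::Square { side: 2.0 }),
    ];
    assert!((c::total_area(&shapes) - (std::f64::consts::PI + 4.0)).abs() < 1e-9);
    // Extending with a new *trait* (HasColour) is the sticking point:
    // Box<dyn HasArea> erased everything except HasArea, so module c's
    // collection trait must change (e.g. to a combined Shape trait).
    // That is the closed direction of the Expression Problem here.
}
```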

We could fix this by allowing 'main' to extend the trait bounds of Shape, as I suggested. This will require some form of whole-program analysis to work out the size of the Box (but I think it will be required however you try to solve this).

Perhaps it is time to think about implementation: why do we have to do this? It's because there needs to be space reserved in the box for each dictionary at runtime, so that the methods can be looked up. This works by constant offset, so the dictionary is just an array of memory addresses for an indirect jump. So we need to know how many jump tables to include in the Box, and this number needs to be fixed at compile time. How would your solution be implemented?
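That fixed-slot model can be sketched by hand in today's Rust; this is an illustration of the described layout (one jump table per trait, looked up by constant offset), not how rustc actually implements vtables:

```rust
// A hand-rolled "dictionary": an array/struct of function pointers.
struct HasAreaDict {
    area: fn(*const ()) -> f64,
}

struct Circle { radius: f64 }

fn circle_area(data: *const ()) -> f64 {
    // Recover the concrete type behind the erased pointer.
    let c = unsafe { &*(data as *const Circle) };
    std::f64::consts::PI * c.radius * c.radius
}

static CIRCLE_AREA_DICT: HasAreaDict = HasAreaDict { area: circle_area };

// A "box" pairing erased data with a fixed number of dictionary slots.
// Adding a trait later would mean adding a slot, which is why the slot
// count must be known when the box's memory layout is fixed.
struct BoxedShape {
    data: *const (),
    has_area: &'static HasAreaDict,
}

fn main() {
    let c = Circle { radius: 2.0 };
    let boxed = BoxedShape {
        data: &c as *const Circle as *const (),
        has_area: &CIRCLE_AREA_DICT,
    };
    // Dispatch: constant-offset slot lookup, then an indirect call.
    let a = (boxed.has_area.area)(boxed.data);
    assert!((a - 4.0 * std::f64::consts::PI).abs() < 1e-9);
}
```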

We need both typings of the above interface, because when the data type(s) are erased as an input to a function which requires a (possibly union of) trait interface(s), e.g. Appendable or Enumerable, then there is no way to apply the first HKT typed interface because it requires the compiler knows the data type(s), e.g. Vec. For example, where C:Enumerable<T> is a type or possibly a union of types (such as Vec<T> \/ List<T>) that implement the trait Enumerable<T> in my proposed hypothetical programming language:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> {
   fn f(prev: T, cur: T) {
      if prev == null { return cur; }
      else { return prev + cur; }
   }
   x.fold(null, f);
}
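For reference, the hypothetical `foo` can be written in today's Rust by letting `Option<T>` stand in for Nullable and any `IntoIterator` stand in for Enumerable (a sketch under those substitutions, not the proposal's typing):

```rust
use std::ops::Add;

// `Option<T>` plays the role of Nullable, `IntoIterator` of Enumerable.
fn foo<T, C>(x: C) -> Option<T>
where
    T: Add<Output = T>,
    C: IntoIterator<Item = T>,
{
    x.into_iter().fold(None, |prev, cur| match prev {
        None => Some(cur),          // first element replaces the "null" seed
        Some(p) => Some(p + cur),   // otherwise accumulate with +
    })
}

fn main() {
    assert_eq!(foo(vec![1, 2, 3]), Some(6));
    assert_eq!(foo(Vec::<i32>::new()), None); // empty input stays "null"
}
```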

Tangentially, I wish Rust would let me write the above in a more functional expression style:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> {
   x.fold(null, (prev: T, cur: T) -> if !prev: cur else prev + cur );
}

Or even more concise:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> =
   x.fold(null, (prev: T, cur: T) -> !prev ? cur : prev + cur )

But the possible encodings for the trait Add indicate the above code won't type check in my proposed hypothetical programming language. One possible encoding won't type check because T above has erased the data type(s) needed to compute Self:

trait Add {
   fn plus(&self, &Self) -> Self;
}

Tangentially note that afair a Self type can be modeled with generics in a language with subtyping lower bounds (e.g. Scala) even if Self is not a keyword:

trait Add[SELF <: Add[SELF]] {
   def +(x: SELF): SELF
}

And the other possible encoding won't type check because T above hasn't required the trait Coercion interface:

trait Add {
   fn plus<A:Coercion<Self>>(&self, &A) -> Self;
}
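This second encoding can be approximated in today's Rust with the std `Into`/`From` traits standing in for the hypothetical Coercion trait (the `Meters`/`Plus` names are illustrative):

```rust
// `Into<Self>` plays the role of Coercion<Self>: the right operand is
// converted to Self before the addition.
struct Meters(f64);

impl From<f64> for Meters {
    fn from(x: f64) -> Meters { Meters(x) }
}

trait Plus {
    fn plus<A: Into<Self>>(&self, other: A) -> Self
    where
        Self: Sized;
}

impl Plus for Meters {
    fn plus<A: Into<Meters>>(&self, other: A) -> Meters {
        Meters(self.0 + other.into().0)
    }
}

fn main() {
    // The bare f64 is coerced to Meters via Into before the addition.
    let m = Meters(1.5).plus(2.5f64);
    assert!((m.0 - 4.0).abs() < 1e-12);
}
```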

But to write the trait bound on T we need higher-ranked typing (HRT) of A over a specific type parameter T as follows, not forall A over specific scopes(1)(2)(3):

fn foo<T, A, C>(x: C) where T:Add+Nullable+Coercion<A forall T>, C:Enumerable<T>

What the above would mean is that every data type of T must implement trait Coercion<A> for every value of A that corresponds to every data type in T. Well, I am happy I was able to design above what is ostensibly the solution to this issue. Does anybody know what this sort of HRT is canonically named, and which canonically named type system it inhabits? Or have I potentially (assuming it is sound) invented an entirely new type system variant?

Tangentially note that I am contemplating whether the above proposal for handling coercions via scoped traits rather than hard-coded in the language (as is afaik the case currently for Rust) is much more flexible and correct. Rust is afaik also not enabling implementing operators with traits?

  1. High Order Function with Type Parameter - #8 by shelby3
  2. High Order Function with Type Parameter - #18 by keean
  3. Does rust really need higher kinded types? - #9 by shelby3
    (distinguishes HKT vs. HRT)

Another pedantic Rust syntax pet peeve of mine:

use std::f64::consts;
impl HasArea for Circle {
    fn get_area(&self) -> f64 {
        self.radius * self.radius * consts::PI
    }
}

I have an LL(k) grammar under development which checks so far with k=2 (i.e. two tokens of lookahead) using the SLK parser generator. This enables me to reuse the . member access operator as the scoping operator, which IMHO is much more attractive, but it does require that scoping and type names begin with an uppercase letter while instance and function names may not:

impl HasArea for Circle {
    fn get_area(&self) -> f64 {
        self.radius * self.radius * Std.F64.Consts.PI
    }
}

Or:

use Std.F64.Consts;
impl HasArea for Circle {
    fn get_area(&self) -> f64 {
        self.radius * self.radius * Consts.PI
    }
}

To my eye, it is redundant to dictate a distinction between access and scoping because the uppercase restriction in my grammar makes it obvious and also naming is more consistent. And distinguishing access to methods which don't input &self is also obvious because type names begin with an uppercase letter and instance names do not. For me, language design is an artform and I want it to be elegant like a painting. Einstein also wanted his theories to be elegant, which is one reason he resisted quantum theory (and I think he will be proven correct).

The grammar I am developing employs Python/Haskell-like indenting instead of the noisier, less consistently readable (and vertically wasteful) brace-delimited blocks, which also enables eliding the semicolons without incurring ambiguities.

Inline anonymous function syntax does exist:

let x = vec![1,2,3];
println!("{:?}", x.iter().fold(0, |sum, val| sum + val));
println!("{:?}", x.iter().fold(0, |sum, val| {
    // use {...} for multi-statement functions
    sum + val
}));

As for the ternary operator, it was removed in its infancy Remove ternary operator · Issue #1698 · rust-lang/rust · GitHub with a recently closed attempt to resurrect it Rust Needs The Ternary Conditional Operator (-?-:-) · Issue #1362 · rust-lang/rfcs · GitHub. I do favor its brevity, however.

Thanks. I was aware of that. I was also referring to needing to put return twice, having to use braces instead of =.

My personal 2-cents opinion is that eliding the -> in Rust's lambdas is confusing because it no longer visually states that it is a function with the result on the right side of the arrow. Vertical bar means bitwise-or in most C/C++/JavaScript/Java-family languages. The other use of | I've seen is in grammars to specify optional choices, so I think it might be acceptable to use | for writing union (disjunction) types instead of \/ or .

How will you determine how many dictionary slots to have in your 'box' type? I still think this reduces to the same problem as my solution using type classes, and hence I would rather not introduce union types when they don't seem necessary. I think we will have to explore the implementation to see if these two are equivalent, as I think. I already have an implementation plan. The module interface files generated by the compiler already have to include the trait bounds for traits, so we have all the information when compiling to compose bound 'extensions' at link time, it would just mean deferring the memory layout of boxed traits until all modules have been processed, and the final set of bounds accumulated.

Regarding the tangent, I think coercions are a bad thing, and I favour only explicit conversions (this also helps type inference, and results in less type annotations).

I am not bothered by ':' in type definitions. I tend to favour using lower-case for types and upper-case for type variables, like in a Prolog program, but I think this is down to personal taste. I don't like significant whitespace as it makes cutting and pasting code very difficult, and you cannot use code auto-layout tools to fix someone else's code to meet your own preferred styling.

However, I suspect any 'logical' arguments here are post-rationalisation. The great thing about creating your own language is you can do everything exactly as you want. The bad thing is that it splits the user base of existing languages. This is what led to the creation of Haskell: originally there were many competing individual lazy-pure-functional languages, so all the different authors got together in a committee and created Haskell. I suspect that nobody is completely happy with the design of Haskell, but everyone can live with the compromises, and it results in a bigger community and a common platform for further research.

I also think imperative code is easier to read and more understandable than the functional style. This is due to the objects having stable types, and visible indexing. The functional style of chaining maps and folds results in invisible intermediate types that make reading someone else's code much harder. I call it "write only code" :slight_smile: As debugging is harder than writing code, and as code will be in maintenance for longer than it is in development, I think we should optimise code for readability and understandability. If you use all of your intelligence to write the code, then because debugging is harder, you will by definition not be clever enough to debug it. Complexity is the problem, and the solution is that we should strive for simplicity. Here I mean simplicity of concept, not simplicity of grammar or less typing.


If you use lowercase for types, then you can't distinguish them from instance (aka variable) names except by context. In C++/Java/Scala, type parameters (aka type variables) are usually all uppercase, and type names are camel case with the first letter uppercase. Variable (and function/method) names are usually lowercase, or first-letter-lowercase camel case for function names.

C was a very good language because it had a minimum of syntax. Symbol soup looks noisy. The dot operator can be thought of as narrowing/accessing the type, so it makes sense to use the dot operator both for type member access (since those members are part of the structural type, even if we aren't employing structural typing elsewhere in the language) and for Type scoping, which is narrowing/accessing the nominal type.

Not necessarily. Did Python split any existing user base? Ruby's? Rather I think Python identified a niche for readable consistency.

If someone identifies a niche and hits that niche on the target, they can succeed.

So the key is identifying Rust's niche and focus on it.

Haskell's niche is apparently people who want to code like math. It isn't just that it is a functional language, but the lazy evaluation which enables expressing infinite series and other math.

I agree about "write only" code and as I wrote in another post today, this is likely because of Haskell's global type inference (which btw prevents a first-class union and thus Haskell can never implement my proposal in this thread).

Btw, afaics functional programming can be readable by annotating the code with the types. Afaics, readability can be higher in a functional style, if done well.

P.S. It is Sunday and I have an obligation to be at a beach party, so further replies will be delayed.

I don't disagree with this, but I think I want to distinguish between type variables and types too. Most of my coding does not involve concrete types, so having type variables distinguishable seems better to me.

I think they both took users away from Perl :slight_smile: There are only so many programmers so new languages must take away from somewhere.

Well, more different types in an expression is worse, and having to add type annotations is worse. It also doesn't solve the indexing problem: having variables for index lookup is much clearer than maps and folds, especially when you have more than one index in play. See my permutations example in JavaScript I posted.

So to clarify, you now know you don't have to write return even once, as in my second example? In case that wasn't clear, omitting the final semicolon on any block returns the value of the last statement. E.g. equivalent to your first version:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> {
   fn f(prev: T, cur: T) {
      if prev == null { cur } else { prev + cur }
   }
   x.fold(null, f);
}

Omitting return when it's not needed is the preferred convention, I believe.

Rust's lambda syntax, I believe, was inspired by Ruby's block syntax, though it's slightly different from it (in Ruby, the |...| goes inside the braces, whereas in Rust it's outside.) Rust used to emulate Ruby's block-passing syntax but by using the do keyword. This syntax was removed in RFC: remove `do` · Issue #10815 · rust-lang/rust · GitHub