Design patterns for composability with traits (i.e. typeclasses)?


#52

I said appending when changing the union is immutable. Thus your example code above won’t typecheck as it requires a mutable reference but the method signature is:

Isn’t a collection you cannot change a bit useless? How can I do anything with it?

Afaics that isn’t sound, because this requires the linker to be a compiler. In short, you need to compile all the linked modules at the same time. There can’t be any more modularity in code. I could write some new code that breaks code that used to compile in another module.

No, its fine, it just needs the header files to list the types that implement each trait. The compiler automatically generates a header/interface for each file it compiles. This contains encoded versions of the type signatures of the public functions in a module (otherwise it could not type check when you import a module without repassing the source code). If we extend the interface generated with trait information, which types implement which traits, and which traits are bounds on which other traits, then the compiler can complete these checks in the link phase.

Edit: If you think about it, some, if not all, of this information must be there, so that when you use a module it knows which types inside that module implement which traits (whether defined in that module or another). This is done without recompiling the used modules (unless it has changed since the last compile).

In any case this is no less sound than your own suggestion which would be required to build the ‘union’ of types across all source modules. In effect the trait represents the union of types that implement that trait.

Remember soundness is the property of a type system to prevent programs that go wrong at runtime from passing the type checker. So with the unions, if you have to check whether a type in the collection implements the trait at runtime, your static type system is unsound.


#53

I don’t know why you repeat this incorrect assumption, when I have explained upthread that this is not the case presuming the type system is not allowing subtyping, which means per my example then the type system must enforce immutability when adding an item that has a new data type to the first-class union of the heterogeneous collection. Perhaps you meant in the case that immutability is not enforced by the type system, but as we showed upthread that wouldn’t be sound, so I wouldn’t create a type system that allows it.

Note even when instead of a first-class union of data types, we employ the conjunction of traits as the ValueType (which btw is not open to both dimensions of the Expression Problem as we explain upthread) then still can’t employ subtyping because Self<T /\ A> is not a subtype of Self<T>:

trait Vec<T> {
   fn appendItem(&mut self, T) -> Self<T>;
   fn appendItem<A>(&self, A) -> Self<T \/ A>;
   fn appendItem<A>(&self, A) -> Self<T /\ A>;  // Note this can't never be called any way, bcz a conjunction of traits can't be extended bcz we've erased the data types so compiler can't enforce if implementations exist in scope for the added traits
}

So subtyping will never be used in the proposed programming language. Subtyping doesn’t exist in Rust now, except for the corner case of lifetimes (which my abstract intuition leads me to believe I will eventually show Rust’s lifetimes is an anti-pattern model).

As I showed in my prior example, operations (including appending) which are not adding a new type to the first-class union VaueType (that is the element type of the heterogeneous collection), will type check with a mutable reference to the collection. And as I also showed in my prior example, operations (including appending) which are adding a new type to the first-class union VaueType, will not type check with a mutable reference to the collection.

Additionally, when appending an item which adds a new type to the first-class union VaueType, it is not required to not copy the entire collection in some cases of collection types, e.g. a List and other types of collections which are inherently encapsulated after append w.r.t. to the ValueType. It is even possible to make a DiffArray function like an array for random access, yet make it function like a List for immutability of ValueType. I had posted in 2011 a rough conceptual idea which sort of illustrates the conceptual strategies which can be employed to model mutability of arrays in an immutable context. The Clojure collections exhibit other strategies. There does appear to be at least a logarithmic performance penalty for immutability, but again note the distinction here is that I am only proposing to incur that penalty when adding a new data type to the union which is the ValueType (aka element type) of the collection. So with my proposal we get extension in both directions (dimensions) of the Expression Problem, yet we don’t prematurely erase the union of data types and pay no penalty for not erasing them, until if ever we need to do dependency injection extension along the data type direction of the Expression Problem by actually adding a new data type to preexisting element type of a preexisting collection which we couldn’t otherwise do (this dependency injection) in the current way Rust, Haskell, and PureScript (typeclass languages) are. Of course subtyping+subclassing languages such as C++, C#, Scala, OCaml, etc… are hopeless anti-patterns.


#54

It is difficult to explain where it is unsound because afaics you have not specified all the details such a design will require, e.g. how can the compiler convey to the linker that adding a data type to a collection was dependent on a change to the collection’s element type at a particular scope. Essentially you have just converted the linker into a compiler, which was my point before. Afaics there simply isn’t any way to remove the invariant that the data types that were already in the collection have been erased from the compiler’s AST by the time you need to add the new trait to conjunction which is the element type. I am 99.9% confident when you work through the details, you will discover it is unsound and can’t be fixed.


#55

How many do you want because the list of use case examples is unbounded. :wink:

For example the web browser’s DOM has collections and they are typed according to the known element types, but if we want to make a typed interface with a typeclass language (i.e. not the subtyping+subclassing anti-pattern). So if we want to extend on top of the DOM our own data types, but be interoperable with preexisting code, we could not do it without my proposal. Of course currently the DOM is not modeled with a typeclass language structure, so there are no such preexisting code, but I am speaking about the current way to do it once I’m done replacing the web browser (my current project goal which is why I need a new programming language).

Note again that modeling the DOM with the subtyping+subclassing anti-pattern also does not allow for extension on both directions (dimensions) of the Expression Problem.


#56

Can we talk about some code? How about this: http://is.gd/st52cv

We have 3 modules, one defines circle, one square and one a shapes collection and processing function. We can put the circle in the shapes collection because it implements HasArea. We can easily extend with the new type Shape that is defined in a separate module by defining an implementation of HasArea, all without changing any of the code in modules a, b & c. Good so far.

To extend in the other direction, lets say we want to get the colour of every shape in the collection, we can obviously define new implementations of HasColour for the shapes that need it (Circle) in the same way we did for HasArea, the problem is we are required to modify module c to change the collection trait.

We could fix this by allowing ‘main’ to extend the trait bounds of Shape, as I suggested. This will require some form of whole program analysis to work out the size of the Box (but I think it will however you try and solve this).

Perhaps it is time to think about implementation, why do we have to do this? Its because there needs to be space reserved in the box for each dictionary at runtime, so that the methods can be looked up. This works by constant offset so the dictionary is just an array of memory addresses for an indirect jump. So we need to know how many jump tables to include in the Box and this number needs to be fixed at compile time. How would your solution be implemented?


#57

We need both typings of the above interface, because when the data type(s) are erased as an input to a function which requires a (possibly union of) trait interface(s), e.g. Appendable or Enumerable, then there is no way to apply the first HKT typed interface because it requires the compiler knows the data type(s), e.g. Vec. For example, where C:Enumerable<T> is a type or possibly a union of types (such as Vec<T> \/ List<T>) that implement the trait Enumerable<T> in my proposed hypothetical programming language:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> {
   fn f(prev: T, cur: T) {
      if prev == null { return cur; }
      else { return prev + cur; }
   }
   x.fold(null, f);
}

Tangentially (btw), I wish Rust would let me write the above in a more functional expression style:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> {
   x.fold(null, (prev: T, cur: T) -> if !prev: cur else prev + cur );
}

Or even more concise:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> =
   x.fold(null, (prev: T, cur: T) -> !prev ? cur : prev + cur )

But the possible encodings for the trait Add indicate the above code won’t type check in my proposed hypothetical programming language. One possible encoding won’t type check because T above has erased the data type(s) needed to compute Self:

trait Add {
   fn plus(&self, &Self) -> Self;
}

Tangentially note that afair a Self type can be modeled with generics in a language with subtyping lower bounds (e.g. Scala) even if Self is not a keyword:

trait Add[SELF <: Add[SELF]] {
   def +(x: SELF): SELF
}

And the other possible encoding won’t type check because T above hasn’t required the trait Coercion interface:

trait Add {
   fn plus<A:Coercion<Self>>(&self, &A) -> Self;
}

But to write the trait bound on T we need higher-ranked typing (HRT) of A over a specific type parameter T as follows, not forall A over specific scopes(1)(2)(3):

fn foo<T, A, C>(x: C) where T:Add+Nullable+Coercion<A forall T>, C:Enumerable<T>

What the above would mean is that every data type of T must implement trait Coercion<A> for every value of A that corresponds to every data type in T. Well I am happy I was able to design above what is ostensibly the solution to this issue. Anybody know what this sort of HRT is canonically named and which canonically name type system does it inhabit? Or have I potentially (assuming it is sound) invented an entirely new type system variant?

Tangentially note that I am contemplating whether the above proposal for handling coercions via scoped traits rather than hard-coded in the language (as is afaik the case currently for Rust) is much more flexible and correct. Rust is afaik also not enabling implementing operators with traits?

  1. High Order Function with Type Parameter
  2. High Order Function with Type Parameter
  3. Does rust really need higher kinded types?
    (distinguishes HKT vs. HRT)

#58

Another pedantic Rust syntax pet peeve of mine:

use std::f64::consts;
impl HasArea for Circle {
    fn get_area(&self) -> f64 {
        self.radius * consts::PI * 2f64
    }
}

I have an LL(k) grammar under development which checks so far with k=2 (i.e. one token lookahead) using the SLK parser generator, which enables me to use the reuse the . member access operator also the scoping operator which IMHO is much more attracted but it does require that scoping and type names begin with an uppercase letter while instance and function names may not:

impl HasArea for Circle {
    fn get_area(&self) -> f64 {
        self.radius * Std.F64.Consts.PI * 2f64
    }
}

Or:

use Std.F64.Consts;
impl HasArea for Circle {
    fn get_area(&self) -> f64 {
        self.radius * Consts.PI * 2f64
    }
}

To my eye, it is redundant to dictate a distinction between access and scoping because the uppercase restriction in my grammar makes it obvious and also naming is more consistent. And distinguishing access to methods which don’t input &self is also obvious because type names begin with an uppercase letter and instance names do not. For me, language design is an artform and I want it to be elegant like a painting. Einstein also wanted his theories to be elegant, which is one reason he resisted quantum theory (and I think he will be proven correct).

The grammar I am developing employs Python/Haskell-like indenting instead of the more noisy, inconsistent readability (and vertical screen space wasteful) brace-delimited blocks, which also enable eliding the semicolons without incurring ambiguities.


Remove braces from `if-else` where a single-line ternary could otherwise be employed?
#59

Inline anonymous function syntax does exist:

let x = vec![1,2,3];
println!("{:?}", x.iter().fold(0, |sum, val| sum + val));
println!("{:?}", x.iter().fold(0, |sum, val| {
    // use {...} for multi-statement functions
    sum + val
}));

As for the ternary operator, it was removed in its infancy https://github.com/rust-lang/rust/issues/1698 with a recently closed attempt to resurrect it https://github.com/rust-lang/rfcs/issues/1362. I do favor its brevity, however.


#60

Thanks. I was aware of that. I was also referring to needing to put return twice, having to use braces instead of =.

My personal 2 cents opinion is that eliding the -> in Rust’s lambdas is confusing because it no longer visually states it is a function with the result on the right side of the arrow. Vertical bar means bitwise-or in most C/C++/Javascript/Java languages. The other use of | I’ve seen is in grammars to specify optional choices, so I think it might be acceptable to use | for writing union (disjunction) types instead of \/ or .


#61

How will you determine how many dictionary slots to have in your ‘box’ type? I still think this reduces to the same problem as my solution using type classes, and hence I would rather not introduce union types when they don’t seem necessary. I think we will have to explore the implementation to see if these two are equivalent, as I think. I already have an implementation plan. The module interface files generated by the compiler already have to include the trait bounds for traits, so we have all the information when compiling to compose bound ‘extensions’ at link time, it would just mean deferring the memory layout of boxed traits until all modules have been processed, and the final set of bounds accumulated.

Regarding the tangent, I think coercions are a bad thing, and I favour only explicit conversions (this also helps type inference, and results in less type annotations).


#62

I am not bothered by ‘:’ in type definitions. I tend to favour using lower-case for types and upper case for type variables like in a Prolog program, but I think this is down to personal taste. I don’t like significant whitespace as it makes cutting and pasting code very difficult, and you cannot use code auto-layout tools to fix someone elses code to meet your own preferred styling.

However, I suspect any ‘logical’ arguments here are post-rationalisation. The great thing about creating your own language is you can do everything exactly as you want. The bad thing is that it splits the user base of existing languages. This is what lead to the creation of Haskell, originally there were many competing individual lazy-pure-functional languages, so all the different authors got together in a committee and created Haskell. I suspect that nobody is completely happy with the design of Haskell, but everyone can live with the compromises, and it results in a bigger community, and a common platform for further research.

I also thing imperative code is easier to read and more understandable than the functional style. This is due to the objects having stable types, and visible indexing. The functional style of chaining maps and folds results in invisible intermediate types that make reading someone else’s code much harder. I call it “write only code” :slight_smile: As debugging is harder than writing code, and as code will be in maintenance for longer than it is in development, I think we should optimise code for readability and understandability. If you use all of your intelligence to write the code, then because debugging is harder, you will be by definition not clever enough to debug it. Complexity is the problem, and the solution is that we should strive for simplicity. Here I mean simplicity of concept, not simplicity of grammar or less typing.


#63

If you use lowercase for types, then you can’t distinguish then from instance (aka variable) names except by context. In C++/Java/Scala, type parameters (aka variables) are usually all uppercase and type names are Camel case with first letter uppercase. Variable (and function/method) names are usually lower case or first letter lower case with Camel case for function names.

C was a very good language because it had a minimum of syntax. Symbol soup looks noisy. The dot operator can be thought of narrowing/accessing the type, so it makes sense we should use the dot operator for both type member access (since those members are part of the structural type even if we aren’t employing structural typing else where in the language) and Type scoping is narrowing/accessing the nominal type.

Not necessarily. Did Python split any existing user base? Ruby’s? Rather I think Python identified a niche for readable consistency.

If someone identifies a niche and hits that niche on the target, they can succeed.

So the key is identifying Rust’s niche and focus on it.

Haskell’s niche is apparently people who want to code like math. It isn’t just that it is a functional language, but the lazy evaluation which enables expressing infinite series and other math.

I agree about “write only” code and as I wrote in another post today, this is likely because of Haskell’s global type inference (which btw prevents a first-class union and thus Haskell can never implement my proposal in this thread).

Btw, afaics functional programming can be readable by annotating the code with the types. Afaics, readability can be higher in a functional style, if done well.

P.S. It is Sunday and I have an obligation to be at a beach party, so further replies will be delayed.


Does rust really need higher kinded types?
#64

I don’t disagree with this, but I think I want to distinguish between type variables and types too. Most of my coding does not involve concrete types, so having type variables distinguishable seems better to me.

I think they both took users away from Perl :slight_smile: There are only so many programmers so new languages must take away from somewhere.

Well more different types in an expression is worse, and having to add type annotations is worse. It also doesn’t solve the indexing problem, and that is having variables for index lookup is much clearer than maps and folds especially when you have more than one index in play. See my permutations example in JavaScript I posted.


Does rust really need higher kinded types?
#65

So to clarify, you now know you don’t have to put return even once as in my second example? In case that wasn’t as clear, omitting the final semicolon on any block returns the value of the last statement. E.g. equivalent to your first version:

fn foo<T, C>(x: C) where T:Add+Nullable, C:Enumerable<T> {
   fn f(prev: T, cur: T) {
      if prev == null { cur } else { prev + cur }
   }
   x.fold(null, f);
}

Omitting return when it’s not needed is the preferred convention, I believe.

Rust’s lambda syntax, I believe, was inspired by Ruby’s block syntax, though it’s slightly different from it (in Ruby, the |…| goes inside the braces, whereas in Rust it’s outside.) Rust used to emulate Ruby’s block-passing syntax but by using the do keyword. This syntax was removed in https://github.com/rust-lang/rust/issues/10815


#66

Sorry to be pedantic, and yes I was aware of that special rule and frankly I think that is silly that only don’t need to put a semicolon on the last block that is the result expression. I’d prefer never put semicolons and the result expression would also be the value of the last expression in the block, or () if the last statement in the block is not an expression, e.g. a while block.

I knew that return was optional on the last line, but I didn’t know if Rust was capable of recognizing if-else as an expression (and I actually thought of checking it since Scala can, but…). I don’t make such assumptions (and hadn’t bothered to try it because I like general, consistent rules so I don’t need to try it in order to not be surprised), because of the special case rules instead of a general one that everything is a expression (and not statement which I think one of the reasons Scala didn’t include while) or my generalized variant (to allow for while statements).

The 2 cents issue for me with lambdas is consistency. Since -> is used to indicate result type on functions, it would be more consistent IMO if it is also used to indicate the result expression for lambdas (anonymous functions). Here I must also criticize Scala for using : for function result types (consistent with function arguments yet…) and => for lambdas (which is non-standard from the -> of Haskel et al, and appears to be an attempt at consistency with = for function blocks that are expressions instead of brace-delimited blocks).

I am just arguing that lack of special cases, makes memorizing and learning a language less surprising.


#67

Oh, I didn’t find it pedantic at all, and I’m glad you were aware. Your specific before and after examples left me guessing as to the precise points of confusion, so I tried to make sure I addressed the ones I guessed.

Regarding syntactical changes — you were comparing the lambda syntax to other languages; I only meant to provide you context for this choice (most likely Ruby), not argue for or against any specific syntax.

That being said, a few newcomers have previously argued for changes to syntax post rust stabilization and the invariable answer, understandably, from the development team has been that the time for that bikeshedding is long past. Rust has a lot of important things in its roadmap to develop toward and backward-incompatible syntactical changes on a stabilized language would be a tremendous burden to the community.

In any case, if you feel the need to argue for it further, perhaps a new thread would be more appropriate.


#68

Thanks also.

Note I am just expressing (my “2 cents”) what I would do differently and what the motivations might be for me to maybe create my own programming language, rather than adopt Rust for my current targeted need:

Note also, that I am interested in identifying Rust’s target focus, so I might (and others do apparently) find a use case for Rust:


#69

I believe we are in agreement?


#70

Example, the following I see the pattern :: more than I see the names (which can be alleviated somewhat with syntax coloring and setting the :: to display in low contrast to the background):

I’d prefer to write that as follows where the . almost disappears:

  • Vec.iter() gives a Std.Slice.Iter
  • Vec.iter_mut() gives a Std.Slice.IterMut
  • Vec.into_iter() gives a Std.Vec.IntoIter

Two dots : is less (½ as?) distracting than four ::, but still IMO more (twice as?) distracting as one .

Perhaps different people visualize code differently. I seem to have a strong visual pattern matching which is difficult for me to turn off, thus the :: is akin to a lawn mower outside while I am trying to listen to music. I don’t seem to be autistic, but perhaps that is a slightly autistic type of skill/handicap although I wonder if others have the same distraction. Note it is not incredibly distracting and I am being perfectionist/pedantic with this comment. What happens is I need to make a conscious effort to read it versus just having it sort of instantly absorbed (wherein I am reading it but it is probably like driving where we do it unconsciously because of muscle memory).

But note I don’t prefer eliding punctuation which makes the code less readable, e.g. arguably Haskell’s elision of parenthesis around function arguments and spaces instead of comma-delimited list of arguments. Haskell is requiring me to lookup or memorize the function declarations so I know how many arguments each function consumes.

Edit: perhaps it is more distracting because combined with doubling a colon, a single colon has a normal meaning in English, math, and computing such that it shouldn’t be used as a delimiter, and especially given the delimited names can be as short as 2 or 3 characters in length. The valid use of a colon in English and computer languages (other than functional programming Cons) is to introduce a part of a sentence that is a logical continuation of the part before it. And for math, predominately to mean “such that”. Thus it doesn’t make sense to have colon-delimited list, since there should only be one part before and aft. Whereas, historically the single dot period has been a delimiter in computing (not just computer languages).


#71

I agree with much of the logic against including the ternary operator, but only if if-else isn’t forcing that noisy brace-delimited syntax:

if !prev { cur } else { prev + cur }

I’d prefer:

if !prev: cur else prev + cur

Or even more clear:

if !prev then cur else prev + cur

Than:

!prev ? cur : prev + cur

But remove syntax coloring and assume we are using a plain text editor:

if !prev { cur } else { prev + cur } <— this style isn’t available in a Haskell/Python block indenting instead of brace-delimited

if !prev: cur else prev + cur

if !prev then cur else prev + cur

Then the English text variants render low contrast confusing the keywords and the adjacent juxtaposed fragments of code, thus we either need the brace-delimited or the operator delimited ternary:

!prev ? cur : prev + cur

Edit: I added a comment at the issue tracker linking to this post.

Edit#2: I’m thinking the best would be to force only ternary on single or double line if and if-else:

!prev ? cur
!prev ? cur : prev + cur

!prev ? cur
      : prev + cur

But I’m not sure if I prefer that compared to:

if !prev: cur
if !prev: cur else prev + cur
if !prev: cur
else prev + cur

Or:

if (!prev) cur
if (!prev) cur else prev + cur
if (!prev) cur
else prev + cur

Or:

if !prev then cur
if !prev then cur else prev + cur
if !prev then cur
else prev + cur

Then for multi-line blocks either brace-delimited or my preferred indented delimited with keywords:

if !prev {
   cur
}
else {
   prev + cur
}
if !prev
   cur
else
   prev + cur

Remove braces from `if-else` where a single-line ternary could otherwise be employed?