keean/zenscript

Subtyping

Opened this issue · 1023 comments

Union types express a subtyping relationship, but I am unclear as to whether typeclasses (i.e. Rust's traits) do?

If a trait B extends another trait A and B reuses the implementations of A, can we assign a trait object that has a bound B to a trait object that has a bound A?

Seems the answer based on prior discussion is yes. But that is a subtyping relationship, which means we would need to deal with covariance on type parameters both when they are trait objects and when they are unions. Correct?

Prior discussion:
#6 (comment)
#1 (comment)
#2 (comment)
#1 (comment)

@keean wrote:

It will be interesting to see if we still need-trait objects

They are the only way (other than subclassing) I can see to get variance in the polymorphism, i.e. we can assign new types into the trait object (Rust's name for an "existential type" with a typeclass bound) without breaking the invariance of the Array.

The advantage of trait objects compared to subclasses, is that types that can be inserted into a trait object don't have to conform to a pre-existing subclass hierarchy, i.e. it solves some aspect of the Expression Problem¹ but doesn't solve the scenario that unions solve. I am thinking we can view trait objects and unions as complementary in functionality offering different trade-offs on the multi-dimensional design space.

  1. Specifically when we use trait objects for polymorphism, we can add new operations to existing data types (i.e. the trait bound of the trait object), and we can add new data types to existing operations (i.e. implement data types for existing traits), but we can not add new operations to an existing trait object (because the data types have been erased from the static compile-time knowledge of the trait bound type of the reference when assigned to the trait object). And that is the limitation of trait objects that my solution addresses with unions and delayed binding of the union to a trait bound at the function call site. OTOH, unlike unions employed in my solution, trait objects do not require invariant containers in order to add heterogeneous data types to the container because the type of the trait object is invariant. An invariant container is for example a List. Trait objects would also work with an Array which is not an invariant data structure. Apology I am not employing academic terminology such as kinds, rank, F-bounded, etc.. Someone else can do a better job of translating my solution into that terminology.
keean commented

@shelby3 I am thinking along the same lines. As far as I understand it there is no subtyping. With type classes when we say trait B extends trait A we mean that any implementation of trait B for a type requires there to also be an implementation of trait A, but that is it.

This applies to trait-bounds (type constraints) and trait-objects. Trait-objects are not really a valid type-system concept, these are more correctly called existential types. In haskell if we have an existential data type like this:

data Singleton = forall a. (TraitB a) => Singleton a

The forall a introduces a new scoped type variable, and because its scope is the container, we cannot ever from outside the container know the type of a, but it is constrained to implement trait B and that implies it also must implement trait A. This means we can call any function from the interfaces trait A and trait B on the contents of the container. An alternative way of writing the above (not valid Haskell) is:

data Singleton = Singleton (exists a . (TraitB a) => a)

which is where the term 'existential type' comes from. Generally forall corresponds to the logical concept of any (but may be none), and exists corresponds to the logical concept of some (but at least one).

@keean wrote:

I didn't see any mention of type-classes when I looked at Ceylon. It also looks like Ceylon provides classical object inheritance and subtyping, which are both things ZenScript aims to avoid to keep the language small and simple.

Don't we need to be careful about differentiating between subclassing versus subtyping?

ZenScript will have subtyping because it will offer structural unions. An Int is a subtype of an Int|String. ZenScript will not have subclassing.

Ceylon has anonymous structural unions, but it doesn't have typeclasses. <truthful hyperbole>Also it has that Jurassic, ossifying, rigor mortise, anti-pattern subclassing paradigm, which will infect with that said virus any program written with Ceylon. Ditto every statically typed language that offers subclassing including Java and Scala (because programmers won't be disciplined enough to not use the subclassing virus in their design patterns).</truthful hyperbole> 😜


Update:

@naasking wrote:

Ceylon models Java's subtyping via first-class union and intersection types. It's not at all a classical subtyping model.

He replied as quoted above after I wrote the above. Please don't use the word 'subtyping' where you really mean 'subclassing'.

@naasking is apparently unaware of what the programmer can't accomplish with unions and intersections in terms of the Expression Problem if the language does not have the delayed binding of implementation to interface that typeclasses provide.

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List<A> to track the type of the elements stored in the list.

Thus we must use a data type to express this:

data List a = Nil | Cons a (List a)

Or add member property names if don't want to be forced to destructure with pattern matching:

data List a = Nil | Cons { head :: a, tail :: (List a) }

In ZenScript perhaps:

data List<A> = Nil | Cons(A, List<A>)

Or perhaps:

interface List<A>
singleton Nil implements List<Never>  // Never is¹ the bottom ⊥ type
sealed class Cons<A>(head: A, tail: List<A>) implements List<A>

Note afaik, Haskell's data type expresses a 'hasA' not an 'isA' subclassing relationship between Cons and List and a can not be heterogeneous because Haskel doesn't have a first-class union type (their global inference doesn't allow for it). If we use this syntax in ZenScript then when instantiating a Cons("string", Cons(1, Nil)) then the type will be inferred List<Number|String>. And if we instantiate first a let x = Cons(1, Nil) its type will be inferred List<Number>. And if we then instantiate Cons("string", x) then the type will be inferred List<Number|String>.

But note even though the above is subtyping, there are no virtual methods on data types, thus no virtual inheritance and no subclassing. Of course any function can input a data type, so that is synonymous with a method, except it isn't virtual dispatch.


  1. Since an eager, CBV language makes Bottom an effect, I have argued the name of the Bottom type should be Never and not Nothing (which appears to be the choice TypeScript made). Bottom is for example the type of functions that never return (terminate). Whereas, in a lazy, CBN language such as Haskell, Bottom becomes a value, so I argue it should be named Nothing and not Never.
keean commented

Or, with types

data Nil = Nil
data Cons<A, B> = Cons<A, B>
trait List A
impl List Nil
impl<A, B : List<A>> List Cons<A, B>
keean commented

As a single type with multiple constructors (tagged union):

data List<A> = Nil | Cons<A, List<A>>

I'm still not sure about which keywords we should be using for trait, impl and data.

@keean wrote:

data Cons<A, B> = Cons<A, B>

data List<A> = Nil | Cons<A, List<A>>

That seems incorrect. It doesn't tell me how to construct an instance of Cons. Mine was correct:

data List<A> = Nil | Cons(A, List<A>)

And we can add member property names if we don't want to be forced to employ pattern matching to destructure:

data List<A> = Nil | Cons(head: A, tail: List<A>)

@keean wrote:

data Nil = Nil
data Cons<A, B> = Cons<A, B>
trait List A
impl List Nil
impl<A, B : List<A>> List Cons<A, B>

That seems incorrect. I think it should be instead:

data Nil = Nil
data Cons<A> = Cons(A, List<A>)  // Edit: `List` is a typo and should be `Cons` per subsequent discussion
trait List
impl List Nil
impl<A> List Cons<A>

Remember the trait List should know nothing about A if its methods don't specialize on A.

@shelby3 wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List<A> to track the type of the elements stored in the list.

Note I'd prefer to write that:

pluggable List
implement Nil for List
implement Cons<A> for List   // no need to write the type parameters twice per my
                             // proposal¹ that all UPPERCASE names are type parameters

Or much better:

pluggable List
Nil implements List
Cons<A> implements List

I think the last is best because it remains very similar to Java, yet we change the meaning of what is being implemented from interface ('isA' relationship) to pluggable ('hasA' relationship).

Thinking about a typeclass as a pluggable interface seems very intuitive. We can't use Rust's trait because trait has a different meaning in several other languages.

Q: "What is a pluggable API?"
A: "It means that you can replace the implementation."

Remember we both decided that clarity trumps brevity (e.g. implement instead of impl), especially for syntax which is not expression-level (because such declarations won't appear often because most frequently appear in source code are expressions).

  1. #6 (comment)
keean commented

Not quite :-) some things to discuss. You have given value constructors round brackets, that seems okay to me.

data List<A> = Nil | Cons(head: A, tail: List<A>)

Normally the arguments to cons are positional like function arguments, and deconstructed by pattern matching. You would use record syntax to name them, so either of the following:

data List<A> = Nil | Cons(A, List<A>)
data List<A> = Nil | Cons {head: A, tail: List<A>}

We don't have to stick to that but it's how I was thinking.

This has more problems:

data Nil = Nil
data Cons<A> = Cons(A, List<A>) // list is not a type
trait List
impl List Nil
impl<A> List Cons<A> // cons needs two type parameters

So correcting this:

data Nil = Nil
data Cons<A, B> = Cons(A, B) // constraints on data bad.
trait List<A>
impl List<A> for Nil
impl<B : List<A>> List<A> Cons<A, B>

Note this is still Rust syntax that gives special treatment to the first type class parameter, and I am not sure that is best, but let's have a different topic for that when we have agreed this.

@keean wrote:

You have given value constructors round brackets, that seems okay to me.

Yeah to differentiate them from type constructors, and because value constructors in the Java-like languages use round brackets (aka parenthetical grouping).

Normally the arguments to Cons are positional like function arguments, and deconstructed by pattern matching. You would use record syntax to name them, so either of the following:

You are repeating what I wrote. I even linked to the Haskell record syntax upthread.

However the following is not naming the members of Cons (rather is only providing their positions and types), and can only be destructured with pattern matching as I already wrote in my prior comment:

data List<A> = Nil | Cons(A, List<A>)

And to stick with the Java-like syntax (and not mixing in Haskell syntax), I would prefer the following which I think will be much more clear to mainstream programmers coming from popular programming languages:

data List<A> = Nil | Cons(head: A, tail: List<A>)

The following is mixing a JavaScript unnamed Object with some new concept of a tag of Cons, which has no analogous concept to people using JavaScript or OOP languages (and our guiding principle is not to introduce unnecessary syntax, i.e. the new concept of a { ... } where we don't need to):

data List<A> = Nil | Cons {head: A, tail: List<A>}

@keean wrote:

This has more problems:

data Nil = Nil
data Cons<A> = Cons(A, List<A>) // list is not a type
trait List
impl List Nil
impl<A> List Cons<A> // cons needs two type parameters

I had a typo and Cons does not need two type parameters (two would mess up other things):

data Nil = Nil
data Cons<A> = Cons(A, Cons<A>)
trait List
impl List Nil
impl<A> List Cons<A>

The type of the type parameter A in Cons<A> will be subsumed to the GLB of the union of two types used to construct a Cons. I had already explained that as follows.

@shelby3 wrote:

If we use this syntax in ZenScript then when instantiating a Cons("string", Cons(1, Nil)) then the type will be inferred List<Number|String>. And if we instantiate first a let x = Cons(1, Nil) its type will be inferred List<Number>. And if we then instantiate Cons("string", x) then the type will be inferred List<Number|String>.

Note in the above quoted text, I was referring to a data type List not a typeclass List. Refer to that quoted comment for the declaration I employed there (which differs from the List in this comment).


@keean wrote:

So correcting this:

That is still incorrect. You have a type parameter A on typeclass List which I already explained (and you even agreed!) is incorrect as follows.

@shelby3 wrote:

Remember the trait List should know nothing about A if its methods don't specialize on A.

@shelby3 wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List<A> to track the type of the elements stored in the list.

Follow the link in the above quote to see where you had agreed. In fact, you were the one who explained the issue to me. And now it seems you forget what you explained to me.

Actually the above is revealing a deeper issue to me about higher-kinds which I had realized when I woke up this morning. I am preparing to write about that.

keean commented

The {} have the same use in 'C' for structs, C++ and Java for object definition so they are not new as such.

In C, C++ and Rust we would write:

struct Cons { 
    head : Int, // this syntax is different for C
}

Which we are writing:

data Cons = Cons {
    head : Int
}

Or

data Cons = Cons (
    head : Int
)

I am happy with either, providing the field names are optional, but I wanted to point out that the data statement can be viewed as an extension of struct and object definition.

keean commented

@shelby3 wrote

I had a typo and Cons does not need two type parameters (two would mess up other things):

Yes it does, what you wrote cannot ever end in a Nil.

keean commented

@shelby3 this version is correct in Rust syntax:

data Nil = Nil
data Cons<A, B> = Cons(A, B) // constraints on data bad.
trait List<A>
impl List<A> for Nil
impl<B : List<A>> List<A> Cons<A, B>

Cons needs a second type parameter because B can either be another Cons or a Nil which are different types.

The trait List needs a type parameter for the type that in the list, which is not the same as the type which is a member of the class (that is Nil or Cons<A, B>)

When we add Cons to the List type class we need to constrain B to be in the List type class so that you cannot put any random type as the second Cons parameter.

keean commented

@shelby3 wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List to track the type of the elements stored in the list

I guess I was wrong, a multi parameter type class can represent an arbitrary relation on types. I must have been sleepy when I agreed :-)

@keean wrote:

Yes it does, what you wrote cannot ever end in a Nil.

Yup. Your idea was fundamentally flawed. We can't express a generic List type as typeclass interface. You'd have to implement the List for every possible data type you can put into a List which is the antithesis of a generic List. I was responding that yours was incorrect and I was demonstrating that if I try to write it correctly, I can't.

That is why I had written (before you commented with the erroneous idea) the correct way to define a generic List as follows.

@shelby3 wrote:

In ZenScript perhaps:

data List<A> = Nil | Cons(A, List<A>)

Or perhaps:

interface List<A>
singleton Nil implements List<Never>  // Never is¹ the bottom ⊥ type
sealed class Cons<A>(head: A, tail: List<A>) implements List<A>

@keean wrote:

A typeclass's type parameters can't express genericity that doesn't specialize on implementation. Thus we can't for example parametrize a typeclass trait List to track the type of the elements stored in the list.

I guess I was wrong, a multi parameter type class can represent an arbitrary relation on types. I must have been sleepy when I agreed :-)

You weren't wrong the first time. It makes no sense to specialize the List on every data type we can add to the List.

keean commented

Sorry you are wrong here. You can express a list as a type class and I have done it in Haskell and Rust. The HList paper I co-wrote with Oleg Kiselyov makes extensive use of this.

@keean wrote:

In C, C++ and Rust we would write:

struct Cons { 
    head : Int, // this syntax is different for C
}

Seems I recall that in the early days of C, it was only possible to use typedef to give a name to a struct.

I forget about struct because I rarely code in C any more (and C++ I haven't touched since I stopped coding CoolPage in 2002). And when I think about struct from C, I don't think in terms of a language with objects and high-order typing concepts, since Java, Scala don't have struct. So I guess that is why I didn't relate it. And afaik, { ... } in JavaScript is not tagged with a name, e.g. Cons.

@keean wrote:

Sorry you are wrong here. You can express a list as a type class and I have done it in Haskell and Rust. The HList paper I co-wrote with Oleg Kiselyov makes extensive use of this.

I will need to review this, so I can comment meaningfully. Where may I read the most succinct example which shows how I won't have to specialize the typeclass list for every data type I want to put into the list?

I presume a lot of HList boilerplate again?

Edit: I suppose the point I am making is that we are trying to eliminate boilerplate for ZenScript. If you are expecting mainstream programmers to use HList, I doubt it. But I need to review the examples before I can comment not just from guessing. The link above is I think probably instructive about this.

keean commented

In Haskell this:

data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)

We can look at Peano numbers as another example:

data Z = Z
data S x = S x
class Nat n
instance Nat Z
instance (Nat a) => Nat (S a)

Note the main difference between Haskell type classes an rust traits syntactically is a rust trait has a concept of 'self' but Haskell does not. You can liken this to function syntax:

x.f(y) // object special syntax like Rust
f(x, y) // all parameters equal, better for multiple dispatch

Likewise with type classes rust makes the first type parameter special, so the Peano numbers above become:

struct Z {}
struct<A> S (A) // Rust tuple syntax
trait Nat // note no type parameter
impl Nat for Z
impl<A : Nat> Nat for S<A>

@shelby wrote:

You weren't wrong the first time. It makes no sense to specialize the List on every data type we can add to the List.

.

I presume a lot of HList boilerplate again?

Edit: I suppose the point I am making is that we are trying to eliminate boilerplate for ZenScript.

I expect you are taking what should be an orthogonal concept of a generic list and binding it to the data type in the list, and then using some boilerplate scaffolding to simulate genericity? This appears to be the basic theme of HList concepts as far as I can discern thus far (I may be wrong?), to sidestep a weakness in the type system and simulate type system in code with scaffolding?

I had started to sniff a problem yesterday. I was starting to realize we probably have an unsolved problem in the design incorporating first-class anonymous structural unions.

@keean wrote:

In Haskell this:

data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)

We need to remember that Haskell does not allow heterogeneous unions, because I've read that at least it would break the global inference of Haskell.

Thus afaik in the above b will also be the same as a | Nothing which is just a where Nothing is at the top of all types (because Haskell's call-by-name type system is inverted so we use Bottom type where we would use Top type in a call-by-value language1).

So afaics, that is not specializing the List typeclass for every data type a that can be put into the List because only one homogeneous type can be put into any list object due to Haskell's type system restrictions (lack of a first-class anonymous structural union type). There will ever be only two implementations (aka instances) of List: Nil and List (Cons a b) where b is (List a) | Nil and Nil is List Nothing.

I presume the same for Rust, but bottom type instead of top.

But for ZenScript we are proposing to support heterogeneous lists, so I am trying figure out now what changes and what the problems and challenges are. I am thinking we will need higher-kinded types and there may be other problems.

I suppose you are saying we can simulate heterogeneous lists with HList concepts, but the point of the first-class union was to eliminate that boilerplate and make everything more extensible as I had attempted to explain/discuss at the Rust forum:

I presume a lot of HList boilerplate again?

It is possible you didn't realize how extensively I wanted to use the first-class unions. Perhaps you were thinking we'd be using HList concepts instead?

We have design quagmire now. I am trying to get my mind wrapped around it. I am suspecting we have failure now in my concept, but I need to confirm by understanding this design quagmire fully.


  1. Something I published at the now defunct copute.com in 2011 when I was teaching myself some type theory (can still be found on archive.org):

    Inductive and coinductive types are categorical duals (if they produce the same objects in reversed partial-order), because inductive and coinductive category morphism functions have reversed directions[8]. The initial fixedpoint must be the least in the partial-order, thus inductive types have objects which are the output of a unique morphism function (i.e. the algebra recursively) that inputs the initial fixedpoint. Dually, the final object must be the greatest in the partial-order, thus the coinductive types have objects which are the side-effect "output" of a unique morphism function (i.e. the coalgebra recursively), which terminates with the final object when an object of the type is destructed.

    Since Monad and Comonad are categorical duals, they compose on outputs or inputs respectively.

    [8] Declarative Continuations and Categorical Duality, Filinski, section 1.3.1 Basic definitions.

Even more edits to my prior comment. I am suspecting potential failure of my design concept. :hurtrealbad: 😭

@keean wrote:

We can look at Peano numbers as another example:

Haskell doesn't have any inductive types, thus it doesn't have (the type of) Peano numbers.

We have to be careful when using Haskell's coinductive call-by-name type system where laziness-by-default makes non-termination a value, as a model for an inductive type system with eager evaluation strategy by default that adds first-class unions. Many aspects appear to change.

keean commented

A couple of things.

First I think we should support multi-parameter type classes (with type-families, aka associated types) in full. People do not have to use the full power of this, but I don't want a false ceiling to the abstractions we can build.

Haskell has iso-recursive types (not equi-recursive) so it does have a kind of inductive type, thus Peano numbers work in Haskell :-) I can go on to define addition, subtraction, full arithmetic in the type system. Using polymorphic recursion you can even convert a value to a type for faked dependent types, but I don't think we should support this... That why we are using parametric types not universally quantified types.

So in our system those Haskell type Peano numbers have to be statically determined (effectively known at compile time). They cannot support runtime polymorphism without combining with existential types.

We are discussing different ways to do things, first-class union types give us:

type List<a> // a patial declaration which we need to tie the knot
data Nil = Nil
data Cons<a> = Cons(head: a, tail: List<a>)
type List<a> = Nil | Cons<a> // tie the knot, note the RHS are type in a 'type' declaration
keean commented

Regarding lazyness, nothing above changes with regard to the type system, all the Haskell types can be annotated with strictness annotations to make them strict. The type system has to be evaluated at compile time. Non termination of typing means the program won't compile, it has nothing to do with lazyness at runtime. All the examples I have given work fine in Rust which is eager.

@keean wrote:

all the Haskell types can be annotated with strictness annotations to make them strict.

Stictness annotations don't remove Bottom populated in the type because it is a fundamental fact that non-termination is a value in a lazy language. Bottom is a value (and populated in every type) in Haskell. Whereas, in an inductive language, Bottom is never instantiated.

keean commented

@shelby3 wrote:

Stictness annotations don't remove bottom populated in the type because it is a fundamental fact that non-termination is a value in a lazy language.

It doesn't matter. The types are valid in both lazy and eager languages. In Haskell every type contains 'bottom', and in Rust and eager languages they do not. Everything else is the same.

The other point is I think it is worth avoiding objects all together, the recursive typing from 'Self' makes things really complicated to understand (see Scala). I think if you have records ('C' structs') and type classes, you wont miss objects at all.

Edit: Of course if we decide to have objects that's fine, but I would like to make the case for not having them, which I think works better with multi-parameter dispatch. Modules take over the namespacing part of objects. New issue here: #9

@keean wrote:

It doesn't matter.

Haskell doesn't have disjunctive coproducts aka categorical sums.

I wrote at my copute.com circa 2011:

Trade-offs

CBV and CBN are categorical duels[10] (see also), because they have reversed evaluation order, i.e. whether the outer or inner functions respectively are evaluated first. Imagine an upside-down tree, then CBV evaluates from function tree branch tips up the branch hierarchy to the top-level function trunk; whereas, CBN evaluates from the trunk down to the branch tips. CBV doesn't have conjunctive products ("and", a/k/a categorical "products") and CBN doesn't have disjunctive coproducts ("or", a/k/a categorical "sums")[9].

↑ Non-termination

At compile-time, functions can't be guaranteed to terminate.

↑ Eager
With CBV but not CBN, for the conjunction of Head "and" Tail, if either Head or Tail doesn't terminate, then respectively either List( Head(), Tail() ).tail == Tail() or List( Head(), Tail() ).head == Head() is not true because the left-side doesn't, and right-side does, terminate.

Whereas, with CBN both sides terminate. Thus CBV is too eager with conjunctive products, and non-terminates (including runtime exceptions) in those cases where it isn't necessary.

↑ Lazy
With CBN but not CBV, for the disjunction of 1 "or" 2, if f doesn't terminate, then List( f ? 1 : 2, 3 ).tail == (f ? List( 1, 3 ) : List( 2, 3 )).tail is not true because the left-side does, and right-side doesn't, terminate.

Whereas, with CBV both sides non-terminate so the equality test is never reached. Thus CBN is too lazy with disjunctive coproducts, and in those cases non-terminates (including runtime exceptions) after doing more work than CPV would have.

[9] Declarative Continuations and Categorical Duality, Filinski, sections 2.2.1 Products and coproducts, 2.2.2 Terminal and initial objects, 2.5.2 CBV with lazy products, and 2.5.3 CBN with eager coproducts.

[10] Declarative Continuations and Categorical Duality, Filinski, sections 2.5.4 A comparison of CBV and CBN, and 3.6.1 CBV and CBN in the SCL.

keean commented

So we were supposed to be discussing subtyping here. If we don't stick to topic it will make it hard to find discussions in the future. I think we can discuss type-class syntax elsewhere.

I definitely want eager evaluation. Pervasive lazyness is a big pessimisation. Some kind of co-routines or yield would provide stream like functionality.

What are our conclusions about subtyping?

  • We will have 'set' subtyping of union/intersection/difference only
  • Type class inheritance is one of combining constraints, nothing to do with subtypes.

How we will implement a heterogeneous list that has an element type of a first-class anonymous structural union?

If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.

The answer seems to impact how I will think about how subtyping interacts with our unions and typeclasses.

keean commented
data List = Nil() | Cons(Int | String, List)

However, this is using | to union types and create sum types, which I think is confusing. Some alternatives with distinction between the two:

data List = Nil() | Cons(Int \/ String, List)

or

data List = Nil() + Cons(Int | String, List) 

Of course if its parametric on element:

data List<a> = Nil() + Cons(a, List<a>)

and that can be instantiated with a union type.

@keean wrote:

data List = Nil() | Cons(Int | String, List)

Not generic. Breaks other externalities dealing with extension.

data List<a> = Nil() + Cons(a, List<a>)

Explain how to use this per other requirements I stated.

@shelby3 wrote:

If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.

keean commented

Another possibility:

data List = Nil() | Cons([Int, String], List)

@keean wrote

data List = Nil() | Cons([Int, String], List)

As you requested not to do, are we going to continue mixing off-topic choice-of-preferred-syntax discussions in a conceptual Issue #8 about interaction of type system features?

keean commented

As you requested not to do, are we going to continue mixing syntax discussions in a conceptual Issue about interaction of type system features?

I think its better to try and stick to subtyping here... as long as we both understand the notation we are using.

In some regards we can imagine some boiler plate like this:

data UnionIntFloatString = I(Int) | F(Float) | S(String) 

Really we just want to allow the type system to infer the above for various unions automatically.
'I', 'F' and 'S' are the runtime type tags that you can case match on.

@shelby3 wrote:

data List<a> = Nil() + Cons(a, List<a>)

Explain how to use this per other requirements I stated.

@shelby3 wrote:

If I want to do some operations on this and not erase the union type, how will this be written in code? While still retaining the solution to the Expression Problem.

Btw, I think the solution is going to require higher-kinds. I will open a new Issue on Higher-kinds when I return from jogging. I already composed some of the OP for that new Issue.

keean commented

No need for higher kinds yet.

data List<a> = Nil() + Cons(head: a, tail: List<a>) // using + for sum types

append_it = (list, x) =>
    Cons(head: if x then "ABC" else 123, tail: list)

print_it = (list) => 
    typematch list.head:
        String(s) -> print_string(s)
        Int(i) -> print_int(i)
    if list.tail /= Nil():
        print_it(list.tail)

Inferred types

append_it(list : List<String | Int>, x : Bool) : List<String | Int>
print_it(list : List<String | Int>)

Inferring the type of Nil() is tricky, but not impossible, it effectively is typed as List<a> where a is a floating type variable (un-grounded). Later in the program a needs to get grounded at some point, as to have an un-grounded type variable in a program is an type-checking error.

Generic sort typeclass which works for any container type that implements it?

keean commented
quicksort<A>(c : A, lo : ValueType<A>, hi : ValueType<A>)
        where Cmp<ValueType<A>>, IndexedIterator<A> =>
    if lo < hi then
        p = partition(c, lo, hi)
        quicksort(c, lo, p)
        quicksort(c, p + 1, hi)

partition<A>(c : A, lo : ValueType<A>, hi : ValueType<A>)
        where Cmp<ValueType<A>>, IndexedIterator<A> =>
    pivot := c[lo]
    i = lo – 1
    j = hi + 1
    while true:
        do: 
            i := i + 1
        while c[i] < pivot

        do:
            j := j – 1
        while c[j] > pivot

        if i >= j :
            return j

        swap(c[i], c[j])

Note the associated types ValueType<A> I am not certain of the syntax, but a value type is like an 'output' type from a type-class which is defined by a particular instance.

The key thing is the comparison operators, which would be type-class operators. Something like:

trait Cmp<A>:
    `<`<A>(x : A, y : A) : Bool
    `>`<A>(x : A, y : A) : Bool

This would have to be defined for whatever the contents of the list were, so we would need some kind of instance like:

impl Cmp<Int | String>:
    ...

We need an implementation for Int | String as we need to define the relative ordering of both types in one dimension.

keean commented

As a follow up, "Elements of Programming" gives the following type for sort_n (page 207) which is a little better thought through than mine:

sort_n<I, R>(f : I, n : DistanceType<I>, r : R)
        where Mutable<I>, ForwardIterator<I>, Relation<R>, ValueType<I> == Domain<R> =>

Which allows the comparison function r to be passed into sort.

This requires a type-equality operator, which is in effect an infix type-class, and could be written TypeEq<ValueType<I>, Domain<R>> the definition is straightforward:

trait TypeEq<X, Y>
impl<X> TypeEq<X, X>

Ah, what you first wrote 2 hours ago (with A<B> which you've replaced) was very incomplete, i.e. you didn't define partition nor the < operator on the generic types. So I reloaded the page and seen you've gone off on a similar thought process that I've been going through. I will review your code now. I was off on a music tangent for past 2 hours. I spontaneously needed a morsel-sized respite from the compsci stuff.

Indeed it brings back to focus the discussions we had at the Rust forum about iterators and whether higher-kinds (or just a self type) are needed.

I don't see where swap(c[i], c[j]) is input? Also this is an invariant List (required by subsuming heterogeneous unions remember) so you can't implement a swap, because if it were possible to mutate it would break other references into various Cons in the single-linked-list.

Soon you will discover why I said I think we need higher-kinds. Hint: we need a factory.

keean commented

@shelby3 wrote:

I don't see where swap(c[i], c[j]) is input

swap is a generic function that swaps to values. In this case array lookup returns a reference, to swap is exchanging two values of whatever type the collection is. It is a top level function defined like this:

swap<A>(x : A, y : A) where Mutable<A> =>
    tmp = read(x)
    write(x, y)
    write(y, tmp)

Note it is defined on Mutable. In my code above just consider IndexedIterator extends Mutable for simplicity.

For a heterogeneous array the values would both be of type String | Int for example (every value in the array must have the same type, but that type can be a union).

There are no singly linked lists in my code, an IndexedIterator implies the values in the container are individually addressable by an index.

However it would not make any difference if it was a linked list, as the whole list is constrained by a single type bound:

data List<a> = Nil() + Cons(head: a, tail: List<a>)

This list is monomorphic, every element has type List<a> so there is no problem swapping values because they all have type A.

If you want to know whether any given value is Int or String you would have to typematch on the type.

@keean wrote:

For a heterogeneous array

Can't have subsuming hetergeneous unions on a variant data structure. I had mentioned thus numerous times in my explanation of my solution to the Expression Problem, but I hadn't emphasized the "subsumption" aspect until now (although I did mention it in passing and also during our dicussions on the Rust forum).

I presume you are forgetting that with a List we can add new types to the disjunction (aka union) at the head which are not in the tail of the list, which the array data structure can't allow (we can't mutate the type of an instance of an array, but we can mutate the type of a new Cons head which I had explained to you in some comment over the past days). I have all the design concepts of my union and Expression Problem solution loaded up in my head, so I aware of all these factors.

I raised the challenge of a generic implementation that by implication will work on invariant lists (given our unions would require them), which afaics your code above does not accomplish.

I am thinking we will need a factory to accomplish it, thus higher-kinds for the genericity. I am thinking of a Monoid typeclass.

keean commented

Well, you can have a heterogeneous list, and you can swap the element, as all the elements have the same type, it clearly states this in the type definition:

data List<a> = Nil() + Cons(head: a, tail: List<a>)

Note how the tail has type List<a> the same as the left-hand-side. So the a is the same everywhere in the list.

The a can be a union type, and if you append any type to the list that type would be part of the single union type for the whole list. The compiler would have to analyse to code and gather all the possible types that could be put in the list and make a the union of all types that get added to the list.

That's what the type signature says. If you want different behaviour, you will need a different datatype.

@keean wrote:

as all the elements have the same type, it clearly states this in the type definition:

data List<a> = Nil() + Cons(head: a, tail: List<a>)

No it doesn't unless you presume Haskell's lack of subsumption.

I already had twice provided an example of augmenting the union type for the head when we construct a Cons.

@shelby3 wrote:

The type of the type parameter A in Cons<A> will be subsumed to the GLB of the union of two types used to construct a Cons. I had already explained that as follows.

@shelby3 wrote:

If we use this syntax in ZenScript then when instantiating a Cons("string", Cons(1, Nil)) then the type will be inferred List<Number|String>. And if we instantiate first a let x = Cons(1, Nil) its type will be inferred List<Number>. And if we then instantiate Cons("string", x) then the type will be inferred List<Number|String>.

Note in the above quoted text, I was referring to a data type List not a typeclass List. Refer to that quoted comment for the declaration I employed there (which differs from the List in this comment).

Note Number is a subtype of Number|String.

You are apparently still thinking in terms of Haskell and its inability to subsume to a first-class union, which doesn't apply as I had already explained upthread.

@shelby3 wrote:

@keean wrote:

In Haskell this:

data Nil = Nil
data Cons a b = Cons a b
class List a
instance List Nil
instance (List b) => List (Cons a b)

We need to remember that Haskell does not allow heterogeneous unions, because I've read that at least it would break the global inference of Haskell.

Thus afaik in the above b will also be the same as a | Nothing which is just a where Nothing is at the top of all types (because Haskell's call-by-name type system is inverted so we use Bottom type where we would use Top type in a call-by-value language1).

So afaics, that is not specializing the List typeclass for every data type a that can be put into the List because only one homogeneous type can be put into any list object due to Haskell's type system restrictions (lack of a first-class anonymous structural union type). There will ever be only two implementations (aka instances) of List: Nil and List (Cons a b) where b is (List a) | Nil and Nil is List Nothing.

I presume the same for Rust, but bottom type instead of top.

But for ZenScript we are proposing to support heterogeneous lists, so I am trying figure out now what changes and what the problems and challenges are. I am thinking we will need higher-kinded types and there may be other problems.

To clarify (and correct) the quoted text, Haskell doesn't allow subsumption to a common supertype (aka GLB). There is no subtyping in Haskell, the entire type system is upside down coinductive. Whereas, ZenScript is proposing to support subsumption to the supertype union. And this means that attempting to append an element with a different type requires data structures are invariant, which means array will not type check. But a list will. Btw, I had explained all of this to you at the Rust forum, and I am remembering now. But it hadn't clicked for you yet, so I was probably speaking gibberish or Klingon.

I think when I speak of subsumption and subtyping, you tended to think it was irrelevant (or by not understanding it didn't register), as you admitted to me.

@keean wrote:

I guess I don't understand your solution because you are explaining it using reference to subtyping, like greatest common bound. With parametric types, and polymorphic functions you don't have any of this.

This is what I find confusing, you keep referring to the way languages like Scala handle classes, inheritance, subtyping, and subsumption, whereas I am talking about a type and type class based approach.

So that is probably why some things I wrote didn't register.

It is understandable, because you were approaching this from the presumption of Haskell's lack of subtyping.

Meta: apology if any of my words are coming across as acrimonious, disrespectful, or anything like that. I am trying to rectify it. I am very happy that you are taking on this project. I don't know why my words come out that way, I mean that I can't always word in a way that comes across as building great teamwork. I am wound up in a high amount of "type A" hypertension and worry. I really want to get this perfect. And I am worried (about many things, not just this language, but also this language). I realize it is very easy to fail with design of a programming language. I've had some successes in my life and string of failures lately. I don't want to fail.

I'm under extreme time pressure.

keean commented
data List<a> = Nil() + Cons(head: a, tail: List<a>)

because the a is a type parameter it has to be _exactly the same on both sides. If you want subsumption it has to be expressed as a type constraint. For example I think you want:

data List<a> = Nil() + Cons(head: a, tail: List<b>)

Now the tail list and the new list dont have to contain the same type, but the problem is b is not defined, so the above is not a valid type :-(

So to do what you want you have to use an HList type construct:

data Nil = Nil()
data Cons<A, B> = Cons(A, B)
trait List
impl List for Nil
impl List for Cons<A, B> where List<B>

We still need to work out the syntax for type constraints on trait implementations, but if we go with the above, this defines a list you can extend as you wanted, except there is no dynamic (runtime) polymorphism.

Now you can add types to the list as you wanted, except the list must be statically determined at compile time.

No offense intended, but to be curt... you are thinking about it wrong. Because your mind is apparently still stuck on the way Haskell infers types.

Haskell never subsumes (it doesn't even have subtyping at all). With inferred subsumption, then what I wrote is correct.

Please see the new Higher-kinded Issue #10. I think it best summarizes the issue at hand.

Also please note that implementation of sort you attempted had no typeclass List. Please explain what this type families concept is and why you would ever want a List typeclass? Feel free to make a new Issue for that topic if it is not related to Subtyping.

@keean wrote:

data List<a> = Nil() + Cons(head: a, tail: List<a>)

because the a is a type parameter it has to be exactly the same on both sides. If you want subsumption it has to be expressed as a type constraint.

Sorry you are incorrect. The inference algorithm can choose an value for a which satisfies the constraints. Since unions are available, it can subsume to the GLB which is the disjoint union.

You've apparently not been around Scala to know that is the way it is done with subtyping (even if subclassing is removed). And you've apparently tried to shoehorn the Haskell way of doing things when you do C++ or Rust. Well I don't know that, but seems maybe to be the case. I don't mean that as a criticism, because you are very strong in Haskell and I am weak in Haskell. But we can't use Haskell to solve the Expression Problem, because its type system won't allow it.

I already noted that you admitted having a myopia or resistance to any concepts related to subtyping. You even claimed that a programming language shouldn't have a Top and Bottom type. This comes from your exposure to Haskell's coinductive type system which places Bottom at the top and Top at the bottom of the type system, which disallows subtyping and moves non-termination (i.e. Bottom type) from an effect to a value.

Please try to understand. We've been talking about this for weeks during April/May and now again about 2 weeks. I hope we can finally understand each other on this issue?

Edit: please don't conflate all the bad of subclassing with subtyping. Subtyping is another dimension in the multi-dimensional space of type systems. If we discard it, we lose power1.

  1. #10
    #2

In this comment, I will write 'type parameter' where I mean the "type parameter on a type constructor" and not type parametrization of a function (where examples of type constructors include collections Array or List).

Type parameter variance

For those readers who don't know how subclassing (note I did not write 'subtyping') interacts with typing the variance of the type parameter of a type constructor, note that all programming languages which have subclassing have to manage the variance of the type parameter.

Default variance

Some languages by default make all such type parameters invariant, meaning that it is never possible to assign a subtype or supertype to any instance of the said type parameter. So an invariant Array<T> would mean you can never assign any supertype or subtype of T (what ever T is for the constructed instance of Array<T>).

TypeScript has by default something they refer to as bivariance which is incorrect, unsound, and allegedly unsoundness more often than they claim.

Variance annotations

Most languages offer some form of variance of said type parameters. For those languages that allow declaration of the variance, there are two variants: use-site and declaration-site variance annotations. Java has use-site and N4JS has both use-site and declaration-site. Afaik, Scala and Ceylon (and perhaps Kotlin and others) have declaration-site variance annotations.

Declaration-site variance

I will describe declaration-site variance. We define SUB to be a subtype of SUP (i.e. SUP is a supertype of SUB). A type parameter declared covariant, means that an instance of Array<SUB> can be assigned to a reference of type Array<SUP>, but not vice versa. And conversely for a type parameter declared contravariant, means that an instance of Array<SUP> can be assigned to a reference of type Array<SUB>, but not vice versa.

In order to enforce that the above allowed assignments won't cause the type of reference to an Array to become invalid, the compiler checks all instances of the type parameter are congruent with the declared variance. For covariant, this means the type parameter can never be used for method arguments but can be used for result values, which is why Ceylon chose out for this annotation. And conversely, for contravariant this means the type parameter can never be used for result values but can be used for method arguments, which is why Ceylon chose in for this annotation. As you probably expected, Scala uses some arbitrary symbol soup ASCII art which is impossible to intuitively describe and remember.

Typeclass objects

If Rust's typeclass objects (which I think will also be in ZenScript?) allow subsumption to their supertypes (i.e. when a trait extends another trait), then it is also subject to variance of type parameters. Except afaik Rust's type parameters are always invariant, because subsuming the trait object doesn't usually make sense, since a trait bound is always open to extension for new data types, unlike subclassing. And there is no subtyping in Haskell.

ZenScript

I don't think we've made decision yet whether ZenScript's type parameters will be invariant for typeclass bounds of typeclass objects? For our union types, invariant is sufficient except we must make them covariant (and/or contravariant) when we want to offer extension of variants (as a solution to the Expression Problem) in for example preexisting instances of List.

I hope this helped everyone understand what is going on with subtyping, variance of type parameters on constructors, and the differences thereof w.r.t. to subclassing, typeclass objects, and our new proposed unions.

keean commented

@shelby3

Scala is not a good example of a well designed type system. Personally I don't think the original type system was designed by someone who understood type systems, although they are now trying to fix their mistakes.

I don't have miopia when it comes to types, I understand what you are saying, but type systems are a logic, and you cannot just make the semantics up. They have to confirm to the behaviours of the underlying mathematical logic. Consider a type system a pure Prolog program (Horn clause logic effectively).

You do not want inferred subsumption. That is like inferred casting if number types in C, it introduces a magic black box into the type system that you can never get rid of. You are stuck forever with the language doing weird things you don't want it to. It is much better to have a sound type system and implement whatever weird functionality you want in a library that the user can optionally import if they want whatever flavour of weird behaviour it implements.

keean commented

@shelby3 this is important:

Subtype polymorphism is replaced in type class languages by type classes.

Consider an Array <T> in a language with subtyping you can put any function that is a subtype of T into the array. But what is a subtype? It is a type that implements at least the functions of the interface of T. So any type you insert must implement a certain set of functions (Liskov substitution principle). If we instead of giving an object type that defined the set of functions we must define, we instead give an interface (type-class), then we no longer need the mess that is variance.

keean commented

Further from the previous comment, what would the interface to a union type look like. You want to be able to inject a type, and project the type, using type indexed lookup. Let's ignore the implementation, what you get is something like:

trait Union T
   inj<A>(u : T, x : A) // inject value of type `A` into the union
   prj<A>(u : T) : Maybe<A> // project value of type `A` from the union

Now we can have Array<Union> and we can insert a value of any type, and read it back (providing we know the type we are looking for, a bit like a type-safe and sound cast).

keean commented

Perhaps I should add a little explanation of my type system methodology. If you make a bunch of stuff up (see Scala) you then have to go through a huge process of proving soundness, which is hard, and easy to get wrong (hence the unsoundness that occurs in other languages).

Fortunately there is a better way. If we consider types as atoms in a pure sound subset of Prolog, we can write programs in this Prolog to give the desired behaviours of the type system. Then we only have to prove the Prolog sound (which had been done for Prolog variants used in proofs, see "Prolog PTTP".

It happens that type-classes are the same thing, effectively a pure Prolog program (if we assume backtracking instance matching with no overlapping instances). So taking a language of atoms which represent types, and invariant variables (so we have a logic) we can write type system extensions that are definitely sound.

Now we can always have some syntax sugar to make things usable for the programmer, but if you cannot implement the features you want in this 'meta language' it is probably not sound.

@keean wrote:

You do not want inferred subsumption.

This is getting tiring. You are contradicting yourself. You already agreed in the Union issue discussion which you marked as "Accepted" that we would infer if(x) "string" else 1 to be a Number|String.

I think we better move our discussion to private messages for a while and see if we can understand each other. I will message you in LinkedIn after I sleep when I wake up.

I want to speak with you frankly off the public record.

So many languages infer the union subsumption now including Ceylon, TypeScript, etc..

If you make a bunch of stuff up (see Scala) you then have to go through a huge process of proving soundness, which is hard, and easy to get wrong (hence the unsoundness that occurs in other languages).

Making a great language is hard. That isn't an excuse for making a crappy boilerplate (thus unpopular) language. Without implicit, information preserving, lossless subsumption to the union, I am gone. Simple as that. And I will take the name with me and my ideas with me. (Of course, you can continue to use the name regardless, I just mean I can also use it)

I invested a lot of effort here. 🚽

@keean wrote:

You do not want inferred subsumption. That is like inferred casting if number types in C, it introduces a magic black box into the type system that you can never get rid of. You are stuck forever with the language doing weird things you don't want it to.

I am reasonably sure you are incorrect, because the subsumption is disjoint and information preserving in our case, i.e. we are not discarding any information, which the other forms of casting and subsumption do.

Thus, I am nearly sure you are wrong. Of course it needs to be formalized to check.

Sorry I need to be more blunt, because losing too much precious time. I need to know if we are creating a language I can use or not.

@keean wrote:

Subtype polymorphism is replaced in type class languages by type classes.

Incorrect. Subclassing polymorphism is replaced by type classes in Haskell. Subtyping wasn't replaced by anything, rather it doesn't exist in Haskell's coinductive, upside-down type system.

You continue to make egregious errors because you Can Only Think One Way™ (the way of Haskell's limitations).

If we instead of giving an object type that defined the set of functions we must define, we instead give an interface (type-class), then we no longer need the mess that is variance.

Incorrect.

As I explained already, this is only true because we choose to after subsuming data types to trait bounds, not further subsuming typeclass objects (aka Rust trait objects) to GLB of a set of trait bounds that extend from a common trait bound. That is named invariant. Otherwise, subtyping and variance comes into play same as for subclassing because the trait object is constructed instance's type parameter.

Analogously when we choose not to subsume further the constructed instance's type parameter when employing subclassing, then variance also does not come into play. That is named invariant.

keean commented

So what is wrong with the Array<Union> solution i gave? As I said, if you cannot formally define the typing feature you want, it is probably unsound. Just because the definition looks like boilderplate does not mean the user would ever see it. It may be in a library, or it may be inside the compiler (like a lot of type system weirdness like subsumption rules in other languages). I prefer to have it in a library, because then if I am interested, and I understand the formalism, I can read the exact rules that will be used by the type system.

Don't confuse formally defining type system rules, with programs the users would write, just because we can define these rules using type-classes.

keean commented

@shelby3 wrote:

Incorrect.

As I explained already, this is only true because we choose to after subsuming data types to trait bounds, not further subsuming typeclass objects (aka Rust trait objects) to GLB of a set of trait bounds that extend from a common trait bound. That is named invariant. Otherwise, subtyping and variance comes into play same as for subclassing because the trait object is constructed instance's type parameter.

Analogously when we choose not to subsume further the constructed instance's type parameter when employing subclassing, then variance also does not come into play. That is named invariant.

I think you are forgetting about existential types (like Rust trait objects). You can have (I don't know what our syntax for an existential type would be)

data AnyUnion = forall A . (Union<A>) => AnyUnion A

Now an Array<AnyUnion> can have a different implementation of 'Union' in each element.

@keean wrote:

So what is wrong with the Array<Union> solution i gave?

There is no Maybe involved at all. All types in the union have a type-case boilerplate injected by the compiler to implement any trait bound at the call-site where the union is passed as an input. Every where else the compiler is tracking the type of union only.

@keean wrote:

I think you are forgetting about existential types (like Rust trait objects).

I mentioned trait objects in what you quoted.

Now an Array<AnyUnion> can have a different implementation of 'Union' in each element.

Per this and my prior comment, you are back to premature virtualization (dynamic dispatch) which was the problem I was fixing. You are going backwards. This is getting very tiring.

You can't seem to wrap your mind around my solution to the Expression Problem, even I have explained every possible way I can think of to explain it. In #10, I provided another good explanation.

keean commented

@shelby3 I don't understand what you are saying here:

There is no Maybe involved at all. All types in the union have a type-case boilerplate injected by the compiler to implement any trait bound at the call-site where the union is passed as an input. Every where else the compiler is tracking the type of union only.

You are back to premature virtualization which was the problem I was fixing. You are going backwards. This is getting very tiring

What do you mean by 'premature virtualization?'

@keean wrote:

I don't understand what you are saying here:

I am too sleepy to try to explain my solution to the Expression Problem for the umpteenth time. Could you perhaps read what is linked?

Maybe when I wakeup, I will be re-energized.

What do you mean by 'premature virtualization?'

Ditto above.

keean commented

Lets change tack (but still on subtyping). If I have a list of the type you describe, and we assume set based subtyping:

let x1 = Nil() // inferred type List<A> where 'A' is an unknown type
let x2 = Cons(123, x1) // List<Int>
let x3 = Cons("A", x2) // List<Int|String>
let x4 = Cons(0.1, x3) // List<Int|String|Float>

now if I want to swap the values in position 1 and 3, I can put 123 into slot 1, but I cannot put 0.1 into slot 3. I can of course build a new list (so this is no good for in-place sorting, and worse only works on a single container type, not on my arrays, maps, sets, queues, dequeues etc.)

But what happens with this:

let x1 = Nil()
let x2 = if random_bool then Cons("ABC", x1) else Cons(123, x1)
let x3 = if random_bool then Cons("ABC", x2) else Cons(123, x2)

what is the type of x2? If we infer the union we would get List<String> | List<Int> which I don't think is what we would want.

What is the type of x3? List<Int> | List<String> | List<Int|String> which again I don't think is what we want.

@keean wrote:

Lets change tack (but still on subtyping).

You need to understand my solution to the Expression Problem. I provided you the link.

I can of course build a new list (so this is no good for in-place sorting, and worse only works on a single container type, not on my arrays, maps, sets, queues, dequeues etc.)

It might be possible to build those other container types in a way that is compatible.

Map would probably be easy, because you simply check the counter before you return it. I won't go into detail now... Ditto Set.

If you want to use an Array or any standard data structure, then serialize the List into it. Again we need higher-kinds #10.

what is the type of x2? ... If we infer the union we would get List<String> | List<Int> which I don't think is what we would want.

Since these are data types and not trait bounds, then List<String> | List<Int> is a subtype of (and can be subsumed to) List<String|Int> but they are not equivalent, i.e. an instance of List<String|Int> can't be assigned to a reference with type List<String> | List<Int>.

This is just math of sets.

What is the type of x3? List<Int> | List<String> | List<Int|String> which again I don't think is what we want.

Ditto what I wrote for x2

keean commented

Ah, but that's not true in general, because it depends on the properties of a List. With an arbitrary generic there is no rule for how the type-parameter is used.

So those rules would have to be defined per type. So somewhere where we define List we would also have to add a bunch of typing rules to make that happen in the right way for lists.

keean commented

@shelby3 I don't think any of the functionality for the List contradicts what I wrote above, I was just trying to make it work for an Array.

As I said elsewhere I am happy with subtyping based on sets, I am just thinking out-loud on how to combine sets based subtyping with otherwise invariant types, type classes etc.

@keean wrote:

Ah, but that's not true in general, because it depends on the properties of a List. With an arbitrary generic there is no rule for how the type-parameter is used.

So those rules would have to be defined per type. So somewhere where we define List we would also have to add a bunch of typing rules to make that happen in the right way for lists.

Only the variance of the type parameter need be defined. What I wrote assumed covariant. Contravariant types are weird thus less frequent, basically coinductive such as a stream.

keean commented

@shelby3 wrote:

Only the variance of the type parameter need be defined. What I wrote assumed covariant. Contravariant types are weird thus less frequent, basically coinductive such as a stream.

I think its more complicated than that, because the type class may not be a container/collection at all.

For example what would you do with the type-equality class (trying on some of the new syntax for size):

pluggable TypeEq<A, B>

impl<A> TypeEq<A, A>

So type-eq constrains two types to be the same. What would happen with TypeEq<A, B> | TypeEq<B, C>

In fact I don't think what you suggest can work with multi-parameter type classes at all?

@keean wrote:

I think its more complicated than that, because the type class may not be a container/collection at all.

We aren't subsuming typeclass bounds into unions (we haven't yet decided whether to implicitly subsume them to their GLB common supertype typeclass¹).

Only data types go into unions. Remember we agreed very early that unions of typeclasses don't make any sense, which you wrote here, you reiterated, and then I agreed. And where I explained why we need unions for data types, I agreed again not for trait (typeclass) bounds.


  1. @shelby3 wrote:

    Typeclass objects

    If Rust's typeclass objects (which I think will also be in ZenScript?) allow subsumption to their supertypes (i.e. when a trait extends another trait), then it is also subject to variance of type parameters. Except afaik Rust's type parameters are always invariant, because subsuming the trait object doesn't usually make sense, since a trait bound is always open to extension for new data types, unlike subclassing. And there is no subtyping in Haskell.

    ZenScript

    I don't think we've made decision yet whether ZenScript's type parameters will be invariant for typeclass bounds of typeclass objects?

keean commented

Okay so what about multi parameter datatypes like a pair:

data Pair<A, B> = Pair(A, B)

what would happen to Pair(Int, Float) | Pair(String, Char) ?

keean commented

My vote is to only allow set based subtyping, so that Int <: Int | String and that is it.

I think we need to have per-datatype rules if we want to have any type isomorphisms for unions.

Please note upthread edit:

  1. Since an eager, CBV language makes Bottom an effect, I have argued the name of the Bottom type should be Never and not Nothing (which appears to be the choice TypeScript made). Bottom is for example the type of functions that never return (terminate). Whereas, in a lazy, CBN language such as Haskell, Bottom becomes a value, so I argue it should be named Nothing and not Never.
keean commented

I don't think we need a bottom type. Failure to type check will be failure. Where do you see a bottom being necessary?

Isn't that the billion dollar mistake?

I would argue that when you want a type that is Nothing, you have to use Maybe (Haskell) or Option (Rust). I am not sure which is better but I am used to Maybe:

data Maybe<A> = Nothing() | Just(A)

@keean wrote:

I don't think we need a bottom type. Failure to type check will be failure. Where do you see a bottom being necessary?

All ZenScript functions will always return? What type will you give to functions that never terminate?

@shelby3 wrote:

Bottom is for example the type of functions that never return (terminate).

We can't properly model Nil in the type system without the concept of Bottom. Also Bottom is necessary for modeling LUB (least upper bounds) which is for example necessary in the contravariant case.

You need to understand that the type inference engine has to think in terms of GLB and LUB else it won't be correct. This is not Haskell. I don't know how many times I need to repeat that.


@keean wrote:

Isn't that the billion dollar mistake?

No that was Hoare's opinion about null values. Remember I just wrote that in an eager, CBV language (such as ZenScript), then Bottom is never a value, rather only an effect. This is why Scala has a separate type for Null which is not the Bottom type Nothing. But Scala made a mistake by naming Bottom Nothing instead of Never because Scala is eager, CBV, so Bottom can never be instantiated.

keean commented

You don't need bottom, you type Nil polymorphically as: List<A>

No bottom, no variance!

I want a sound type system, and that means following the rules :-)

Edit: Not returning has nothing to do with it consider:

fn() =>
    while true:
    return 1

This never terminates, but has a return type of Int.

@keean wrote:

You don't need bottom...

Please answer:

@shelby3 wrote:

All ZenScript functions will always return? What type will you give to functions that never terminate?

keean commented

I answered in the edit above. Definitely nothing like that Scala subtype tree! Only subtyping for union (set) types.

The fact that someone had to draw that diagram to understand it shows what a bad idea it is :-)

@keean wrote:

I answered in the edit above.

You have not answered regarding the type of non-termination. You answered w.r.t. to the Nil value. You keep repeating the same mistake in your thought process. Non-termination is an effect in a eager, CBV language such as ZenScript (and most other programming languages on the planet other than Haskell), not a value like in Haskell.

keean commented

@shelby3 wrote:

All ZenScript functions will always return? What type will you give to functions that never terminate?

That does not make sense. Function return types have nothing to do with termination? I don't understand where you got such a strange idea from?

The function return type is determined statically from the code at compile time. This is the same in C, C++, Pascal, Ada, Rust ... in fact every statically typed language I know of.

@keean wrote:

That does not make sense. Function return types have nothing to do with termination? I don't understand where you got such a strange idea from?

What is the type of a function that throws always throws an exception? What is the type of exit() function? (such as if we want to allow ZenScript to be used in scenarios where PHP is currently employed)

You've been fooled by your familiarity with Haskell. In Haskell, non-termination is a value and inhabits every type, thus it never has to be declared as a value.

keean commented

Hold on, I see what you are talking about. You mean functions that do not return anything, rather than functions that never return? like "print".

PHP is not a programming language :-)

The answer is they return '()' or void. Its a type which has no values, thats all, it does not necessarily have a subtyping relation with other type...

@keean wrote:

Hold on, I see what you are talking about. You mean functions that do not return anything, rather than functions that never return? like print.

The answer is they return () or void. Its a type which has no values, thats all, it does not necessarily have a subtyping relation with other type...

No the singleton Unit type is not the Bottom type.

Bottom can't be instantiated in an eager, CBV typing system, thus it can't be a singleton. How many times do I need to repeat this (the bolded). Your mind is locked into Haskell's type system.

keean commented

See my example from above again:

fn() =>
    while true:
    return 1

The return type is Int, this function never returns.

keean commented

Example from C:

int fn() {
    while true;
    return 1;
}

@keean if your function never terminates, and you don't type it as Bottom, then you will have unsoundness.

TypeScript's documentation explains several use cases, including:

Because never is assignable to every type, a function returning never can be used when a callback returning a more specific type is required:

function test(cb: () => string) {
    let s = cb();
    return s;
}

test(() => "hello");
test(() => fail());
test(() => { throw new Error(); })
keean commented

So the compiler does termination analysis on the code? This is known as the halting problem, so any type system that includes never is by definition unsound.

For example consider the Mandelbrot recursion? How can you tell if the loop ever terminates? The border between termination and non-termination is an infinite detail fractal.

@keean wrote:

So the compiler does termination analysis on the code? This is known as the halting problem, so any type system that includes never is by definition unsound.

You are conflating orthogonal concepts.

Turing-completeness is due to unbounded recursion (of the tape). The type system is only modeling the body of the function. The typing system can't model run-time behavior such as infinite recursion.

keean commented

So what do you mean by 'non-termination' ?

@keean wrote:

So what do you mean by 'non-termination' ?

The body of the function has in every code path which doesn't return, e.g. throw but note a Bottom is subsumable so a single throw (non-termination path) combined with a concrete return could be subsumed to that concrete return value (but in the case of unions we can't subsume to a union | Never because Bottom is never a value). This excludes dependent typing, i.e. analysis of run-time values and run-time behavior.

keean commented

Consider this:

mandel(x0, y0, x, y) : never =>
    while (x*x + y*y < 2*2) {
        let xtemp = x*x - y*y + x0
        y = 2*x*y + y0
        x = xtemp
    }
}

Is this type correct?

keean commented

refresh, as i edited above.

keean commented

Basically if you cannot prove non-termination, then never is by definition unsound. The definition of soundness is that a program that passes type-checking cannot go wrong at runtime. A function that I type as never that actually returns is going-wrong.

@keean wrote:

Is this type correct?

No. Should be the Unit type, because it doesn't return a value.

Although it doesn't return in some scenarios but not all, thus:

@shelby3 wrote:

but note a Bottom is subsumable so a single throw (non-termination path) combined with a concrete return could be subsumed to that concrete return value

keean commented

@shelby3 what is the difference between this example:

// Function returning never must have unreachable end point
function infiniteLoop(): never {
    while (true) {
    }
}

And this:

mandel(x0, y0, x, y) : never =>
    while (x*x + y*y < 2*2) {
        let xtemp = x*x - y*y + x0
        y = 2*x*y + y0
        x = xtemp
    }
}

How does the compiler know the first terminates and the second sometimes does terminate? The only way is to evaluate the loop. This is exactly the halting problem.

The point is the type checker has to decide if the type is correct, and that triggers the halting problem

@keean wrote:

Basically if you cannot prove non-termination, then never is by definition unsound.

Incorrect, because it can be subsumed to any instantiable type.

A function that I type as never that actually returns is going-wrong.

Correct. But we are not doing that.

The main utility of Bottom in an eager, CBV language is so we can assign a type that is nothing to something that expects a type, e.g. Nil and the example provided by TypeScript. Which is why Scala chose Nothing name but Never is more consistent.