keean/zenscript

Modularity

Opened this issue · 300 comments

Prior discussion:

#14 (comment)

#33 (comment)

#8 (comment)

Please kindly move modularity discussion here rather than polluting issue threads that are specialized on their thread titles (speaking to myself as well, not to any specific person), so that the modularity discussion is organized in a thread of its own.

@shelby3
Associated types are fine and they constrain types on type level not on runtime level, right? (example: type family in typeclasses, pokemon example)

@keean
I will try to recapture your problem with a code example (though I'm not really familiar with OCaml); you can correct me if I'm wrong:

signature EQ = sig
type t
val eq : t * t -> bool
end

structure EqInt = struct
type t = int
val eq = Int.eq
end

fun f (X : EQ) = struct
type t = X.t
val eq = X.eq
end

Function f should have a signature of the form EQ where type t -> EQ where type t,
which can be erased even if the type t from X is not known explicitly.
Why?
Every time you construct a signature instance EQ with a certain type t, the compiler will/could create a monomorphized function for it. When a module of type EQ is applied to f, the compiler should pick the right function at compile time, which should be possible because of type inference.
Even if you change X.t from int to float (I don't know if that is possible), you state the new type of X.t explicitly in your code, which should be recognized by the compiler.
The only problem that arises is if you eval a type out of a string which is created from "outside":

string<-readline...
X.t=eval_to_type(string)

Because of this, most statically typed languages disallow such things.

keean commented

@sighoya it would help if you referenced the problem you are questioning; I am not sure what you are trying to show or disprove.

I want to prove that dependent types are not needed for modules.

Edit:
reference

keean commented

Your example has no runtime polymorphism. You need to introduce an existential type, so that you run Eq over a list of existential values.

You mean something like this:

signature EQ = forall x sig
type t=[x]
val eq : t * t -> bool
end

or

signature EQ = sig
type t=[forall x]
val eq : t * t -> bool
end

I don't know if the latter is valid syntax, but I mean a heterogeneous list, i.e. one containing values of different types.
In the case of the latter example, fun f with:

fun f (X : EQ) = struct
type t = X.t
val eq = X.eq
end

should have signature of EQ where t=[forall x] ->EQ where t=[forall x],

meaning that t is a list where the elements are of a Variant Type (runtime Union Type), for which no type erasure and no dependent type is needed.

keean commented

@sighoya

meaning that t is a list where the elements are of a Variant Type (runtime Union Type), for which no type erasure and no dependent type is needed.

That is different because with a union type you are limited in which types you can put in. Given a "String|Int" you can only put a string or integer into the heterogeneous collection.

With an existential type we can specify the interface, say forall a . Eq a => a which allows any type that implements Eq to be put in the collection.
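
For concreteness, here is a rough OCaml sketch of that kind of existential using first-class modules (the Pack constructor and module names are made up for illustration, not anything from this project): the hidden type and its EQ dictionary travel together, so any type with an EQ implementation can go in the same list.

module type EQ = sig
  type t
  val eq : t -> t -> bool
end

(* An existential: "some type t together with its EQ dictionary and a value". *)
type some_eq = Pack : (module EQ with type t = 'a) * 'a -> some_eq

module EqInt = struct type t = int let eq = Int.equal end
module EqString = struct type t = string let eq = String.equal end

(* A heterogeneous collection: each element carries its own dictionary. *)
let xs = [ Pack ((module EqInt : EQ with type t = int), 1);
           Pack ((module EqString : EQ with type t = string), "one") ]

(* All we can do with an element is apply its own packed dictionary;
   the hidden type never escapes. *)
let self_eq x =
  match x with
  | Pack ((module E), v) -> E.eq v v

let _ = List.map self_eq xs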

I can actually demonstrate the unsoundness without using an existential type, just think about polymorphic recursion. You cannot monomorphise a recursive polymorphic function.
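
A small OCaml illustration of the polymorphic-recursion point (a sketch; the nested type and depth function are hypothetical): every recursive call uses the function at a new instantiation ('a, then 'a list, then 'a list list, ...), so there is no finite set of monomorphic copies the compiler could generate, and OCaml only accepts the definition with an explicit polymorphic annotation.

(* A nested datatype: each Cons wraps the parameter in one more list level. *)
type 'a nested =
  | Nil
  | Cons of 'a * 'a list nested

(* depth needs the explicit "for all 'a" annotation because it calls itself
   at type 'a list nested, not 'a nested: classic polymorphic recursion. *)
let rec depth : 'a. 'a nested -> int = function
  | Nil -> 0
  | Cons (_, rest) -> 1 + depth rest

let _ = depth (Cons (1, Cons ([2; 3], Nil)))  (* = 2 *)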

keean commented

@sighoya

should have signature of EQ where t=[forall x] ->EQ where t=[forall x],

This type is impossible because the type escapes the existential.

@keean wrote

That is different because with a union type you are limited in which types you can put in. Given a "String|Int" you can only put a string or integer into the heterogeneous collection.

That's right; therefore Variant Type != Union Type.
@keean wrote

With an existential type we can specify the interface, say forall a . Eq a => a which allows any type that implements Eq to be put in the collection.

Aha, here an existential takes the role of a trait/typeclass object.
@keean wrote

I can actually demonstrate the unsoundness without using an existential type, just think about polymorphic recursion. You cannot monomorphise a recursive polymorphic function.

Corner cases/Undecidable issues
Do you think the following is sound?:
(+) :: Int->Int->Int
Int.MAX+Int.MAX=Error?

@keean wrote

@sighoya

should have signature of EQ where t=[forall x] ->EQ where t=[forall x],

This type is impossible because the type escapes the existential.

Yes, it breaks the rules of existentials.
Maybe f::EQ where t=Variant -> EQ where t=Variant is a better fit,
or leave out the associated type entirely: f::EQ->EQ.
Why is it so important to include the associated type in the type signature?

Further, I don't see the need for dependent types for module purposes, but maybe you mean that runtime polymorphism is inevitable, in which case yes.
Runtime polymorphism is indeed very useful.

@keean

If a module has an associated type, and a module is first class (can be passed to a function as a value) then the associated type must depend on the value passed to the function. This is clearly the definition of a dependent type.

But if the instance of the module’s type is statically known at the time when the value is applied at the call-site then it’s no different than applying typeclasses at the call-site. So are typeclass bounds dependent typing?

I guess it’s because we erase the instance’s type from the static knowledge when we assign it to a module signature (but have to reify it with runtime dispatch), thus we don’t track it statically? Isn’t that analogous to assigning an object to an existential type, thus erasing it from static tracking?

The key difference is when the type of the value can’t be known statically, then we need to track the type of a value at runtime with some form of runtime dispatch. Which is analogous to existential types with typeclasses.

What am I missing? Trying to understand why you see a problem with this form of dependent typing that you don’t see with typeclasses?

I can actually demonstrate the unsoundness without using an existential type, just think about polymorphic recursion. You cannot monomorphise a recursive polymorphic function.

Ditto cannot monomorphise typeclass bounds with polymorphic recursion. Have to use runtime dispatch. We had a discussion about that in the past on these issue threads.


@sighoya wrote:

Further, I don't see the need for dependent types for module purposes but maybe you mean runtime polymorphism is inevitable, then yes.

Well I think he may be defining dependent type correctly, yet I agree that seems like it’s just runtime polymorphism. And I am also wondering why he dislikes it given it seems to occur because of the inevitable need for runtime polymorphism.

I thought dependent typing was when we can check the values of a variable statically?? With runtime polymorphism we’re not enforcing the type of the value statically. There’s no monomorphisation going on.

keean commented

Ditto typeclass bounds with polymorphic recursion.

No, typeclasses only depend on the type of the thing, not the value of the thing.

Now you would have the same problem if you had type-classes with associated types, but in Haskell they get around it by saying the associated types must be globally coherent with the type parameters of the type class instances. This makes the type-class type-parameters, and the associated types of that type-class form a type-family.

keean commented

@sighoya a variant type is also limited in the types you can put into it (at least it is in OCaml). Without existentials (runtime polymorphism), polymorphic recursion can still cause problems with modules that require dependent types to support.

Now you can have modules without dependent types, if you limit the language to having everything monomorphisable. This means no runtime polymorphism, no polymorphic recursion etc.

@keean wrote

@sighoya a variant type is also limited in the types you can put into it (at least it is in OCaml).

My thoughts were on dlang's Variant Type.
You can also further restrict dlang's Variant Type to allow assignments of values of specific types only (they call them Algebraic Types, in the same URL I presented).
A variant type is an any type without the sh?t of inheritance.

Edit:
A typeclass/trait object can also be seen as a Variant Type which is bounded by the types implementing the trait/typeclass. In some situations, a trait object can also be a Union Type (compile-time Variant Type), in which case monomorphization can apply.

@shelby3

I thought dependent typing was when we can check the values of a variable statically?? With runtime polymorphism we’re not enforcing the type of the value statically. There’s no monomorphisation going on.

I thought dependent typing was when we cannot check the type of a variable at compile time (statically), because the variable's type depends on the variable's value and maybe on other variables' values.
However, we can pattern match over it and follow different branches, in which case the compiler knows the type for each branch (when done right), though it can become tedious.

Dependent types are indeed very useful. Imagine a type called DeterminantNotZeroMatrix, which is defined as follows:

DeterminantNotZeroMatrix<Type eltype,Int n,Int m>={Matrix<eltype,m,n> mat |  det(mat)!=0}

Edit: ambiguous use of m, changed it to mat.

Once this kind of dependent type is constructed, we can invert the matrix without having to check every time whether it is invertible.
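
For comparison, a rough sketch (in OCaml, with a placeholder det function and representation, not a real linear-algebra library) of how the same guarantee is usually approximated without dependent types: an abstract type whose only constructor checks the invariant once, so the rest of the code can rely on it.

module InvertibleMatrix : sig
  type t                                    (* abstract outside this module *)
  val make : float array array -> t option  (* None when det = 0 *)
  val invert : t -> float array array       (* free to assume det <> 0 *)
end = struct
  type t = float array array
  let det _m = 1.0             (* placeholder determinant *)
  let make m = if det m <> 0.0 then Some m else None
  let invert m = m             (* placeholder inverse *)
end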

But @keean, I think you need a runtime polymorphic type rather than dependent typing.

There is a proposal to add type classes to C#. The funny thing is that the proposed implementation of typeclasses looks a lot like a light form of Modular Type Classes.

keean commented

@sighoya

My thoughts were on dlang's Variant Type.
You can also further restrict dlang's Variant Type to allow assignments of values of specific types only (they call them Algebraic Types, in the same URL I presented). A variant type is an any type without the sh?t of inheritance.

This kind of variant is not modular, as there is no way to represent a universal type tag. This kind of variant requires Run Time Type Information, which is a big problem for me, as you lose type erasure (and theorems for free) as well. I can give more details why this is bad (or perhaps why I don't like it if I am being generous).

But @keean, I think you need a runtime polymorphic type rather than dependent typing.

Runtime type information is worse (see above); with dependent typing we can still resolve issues at compile time. For example, if we try to print a value of an associated type, we can add the "Print" constraint to that module. This can be propagated up to the interface specification for the current module, such that we say the argument for this function must be an iterator where the ValueType of the iterator is printable. These are dependent types (because the type passed to print depends on the value of the module passed to the function), but we solve them by placing constraints on the types, which we propagate at compile time. The latest we would need to check these constraints is when linking a dynamically loaded module; we still retain type erasure and don't have to find a way to store types in memory at runtime.

I think it's bad to store types in memory at runtime due to the Russell paradox. If every memory location has a type, and a memory location contains a type, then the type of a type is a type and we have created the conditions for a paradox. Everything in memory at runtime is a value; now we can create a mapping from values to types, but we cannot do this globally, because with separate compilation we don't know what values other compilers assign to which types. So we have to locally map values to types, which is what an OCaml variant does (a disjoint sum type).

@keean wrote

Runtime type information is worse (see above); with dependent typing we can still resolve issues at compile time. For example, if we try to print a value of an associated type, we can add the "Print" constraint to that module.

That looks to me like you need a trait/typeclass object.

@keean wrote

I think it's bad to store types in memory at runtime due to the Russell paradox. If every memory location has a type, and a memory location contains a type, then the type of a type is a type and we have created the conditions for a paradox.

The type Type is a meta type, which is a non-well-founded set including itself.

keean commented

@sighoya it seems you agree with my reasoning, but trait objects are Rust's name for existential types, and you are back to needing dependent types if you use modules.

It all comes down to the difference between a type class and a module. In a type class the implementation is chosen only using the type information of the type class parameters. In the lambda cube this makes an associated type a type that depends on another type. Because a module is first-class the associated type depends on the actual value of the module passed, not just its type, hence an associated type is a type that depends on a value (otherwise known as a dependent type).

@keean
Let me try to reflect.
You describe a case where a functor takes a module and returns a module whose associated type depends on the input module.
You don't want to solve it with Variant Types because you don't like them, but you want to solve it with existentials, which have a constraint that depends, however, on the value of the input module.
Your solution is to constrain the input module by a dependent type, i.e. a dependent signature in which only the desired modules are accepted by the functor.

In my eyes, that is one solution. But you could also force the functor to return a Maybe structure whose contained/wrapped structure depends on the value condition.

keean commented

@sighoya not quite. With first-class modules (and functors) you can pass modules/functors as arguments to functions. Because associated types of modules depend on the value of a module and not just its type, you need dependent types even without variants, existentials, etc. You just need first-class modules and associated types to have the problem.

keean commented

@sighoya Let me have another go, as I don't think I have done a very good job of explaining, and may have got a bit side-tracked by other issues. The type system only tracks the type of modules or functors. It is possible for two modules which conform to the same signature (type) to have different associated types. The associated types are not part of the signature.

signature EQ = sig
type t
val eq : t * t -> bool
end

structure EqInt = struct
type t = int
val eq = Int.eq
end

structure EqFloat = struct
type t = float
val eq = Float.eq
end

fun f (X : EQ) = struct
type t = X.t
val eq = X.eq
end

f(EqInt).eq(1, 2)
f(EqFloat).eq(1.2, 1.1)

Because EqInt and EqFloat are values of the type EQ, the type system does not track the difference between them. All the type system sees is the type EQ; in order to see what value is passed to f we have to use partial evaluation or abstract interpretation. However, neither of these methods can solve general computational problems, and you very quickly end up running the whole program. This is why compile-time evaluation is restricted to constant expressions in C++ and other languages. In other words we could resolve this for limited cases, but most compilers don't even try.
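
For what it's worth, OCaml's first-class modules show exactly this limitation (a sketch; the pick function is made up): once the module is chosen by a runtime value, the abstract type t can no longer be named statically, and OCaml rejects any attempt to let it escape.

module type EQ = sig
  type t
  val eq : t -> t -> bool
end

module EqInt = struct type t = int let eq = Int.equal end
module EqFloat = struct type t = float let eq = Float.equal end

(* Which module we get depends on a runtime value... *)
let pick b = if b then (module EqInt : EQ) else (module EqFloat : EQ)

(* Fine when the type is pinned statically: *)
let eq_int : (module EQ with type t = int) = (module EqInt)

(* ...but this is rejected (roughly: "the type constructor X.t would escape
   its scope"), because the result type would mention X.t, a type that
   depends on the value of b:

   let f b =
     let (module X) = pick b in
     X.eq
*)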

@keean wrote:

The type system only tracks the type of modules or functors. It is possible for two modules which conform to the same signature (type) to have different associated types. The associated types are not part of the signature.

Right. This is the intention of associated types: to not include them in the type signature; otherwise they would be parameters.

@keean wrote:

Because EqInt and EqFloat are values of the type EQ, the type system does not track the difference between them. All the type system sees is the type EQ; in order to see what value is passed to f we have to use partial evaluation or abstract interpretation.

Ah, I see your problem. The problem is that the signature of eq changes in dependence on the input module. And this is unmodular, right?
You can pattern match over it (= partial evaluation?), but cannot cover all possible values that way (at least not in all cases).
Edit:
Maybe the type system does not know the difference between EqInt and EqFloat, but the compiler knows it, because you hard coded the type int into EqInt and the type float into EqFloat, and the compiler can therefore typecheck the call to eq at compile time. (For hidden/already compiled modules, there must be a way for the compiler to access the associated type; at least I would implement it to do so.)

I don't think the problem is that modules are values. You even have the problem if the modules are defined statically but their definitions are hidden from the user.

For example, you specialize a function f with some unknown static module EqA and call f with values depending on EqA.

f<EqA>(1,2) or f<EqA>("1","2") ?

I think it is not a problem that modules are first class values, the problem is that associated types are unmodular.

Ah... I must correct myself, you are right:

a=io.read().asInt()
module m::EQ
if a=2 m =EqInt else m = EqFloat
f(m)(1,2) #allowable?

I could imagine disallowing:

a=io.read().asInt()
module m::EQ
if a=2 m =EqInt else m = EqFloat
f(m)(1,2) #allowable?

by the compiler, and pattern matching over all possible outcomes instead, but the user cannot see why the above code is not valid (because it is not inferable from the type signature), damnit!

keean commented

@sighoya We can only reason about code using the types, otherwise we are executing the code. If we execute the code we have an interpreter not a compiler. The whole point of a compiler is to translate code from one language (the source language) to another (the target language). This gets fuzzy when we can partially evaluate code where the inputs are static, but as soon as the input depends on IO we cannot even do that. So when you write short code examples it's easy to say "well the compiler knows this value", but in real code it is exceptionally rare, except for constant values like PI. Even with constant values you have the catch-22 of not wanting to execute code until it has been type checked to make sure it does not crash the compiler, but your type checking depends on the values, which requires you to run the code. So in reality partial evaluation does not work for this kind of problem. That leaves abstract interpretation as the only method, and this is not a well known technique even amongst compiler writers. Dependent types are by far the best solution, or avoid the problem by forcing global coherence on module instances by making the associated types a type-family over the module type parameters (what Haskell does).

Some Questions:

1.) How does OCaml handle these things? Does it crash?

2.) What if we force the user to pattern match over the associated type in a module of type EQ:

module m::EQ
m= EqInt if a=2 else EqFloat# now, the compiler knows that m.t could be different
f(m).eq(2,2) # Here the compiler knows eq must be of type Int->Int->Bool, so the user have to check if m.t is Int

The compiler will throw an error and forces the user to rewrite the code above to:

module m::EQ
m= EqInt if a=2 else EqFloat
match m.t=
Int -> f(m).eq(2,2)
_ -> Error/Nothing

What am I missing?

3.) What is your plan for this problem?

@keean,

Ditto typeclass bounds with polymorphic recursion.

No typeclasses only depend on the type of the thing, not the value of the thing.

We discussed this already in the past, in our discussion of rankN HRT. When we store callbacks that have a typeclass bound as values, we lose the definition of the function body. Thus there’s no way to monomorphise the body, because we don’t know the body of the function that is the callback: more than one function definition can match the same set of typeclass bounds, or alternatively the caller of the callback may be supplying existential types instead of statically known instances. We have to resort to runtime dispatch. So indeed typeclasses do depend on the value of the thing when we have to throw away static knowledge of which values correspond to which types in our imperative soup (for example, existential types aka trait objects). If we can eliminate the need for imperative soup…

Now you can have modules without dependent types, if you limit the language to having everything monomorphisable. This means no runtime polymorphism, no polymorphic recursion etc.

…if our modules are applicative/pure and all imperativity occurs top-down outside any modules, then in theory we can have polymorphic recursion without requiring runtime polymorphism in the modules, although at the junction between the imperative layer and the modular layer there must be runtime dispatch to monomorphised branches of the modular code.

Btw, the Pokemon example employing type family fails to solve the Expression Problem. New Pokemon types can’t be added independently. The code could be reorganized to not use associated types if runtime polymorphism is allowed and then it would also be more modular and extensible. So again I ask @keean what the objection to dependent typing is, if all it means is to enable runtime polymorphism? Runtime polymorphism is necessary for modularity, solving the Expression Problem, and because static typing is limited in expression?


@sighoya wrote:

Associated types are fine and they constrain types on type level not on runtime level, right? (example: type family in typeclasses, pokemon example)

Well even when simulated with modular abstract types, I don’t see why they couldn’t be monomorphized in the cases where there’s no runtime polymorphism?

I thought dependent typing was when we can check the values of a variable statically?? With runtime polymorphism we’re not enforcing the type of the value statically. There’s no monomorphisation going on.

I thought dependent typing was when we cannot check the type of a variable at compile time (statically), because the variable's type depends on the variable's value and maybe on other variables' values.

So we agree that dependent typing is the problem where we need to know the type of values. I’m saying that I thought dependent typing was the ability to type those values statically with total order knowledge about all possible states of the program. You’re saying that actually it’s the ability to declare some invariants independent of the values, but isn’t that what a type is? An int is an invariant independent of the values it takes.

Dependent types are indeed very useful. Imagine a type called DeterminantNotZeroMatrix, which is defined at follows:

So what you’re saying is the user can create new types and enforce semantics on those types. But isn’t that what typing is for?

It seems to me that runtime polymorphism is somehow being conflated with dependent typing in our discussion?


@keean wrote:

This kind of variant requires Run Time Type Information, which is a big problem for me, as you lose type erasure (and theorems for free) as well. I can give more details why this is bad (or perhaps why I don't like it if I am being generous).

RTTI is bad because the programmer can access it and use it as a hack for polymorphism (cf. the mess that is the Go libraries). Whereas what we really need is static polymorphism, with runtime polymorphism only where it must be.

But are you conflating runtime polymorphism with RTTI? Existential types (aka Rust’s trait objects) don’t erase the runtime information about types, because it’s required for runtime dispatch. Yet the programmer has no access to the RTTI.

But @keean, I think you need a runtime polymorphic type rather than dependent typing.

Runtime type information is worse (see above); with dependent typing we can still resolve issues at compile time. For example, if we try to print a value of an associated type, we can add the "Print" constraint to that module. This can be propagated up to the interface specification for the current module, such that we say the argument for this function must be an iterator where the ValueType of the iterator is printable. These are dependent types (because the type passed to print depends on the value of the module passed to the function), but we solve them by placing constraints on the types, which we propagate at compile time. The latest we would need to check these constraints is when linking a dynamically loaded module; we still retain type erasure and don't have to find a way to store types in memory at runtime.

That presumes that modules aren’t first-class values:

You just need first-class modules and associated types to have the problem.

If they are values then you can’t know until runtime what the types of the abstract types are. So then you must use runtime dispatch.

I think it's bad to store types in memory at runtime due to the Russell paradox. If every memory location has a type, and a memory location contains a type, then the type of a type is a type and we have created the conditions for a paradox. Everything in memory at runtime is a value; now we can create a mapping from values to types, but we cannot do this globally, because with separate compilation we don't know what values other compilers assign to which types. So we have to locally map values to types, which is what an OCaml variant does (a disjoint sum type).

We discussed this before in the past. Use a cryptographic hash to have decentralized creation of a universal tag.

Again I think you’re conflating access to RTTI (evil) with type erasure (good). There’s a middle ground: RTTI used only for runtime dispatch is not evil (when the programmer can’t access that RTTI).

As for Russell’s paradox, it’s omnipresent in many different facets of programming and there’s absolutely nothing you can do to solve that. If you could solve it, then our universe would be provably bounded. Life is necessarily imperfect else we couldn’t exist.


@sighoya wrote:

The type Type is a meta type, which is a non-well-founded set including itself.

This is related to why we can’t have unbounded/open ADTs and use them to select implementation of typeclass instances, because then we can’t prove anything is ever disjoint. Monomorphism and the Expression Problem are in tension with each other, abstractly because of Russell’s paradox.


@keean wrote:

It is possible for two modules which conform to the same signature (type) to have different associated[abstract] types. The associated[instances of the abstract] types are not part of the signature.

ftfy. Associated types are a typeclass concept. Abstract types are a module concept. Afaik, the paper Modular Type Classes employs abstract types to emulate the associated types of typeclasses.

Because EqInt and EqFloat are values of the type EQ, the type system does not track the difference between them. All the type system sees is the type EQ; in order to see what value is passed to f we have to use partial evaluation or abstract interpretation.

It seems you argued above that the type system can track the differences and monomorphise (in some cases) and I agreed except I pointed out that when the modules are first-class values that get passed around, then the instances of the signature can’t be known statically.

However, neither of these methods can solve general computational problems, and you very quickly end up running the whole program.

What does that sentence mean? Do you mean that not everything can be monomorphised?

This is why compile-time evaluation is restricted to constant-expressions in C++ and other languages. In other words we could resolve this for limited cases, but most compilers don't even try.

Typeclass bounds at call sites also throw away the information about the types at the call site, but the difference is the static information is thrown away at the call site and not at the construction of module instance with a functor. In the case of typeclass bounds of callback functions, the body of the function is statically unknowable.

Since we know we can emulate typeclasses with modules per Modular Type Classes, we know that not all uses of modules are impossible to monomorphise. The key is really referential transparency. Because when we/the compiler know the code isn’t overwriting (generic references to) values with different types (which causes the requirement for the value restriction!), then the compiler can statically trace the type of the instance of a module to its use in a function and thus monomorphise. That is why C++ and other non-pure languages can’t do it.

All of us discussing here know that PL design has different axes that are highly interrelated. We must analyse holistically over all axes simultaneously in order to explore the design space.

We can only reason about code using the types, otherwise we are executing the code. If we execute the code we have an interpreter not a compiler. The whole point of a compiler is to translate code from one language (the source language) to another (the target language). This gets fuzzy when we can partially evaluate code where the inputs are static, but as soon as the input depends on IO we cannot even do that. So when you write short code examples it's easy to say "well the compiler knows this value", but in real code it is exceptionally rare, except for constant values like PI.

The key point of your comment is that when the code is no longer pure, then neither the compiler nor the human can reason about it.

Dependent types are by far the best solution, or avoid the problem by forcing global coherence on module instances by making the associated types a type-family over the module type parameters ( what Haskell does).

I wish we’d write abstract types instead of dependent types. Dependent types seems to be a broader term. Yeah, modules need abstract types as opposed to type parameters (because per prior discussion they otherwise lose encapsulation and can’t be composed non-explicitly/without boilerplate in partial orders, i.e. they must always know all the instance types when composing), and if you don’t have referential transparency, then you probably cannot monomorphise them.

But referential transparency really is about top-down dependency injection. So in some cases we can top-down inject the types and monomorphise but in other cases the top-down imperative layer will lose static tracking of the types and thus we must punt to runtime polymorphism.

Per the link above, what I’m not clear on now is whether we really need both modules/typeclasses and algebraic effects? Seems like maybe all modular genericity can be modeled as bind points on the free monad?

I would appreciate you guys checking my logic on this.


@sighoya wrote:

The problem is that the signature of eq changes in dependence on the input module. And this is unmodular, right?

No that is modularity. Multiple implementations of the same signature. (Also multiple signatures for the same implementation).

@shelby3 wrote

You’re saying that actually it’s the ability to declare some invariants independent of the values, but isn’t that what a type is? An int is an invariant independent of the values it takes.

@shelby3 wrote

So what you’re saying is the user can create new types and enforce semantics on those types. But isn’t that what typing is for?

It seems to me that runtime polymorphism is somehow being conflated with dependent typing in our discussion?

Valid point, and I think you are right. What I presented was a refinement type, which correlates with dependent types but is not the same.

After further investigation, languages which support Variant Types already partially implement dependent types, because
dependent types are function types where the Return Type depends on the values of the function parameters, i.e. the Return Type is a Variant Type in the case of nondeterminism and a Union Type in the case of determinism.
The dependent type described above is a dependent product type, i.e. the free parameter value tuple determines the Return Type.
However, you can also "chain" the dependent type generation, meaning the parameter value tuple of a function is not free in itself, i.e. some values depend on other values in the parameter value tuple. Such a function is called a dependent sum type, e.g. function ():a->a+1->(a,a+1).

The question is, why not simply use the Variant Type/Union Type or a Sum Type (Maybe, Either) to represent Varieties?
With dependent types you can encode value semantics in the type signature, e.g.:

concat :: Vec a n -> Vec a m -> Vec a (m+n)

This is so expressive that constructs such as the following are permitted:

fun::Vec a (x**2-y**4+z**4-z)-> (Vec a x, Vec a y, Vec a z)

Because the length of a Vector is integral, you can construct diophantine problems which are undecidable (though, for a computer nothing is undecidable, because it can only see a finite subset of infinity). Edit: (Corrected)
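
For reference, this is roughly how the Vec a n / concat idea can be encoded today with OCaml GADTs and Peano-style indices (a sketch only): since there is no type-level arithmetic, the m+n relation has to be carried as an explicit proof term.

(* Length-indexed vectors with phantom Peano indices (sketch). *)
type zero
type _ succ

type ('a, 'n) vec =
  | Nil  : ('a, zero) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec

(* A witness that l + m = r, standing in for type-level addition. *)
type ('l, 'm, 'r) plus =
  | PZ : (zero, 'm, 'm) plus
  | PS : ('l, 'm, 'r) plus -> ('l succ, 'm, 'r succ) plus

(* concat : Vec a l -> Vec a m -> Vec a (l+m), with the addition made explicit. *)
let rec append : type a l m r. (l, m, r) plus -> (a, l) vec -> (a, m) vec -> (a, r) vec =
  fun plus xs ys ->
    match plus, xs with
    | PZ, Nil -> ys
    | PS p, Cons (x, rest) -> Cons (x, append p rest ys)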

@keean Again, I do not clearly see why you need dependent types instead of simple runtime polymorphism (Sum Type, Variant Type)

@shelby3 wrote

Btw, the Pokemon example employing type family fails to solve the Expression Problem. New Pokemon types can’t be added independently.

The problem with the pokemon example is that Haskell is missing a Union Type with the mechanism to typecheck the value and monomorphize over it. They do this manually in this example, which is bad practice, especially since the associated type Winner is of kind *, meaning I could instantiate it not only with the pokemon or foe type, but also with any other type.

@shelby3 wrote

So again I ask @keean what the objection to dependent typing is, if all it means is to enable runtime polymorphism? Runtime polymorphism is necessary for modularity and solving the Expression Problem?

Totally Agree.

@sighoya wrote:

The problem is that the signature of eq changes in dependence on the input module. And this is unmodular, right?

@shelby3 wrote

No that is modularity. Multiple implementations of the same signature. (Also multiple signatures for the same implementation).

Hmm... Yes. I've formulated it badly. What I meant is that the user cannot see how the signature changes at design time and compile time (at least not always). Therefore I presented a possible solution in my previous post which should overcome this problem, at least in my mind.

keean commented

So again I ask @keean what the objection to dependent typing is, if all it means is to enable runtime polymorphism? Runtime polymorphism is necessary for modularity and solving the Expression Problem?

Dependent typing breaks some desirable type system properties like decidability. If the type system becomes a fully usable turing-complete programming language, then why have two different turing complete programming languages. However it might be less problematic than the alternatives.

keean commented

@sighoya

Again, I do not clearly see why you need dependent types instead of simple runtime polymorphism (Sum Type, Variant Type)

This is a complex question to answer, but it is along the lines of "I cannot clearly see why everyone cannot program in Lisp, as it is turing complete and you can express any program in it."

There is an existential question I'm always wondering about along those lines.

Say a module implements a factory, which will create a complex object and sub-objects based on a declaration.

Say, for example, a factory for a UI dialog, where controls can have different implementations (e.g. text field vs password field), but will be manipulated via a statically defined interface by the client program.

If the UI definition passed to the factory is statically defined in the client program, does that make the program statically or dynamically typed?

The parameters to the factory in that case would not be exactly what can be called a type signature, so the compiler can't really figure out which class is being instantiated, even if it knows the interface/base class. Well, in the absolute it could figure it out, but most likely it won't, even if the types of the classes are still statically determined given a static input to the factory.

And it would be virtually impossible to exactly type the whole UI dialog class, given that it can contain objects of many different types as members.

In cases where the module instances are clearly based on a dependent type system, it's even easier than this.

I wrote:

Since we know we can emulate typeclasses with modules per Modular Type Classes, we know that not all uses of modules are impossible to monomorphise. The key is really referential transparency. Because when we/the compiler know the code isn’t overwriting (generic references to) values with different types (which causes the requirement for the value restriction!), then the compiler can statically trace the type of the instance of a module to its use in a function and thus monomorphise. That is why C++ and other non-pure languages can’t do it.

[…] and if you don’t have referential transparency, then you probably cannot monomorphise them.

But referential transparency really is about top-down dependency injection. So in some cases we can top-down inject the types and monomorphise but in other cases the top-down imperative layer will lose static tracking of the types and thus we must punt to runtime polymorphism.

Afaics (or I posit that), with pure composition we can unify a single instance type with a monomorphisation of each use of the polymorphic signature type, because there are no loops or other referential opaqueness which would otherwise prevent reasoning about which instance type is in use at each use of the polymorphic signature type. IOW, referential transparency facilitates more monomorphisation.


@sighoya wrote:

[…] dependent types are function types where the Return Type is dependent on the values of the function parameters

That is one use case as I had shown in 2012 (where partial application returns a function with the dependent type).

Yet I also showed there that the distinction is really about the fact that an abstract type is an output type and it can otherwise be achieved with the family polymorphism of type parameters which are input types, as @keean has also pointed out.

Thus the distinction between dependent types and parametric polymorphism is that the former is a (bottom-up, coinductive) output type of values and the latter is a (top-down) input type of static inductive code structure. However, afaics with referential transparency they’re both monomorphisable, because the former can be unified as aforementioned. Without referential transparency, the first-order occurrences of the latter can still be monomorphised, because the type parameters are propagated statically through the inductive code structure regardless.

Again, I do not clearly see why you need dependent types instead of simple runtime polymorphism (Sum Type, Variant Type)

I think there’s a conflation of concepts in our discussion. So let’s try to clarify now.

An abstract type contained within a module/record is a means of polymorphism. Because the polymorphism applies to values, we classify it as dependent typing. Afaics (or let’s say I posit that), whether it’s less monomorphisable than parametric or adhoc polymorphism depends on whether the code is referentially transparent.

Polymorphic variants (aka row polymorphism ← click for our past discussions; and distinguish OCaml’s variants from polymorphic variants) also lose monomorphisation analogously to abstract types, but the underlying type that represents the ignored rows is not involved in the algorithm where the polymorphic variant is employed (because by definition the differing rows of the record for the variants are ignored). Thus unlike the arg1key.ValueType I showed in my 2012 example, there’s no dependence on knowing the actual types of the variants (except maybe for the extra generic row variable representing the rest of the rows?). With abstract types, the algorithm of the entire module depends on each instance type of the abstract type.

Apparently row polymorphism (and structural typing) have a performance disadvantage compared to ML modules (presuming they can’t be monomorphised) in that they require a hash of the row methods instead of a vtable pointer indirection, because each instance can have the rows in different orders or slices of the object. Whereas, for modules, the entire module is a vtable pointer indirection for the dictionary of the module’s static interface, although that might not be true when the signature that the functor implements is a partial capture of the instance’s rows; perhaps the functor can specialize to a unified vtable format at the time of the instance construction? Type inference apparently (cf. also) fails generally for both (i.e. no principal types), but at least annotating ML functors grants HKT, though the HKT are non-unified and a boilerplate mess (although I’m hoping to avoid the need for HKT entirely).

Here is a concise summary of different kinds of polymorphism. Afaics, modules, row polymorphism, and subtyping are subsuming: the generic type can have fewer members/rows than the instance. Parametric and adhoc polymorphism are not.


@keean wrote:

Dependent typing breaks some desirable type system properties like decidability. If the type system becomes a fully usable turing-complete programming language, then why have two different turing complete programming languages. However it might be less problematic than the alternatives.

Oic. Well apparently same issue for polymorphic variants aka row polymorphism.

What form of statically open (unbounded) runtime polymorphism doesn’t break decidability? Existential types (aka Rust’s trait objects)?

For statically bounded runtime polymorphism, note the new research on algebraic subtyping with all open possible unions in the ground types doesn’t break decidability and would probably work if we don’t need HKT.

Is “fine-grained” modularity an undesirable goal?

In general, attempting to modularize n deeply intertwined abstractions and data types will require 2^n - n additional libraries to cleanly represent instances and combinators that depend on overlapping functionality.

If we have 10 modules, for example, that means more than 1,000 libraries! Modern development stacks, all the way from Github to Scala to the JVM, were never designed for such fine-grained dependencies. Ignoring this fact imposes astronomical costs on development that far outweigh the perceived benefits.

Not all of these perceived benefits are actually benefits, either. For example, users never want a recent version of one part of the FP standard library (e.g. optics), but an older version of another part of the FP standard library (e.g. ct). Because of the interdependencies, such cherry-picking doesn’t make sense.

Modules can actually slow down development and releases, and make the creation of globally consistent version sets very difficult and rare. With 1,000 modules all independently versioned and released, how difficult would it be to find a set of versions that all work with each other? You probably know the answer to that one from experience!

@shelby3
Modularity is not the only paradigm you should pursue.
Note: modularity and locality are two paradigms which are mostly contrary to each other.

@NodixBlockchain wrote

There is an existential question I'm always wondering about along those lines.

Say a module implements a factory, which will create a complex object and sub-objects based on a declaration.

Say, for example, a factory for a UI dialog, where controls can have different implementations (e.g. text field vs password field), but will be manipulated via a statically defined interface by the client program.

If the UI definition passed to the factory is statically defined in the client program, does that make the program statically or dynamically typed?

A statically (dynamically) typed language remains a statically (dynamically) typed language.
But I assume you mean whether it becomes dependently typed? Well, it depends on the type signature of the factory interface (as far as I can tell). If you see an associated type as a Return Type, then it depends on whether the associated type changes in dependence on the passed module value(s).

Maybe you can present a code example for better clarification.

@NodixBlockchain

You speak about dynamic loading, right? Afaict, this is done via a void pointer in C/C++, and void* is used in C for runtime coercion/runtime polymorphism. I believe java.Object is defined on top of it.

But it has not always been the case. Look at dlang, where you can read files and mixin the yielded string as code (never tried it). This requires that the compile-time functions aka templates are (semi) dynamically typed at compile time.
Source 1
Source 2

It's not necessarily with void pointers, even if MFC probably uses them.

It can be like

// common.h

#include <cstddef>
#include <cstdio>

class Interface
{
public:
    virtual void print() = 0;
};

class dynenv
{
public:
    Interface **MyInstances;

    dynenv(size_t num)
    {
        MyInstances = new Interface *[num];
    }
};

class myfactory
{
public:
    static dynenv *build(int *types, size_t num);
};

//module.cpp

#include "common.h"

class class1 : public Interface
{
public:
    void print()
    {
        printf("class1");
    }
};

class class2 : public Interface
{
public:
    void print()
    {
        printf("class2");
    }
};

dynenv *myfactory::build(int *types, size_t num)
{
    dynenv *MyEnv = new dynenv(num);

    while (num--)
    {
        switch (types[num])
        {
            case 0: MyEnv->MyInstances[num] = new class1(); break;
            case 1: MyEnv->MyInstances[num] = new class2(); break;
        }
    }
    return MyEnv;
}

// main.cpp

#include "common.h"

int types[]={0,0,1};
int main()
{
   dynenv *MyEnv;

   MyEnv = myfactory::build(types,3);

}

Then MyInstances[0] can safely be cast to class1.

Like

class1 *MyClass1 = (class1 *)MyEnv->MyInstances[0];

Because "types" is defined statically.

It can be made with templates in the factory too, to build different objects with more complex code, or be more complex than that.

So that you can have

class staticenv
{
public:
    class1 *Instance1;
    class1 *Instance2;
    class2 *Instance3;
};

staticenv *env=new staticenv();

env->Instance1=(class1 *)MyEnv->MyInstances[0];
env->Instance2=(class1 *)MyEnv->MyInstances[1];
env->Instance3=(class2 *)MyEnv->MyInstances[2];

And then use env as if it was statically typed.

MFC is more complex because many things go through the message proc and each instance has an ID mapped to a class and message handlers.

So there are more tricks than this, but the result is the same.

keean commented

@sighoya Personally I think reading files at compile time is a bad idea. Compiling should translate language A into language B without executing anything in language A.

By dynamic loading I mean loading code at runtime, like loading tool plugins for a painting program. We want to be able to distribute binary plugins, but the main program expects the plugin to conform to a certain type signature. We should be able to compile the main program and the plugin separately (on different computers at different times). We need to make sure both the main program and the plugin conform to the same version of the interface. We can do this by cryptographically signing the plugin with a hash of the binary and a hash of the interface after type checking. The main program can then check the signature when loading.
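
A minimal sketch of that load-time check, assuming the plugin ships with a manifest recording the two hashes (the names are hypothetical, and OCaml's Digest is MD5, standing in here for a real cryptographic hash and signature scheme):

(* Hypothetical load-time check: the main program recomputes both hashes and
   compares them with the values recorded when the plugin was signed. *)
let interface_hash source = Digest.to_hex (Digest.string source)
let binary_hash path = Digest.to_hex (Digest.file path)

let plugin_ok ~expected_iface ~expected_bin ~iface_source ~plugin_path =
  String.equal (interface_hash iface_source) expected_iface
  && String.equal (binary_hash plugin_path) expected_bin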

@sighoya wrote:

Note: modularity and locality are two paradigms which are mostly contrary to each other.

I want to emphasize the burden of “fine-grained” (e.g. intertwined/tangled) modularity from my prior post. Thus I’m pondering whether modules should be first-class. I’m studying Backpack. Also I had posited that perhaps the functionality gained by polymorphic variants can be modeled with typeclasses (aka adhoc polymorphism)? (see link in what I had written before as quoted below)

Polymorphic variants (aka row polymorphism ← click for our past discussions

Adhoc polymorphism is not first-class. So not only do I think we need to minimize the interaction of concepts (especially first-class concepts), I think we shouldn't unnecessarily have more ways to do the same thing and exceed the Principle of Least Power.


@keean wrote:

Personally I think reading files at compile time is a bad idea.

Huh??? Do you mean compiling while the program is running and plugging in the code is bad?

By dynamic loading I mean loading code at runtime, like loading tool plugins for a painting program. We want to be able to distribute binary plugins, but the main program expects the plugin to confirm to a certain.type signature.

Agreed about typechecking. But why is it necessarily bad to compile JIT?

keean commented

@sighoya

Huh??? Do you mean compiling while the program is running and plugging in the code is bad?

Yes, I don't want to ship the compiler, type-checker and linker with production code. It takes a lot of memory, or means you have to install some environment onto the machine before you can use it (like the JVM).

I want a language that produces small binaries. If I write a program like this:

let a = readInt
let b = readInt
print string(a + b)

I would expect the machine code output to look something like this:

call readInt
move r0, r1
call readInt
add r0, r1, r2
push r2
call string
push r0
call print

plus the definitions of the functions actually used from the standard library.

It’s also about predictability: not only is the JIT a big chunk of extra stuff to distribute, it’s also another potential source of errors.

I think it is the CPUs job to do runtime optimisation by speculative execution, branch prediction, instruction re-ordering etc.

@shelby3 wrote:

Also I had posited that perhaps the functionality gained by polymorphic variants can be modeled with typeclasses (aka adhoc polymorphism)?

Typeclasses provide adhoc polymorphism just as method overloading does. The difference is that typeclasses give the overloading of some function a name/sense, which is important for those functional guys.
Further, typeclasses provide grouping, i.e. types which implement a typeclass belong to some group, and each type in this group can safely be passed to the same bundle of methods specified in this typeclass.

@shelby3 wrote:

Polymorphic variants (aka row polymorphism ← click for our past discussions

Polymorphic variants subdivide into open record types (row polymorphism for records) and open sum types (row polymorphism for sum types, aka data in Haskell).
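
OCaml happens to have both flavours built in, which makes the split concrete (a sketch): object types give open records and polymorphic variants give open sums, with the rest of the row left polymorphic in both cases.

(* Open record (row polymorphism over objects): any object with at least a
   name method is accepted; the rest of its row stays polymorphic. *)
let greet x = "hello " ^ x#name

(* Open sum (polymorphic variants): no nominal declaration is needed, and the
   inferred type [< `Circle of float | `Square of float ] -> float lets this
   function consume values built from just these tags. *)
let area = function
  | `Circle r -> Float.pi *. r *. r
  | `Square s -> s *. s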

Adhoc polymorphism is not first-class.

Why?
Method overloading can be seen as row polymorphism on the parameter-return tuple (the last element of this tuple is the return type), where each overloading represents a subtype which could also be generic.

Further, you can treat typeclasses as first class in your language (abstracting over them and so on)

@shelby3 wrote:

So not only do I think we need to minimize the interaction of concepts (especially first-class concepts), I think we shouldn't unnecessarily have more ways to do the same thing and exceed the Principle of Least Power.

What are concepts as opposed to typeclasses?

Further, I thought @keean and you don't want typeclasses in favor of modules?

keean commented

@sighoya I like type-classes because they allow you to define an abstract algebra. I don't like them because they are anti-modular. Perhaps modules + implicits is the best way to get type-classes that are also modular?

@keean wrote:

I think it is the CPUs job to do runtime optimisation by speculative execution, branch prediction, instruction re-ordering etc.

You recently wrote in these issue threads that maybe CPUs will have to abandon speculative execution because of the recent SPECTRE vulnerabilities, which seem inherently impossible to guard against if the CPU is highly non-deterministic. Although I remarked that neither of us is an expert in that specialized field (although I’m sure you know a significant amount about it, given you’re an electrical engineer by formal training).

Huh??? Do you mean compiling while the program is running and plugging in the code is bad?

Yes, I don't want to ship the compiler, type-checker and linker with production code. It takes a lot of memory, or means you have to install some environment onto the machine before you can use it (like the JVM).

I disagree. Security requires compilation be done on the client. Even Ken Thompson’s Trusting Trust paper applies, but you and I had argued this point about capability security before, where you argued there’s no security with capabilities, but I argued for application-level sandboxes. Perhaps you argued (or would argue) for hardware sandboxes (virtual machines), but that is too coarse-grained.

Embedded is a special case, but even this will require sandboxes for hot updates over the wire with IoT.

Sorry I think your view is that of a dinosaur. Afaics, that ship sailed already. There’s no going back.


@sighoya wrote:

Typeclasses provide adhoc polymorphism just as method overloading does. The difference is that typeclasses give the overloading of some function a name/sense, which is important for those functional guys.
Further, typeclasses provide grouping, i.e. types which implement a typeclass belong to some group, and each type in this group can safely be passed to the same bundle of methods specified in this typeclass.

Typeclasses also enable late (i.e. modular) binding of interfaces to types. Afaics, they’re also a partial solution to Wadler’s Expression Problem, with unions and immutability completing the solution. Immutability is required because unions introduce subtyping/subsumption into the typing system, but the new algebraic subtyping makes this sound and decidable by moving all the unions (all possible types) into the ground types instead of bounding all types by the top and bottom types.

As @keean points out, they also provide an abstract algebra because every type can only be implemented one way for each typeclass (although checking the implementations against axioms seems implausible, although Scalaz 8 claims to aim for this).

Polymorphic variants (aka row polymorphism ← click for our past discussions

Polymorphic variants subdivide into open record types (row polymorphism for records) and open sum types (row polymorphism for sum types, aka data in Haskell).

@keean had proposed to unify records, sums, and product types. I think row polymorphism just means conceptually a structure that implements a signature which doesn’t have all the members of that structure. If sums and records were unified, then ML functors could create “row polymorphism”. Am I missing something?

Adhoc polymorphism is not first-class.

Why?

Well we might argue that existential types (aka Rust’s trait objects) with a typeclass bound are first-class because they can be passed around as values, but afaik as @keean and @skaller have explained in the past, typeclass bounds aren’t types. They’re interfaces. They are orthogonal to the typing system. The unification of the types of the program can afaik be done orthogonal to the unification of[instance selection for] the typeclass bounds of the program.

First-class features can wreak havoc on type inference, soundness, and decidability of the type system. Typing systems become much more complex/problematic moving away from the origin on the Lambda Cube.

@keean you’re encouraged to correct me if this is an incorrect response.

Further, you can treat typeclasses as first class in your language (abstracting over them and so on)

No, afaics there’s no way to make a type which is a typeclass. The typeclass bounds always go in the requires clause (or in Haskell, before the =>) and never where the types appear.

Further, I thought @keean and you don't want typeclasses in favor of modules?

Afaik, we don’t have consensus on that yet. We’re analyzing tradeoffs and requirements.


@keean wrote:

@sighoya I like type-classes because they allow you to define an abstract algebra. I don't like them because they are anti-modular. Perhaps modules + implicits is the best way to get type-classes that are also modular?

Can we enumerate the ways they aren’t modular so we have coherent summary here?

Afaik, the issues are:

  • Open type families (← blog post written by the current maintainer of Backpack) are not modular. Refer back to the discussion linked from the OP for where I cited the LtU discussion between @rossberg (co-creator of 1ML and MixML) and @keean on that topic. Perhaps we use abstract types instead (which are inherently open and modular), but does that mean ML functors and emulating typeclasses as in Modular Type Classes?

  • Overlapping instances with the decentralized (open) nature of modularity. However, I have proposed that any instance declared in the module where the typeclass (that it implements) was declared, or secondly any instance declared in the module where the datatype (it is implementing) was declared, orphans all other instances. Those instances are always exported so the compiler must eagerly check they don’t overlap. If we also allow instances defined in their own module, but only if not overlapping (and/or have a means to choose which of the overlapping instances is used globally, such as naming it explicitly in some metadata file), then we have to transitively name them or export all instances and eagerly detect overlap. Otherwise name the instance you want to use and inject it by name, but this breaks recursive inferencing and the abstract algebra. Afaics, this seems to mostly solve this issue (a sketch of the overlap hazard follows below).
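
As a concrete illustration of the overlapping/orphan-instance hazard in the bullet above, here is a schematic sketch (hypothetical module, class, and type names; four separate files, not one compilable unit): each orphan module compiles in isolation, and the conflict only surfaces in a client that imports both, which is why eager overlap checking (or confining instances to the class’s or the datatype’s module) is needed.

-- PrettyClass.hs owns the class
module PrettyClass where
class Pretty a where pretty :: a -> String

-- Point.hs owns the data type
module Point where
data Point = Point Int Int

-- OrphanC.hs owns neither (an orphan instance)
module OrphanC where
import PrettyClass
import Point
instance Pretty Point where pretty (Point x y) = show (x, y)

-- OrphanD.hs is another orphan, perhaps in an unrelated package
module OrphanD where
import PrettyClass
import Point
instance Pretty Point where pretty _ = "a point"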

keean commented

@shelby3

You recently wrote in these issue threads that maybe CPU will have to abandon speculative execution because of the recent SPECTRE vulnerabilities, which seem inherently impossible to guard against if the CPU is highly non-deterministic.

These kind of optimisations have to be done in the CPU, I think by redesigning the cache to separate data by security ring and process ID this can be fixed.

I disagree. Security requires compilation be done on the client. Even Ken Thompson’s Trusting Trust paper applies

Of course it doesnt. Java shows how this is a broken model.

Security needs to be a runtime capability based process enforced by hardware. The hardware is much more sophisticated these days, for example lightweight containers like docker.

Sorry I think your view is that of a dinosaur. Afaics, that ship sailed already. There’s no going back.

This is nonsensical, how has the ship sailed? What language is compiled on the client? There are plenty of interpreted languages like python and JavaScript, but unless you build Linux distributions from source, you are always dealing with pre-compiled binaries.

Can we enumerate the ways they aren’t modular so we have coherent summary here?

The main way is that they are global, which is true for an abstract algebra, because we want operators to mean the same thing wherever they are called from.

Can we enumerate the ways they aren’t modular so we have coherent summary here?

The main way is that they are global, which is true for an abstract algebra, because we want operators to mean the same thing wherever they are called from.

That doesn’t speak to which specific issues prevent them from being modularized. I was enumerating the specific issues.

I don’t understand what of value can be gained from your comment? What practical aspects am I supposed to infer from what you wrote?

You recently wrote in these issue threads that maybe CPU will have to abandon speculative execution because of the recent SPECTRE vulnerabilities, which seem inherently impossible to guard against if the CPU is highly non-deterministic.

These kind of optimisations have to be done in the CPU, I think by redesigning the cache to separate data by security ring and process ID this can be fixed.

I repeat that neither of us is sufficiently expert to know. But I presume timing attacks can’t be entirely isolated without preventing the sharing of cache between processes, because side-effects can be transmitted in very subtle ways. I read/skimmed the white papers cited by the SPECTRE paper. You could refer, for example, to all the trouble Daniel Bernstein underwent just to remove timing attacks from one highly structured cryptographic algorithm, curve25519. Now imagine an unbounded diversity of algorithms. I think you were more likely correct when you formerly surmised that they’ll have to rip out all of the non-deterministic speculative optimizations in CPUs. Or that processes will need to run completely isolated on their own core+cache and communicate over very high latency message channels. Yet there are not enough cores to run all the processes. And dedicating an entire core to a process is also a performance hit. So maybe just ripping out the speculative execution will scale better.

Even Ken Thompson’s Trusting Trust paper applies

but unless you build Linux distributions from source, you are always dealing with pre-compiled binaries.

Did you even bother to understand what the Trusting Trust paper is about. You must rebuild from source else you can’t trust it. And you can’t even trust the compiler unless you build it from source. Distributing binaries is very insecure. We can trust our own trust when we recompile everything ourselves. Binaries don’t insure that the program was compiled faithfully.

I disagree. Security requires compilation be done on the client.

Of course it doesnt. Java shows how this is a broken model.

We already had this argument last year. I cited Joe-E which was designed for maximum security, which Java was not.

There’s no such thing as 100% security. That is the first principle any security expert will tell you.

A sandbox is the first line of defense, and it’s better than relying on Trusting Trust with no first line of defense.

And formal methods and correctness checking such as in Rust, may help improve the quality of sandboxes.

Security needs to be a runtime capability based process enforced by hardware. The hardware is much more sophisticated these days, for example lightweight containers like docker.

SPECTRE proved that such security is not 100%, even without factoring in human error and phishing.

Also as I wrote in my prior post, it’s too coarse-grained. Process-based isolation can’t do capability-based security.

Sorry I think your view is that of a dinosaur. Afaics, that ship sailed already. There’s no going back.

This is nonsensical, how has the ship sailed?

It sailed because of JavaScript and soon WASM. And because nobody wants to mess around with downloading and installing untrusted binaries, because it invokes too many opportunities to inject viruses and bad code. Managed sandboxes are not going away and will proliferate swallowing all forms of code distribution eventually. WASM is also all about exposing the low-level while maintaining a security model. Running unmanaged and shipping different binaries for each hardware target is a ship that has sailed and is never coming back. You’re simply incorrect. Sorry.

Whether sandboxes can unify around a higher-level language with even more security guarantees than WASM is an open question. Maybe the greatest common denominator of the PLs that are needed is at best something like WASM. I’m hoping something a bit more high-level would be plausible. I think experimentation on WASM will help the world figure that out.

What language is compiled on the client?

The entire point is that users want sandboxes with managed systems. Thus whether we compile on the client or not is irrelevant. The point is there’s nothing bad about compiling on the client. You were arguing about overhead of shipping compilers and managed systems, but that ship has sailed.

META: There you go again with personal insults. Is “nonsensical” a factual statement, or is it more inciting hatred while being incorrect? You could leave off the word “nonsensical” and still make the same point. At least I wrote “sorry”, meaning I wasn’t trying to unnecessarily slight you. I do think your view is outdated.

I’m going to break your preconceptions and cherished falsehoods.

For apps where performance and resources matter, they still ship binaries, like video games, or multimedia, even browsers and bitcoin. It's not going to run on managed code soon. Most of those are still made in c++.

Not all games need performance. My gf plays crosswords and candy crush on her mobile phone which aren’t performance critical. Twitter is a game. Steemit.com is a game. Etc..

Even for games that need for example 3D performance, they can increasingly offload performance needs on libraries such as OpenGL or Unity3D. Many real-time C++ games have been compiled to ASM.js (and soon WASM) to run in the browser. IOW, adding performance to games is more about parallelism than low-level semantics.

And the deployment to managed systems will inexorably increase because as the 150+ IQ literary genius Eric S. Raymond pointed out in his recent blog series analyzing Go vs. Rust, that even a GC language such as Go is now swallowing a significant chunk of the systems and server programming needs. Even Go doesn’t allow you to fiddle with the M:N green threads (goroutines) scheduler nor the GC, because its automation is sufficient for most/many use cases.

That’s why I’ve been telling you for the past year or more that your expertise in “close to the metal” knowledge was going to become less applicable to the mainstream as time goes on. Transition up the semantic layers or be a dinosaur.

Even having a VM or sandbox doesnt change much with security issue, it just move it into the VM.

This entirely misses the point I made in my prior post, which is that managed code has very high economies-of-scale and can be the best first-line defense possible. What else would you put as a first-line defense that can also do capability-grained security? Geez, you didn’t even comprehend what I wrote.

but even on android it's not that hard to root the device and there are many exploit to run out of the sandboxes, same for web browsers.

If it was so damn easy to break the security of web browsers, then why aren’t my Bitcoins stolen already? I keep my wallet on my computer. And I load many, many websites and PDF files. But I am not easily fooled by phishing.

Android’s security model is designed to protect against unintentional errors in app programming, not intentional malware. It was not designed as a hardened system against an app that desires to do evil. The Playstore exists to provide trust reputations on vendors. So together with that trust system, we want the security model to protect against unintentional mistakes of the developers.

Whereas, the browser loads websites that have no reputation or trust model, so the security model must be hardened.

Binaries can be signed by a trusted source, and even compiling the sources doesnt garantee there is no bug or problem in the source code.

Signing doesn’t guarantee that the developers didn’t accidentally inject a bug in a new release that creates a security issue. So the Android security model provides some protection against such security breaches. JIT V8 compiling on the client such as for JavaScript coupled with a hardened security model, is applicable where reputation and trust can’t exist because we visit new websites instantaneously.

All attempts at doing optimisation and management of performance / execution ordering etc. from compilers instead of the CPU have been failures.

Incorrect. Java Hotspot improves performance the longer the application runs. In fact, it can use higher-level knowledge that the CPU is oblivious to.

Didnt read much in depth about spectre etc but i don't see how sandbox or managed code can solve this.

They can’t. That is why the only solution will likely be process level isolation or abandoning speculative execution, both of which afaics will be a reduction in performance and battery life.

But that’s irrelevant to the point that process level security is not capability-level security. And that process level security should not be the only first-line defense.

keean commented

@shelby3 Please note WASM is a move back to pre-compiled code from JavaScript.

You are right that sand-boxes on the client are important (see my comments about docker), but I think you are wrong about people wanting to compile source code in the client.

@keean, afaics the point of this discussion was whether users have any reason to object to automatic compilation of code on their client. Clearly the ubiquity of JavaScript has demonstrated they don’t. Afaik users hate dealing with binaries. I hate dealing with binaries on my Linux system because of dependencies (especially those on Linux that need to be compiled); for example, versioning crap meant I was never able to figure out how to get GCC configured properly (Clang is working). I will probably have to build a new Ubuntu system to get it to work. So the point is that binaries also suck for users. Automated compilation with dependencies that always work is what users love.

The issue for users is convenience (and possibly compile times). One of the members of the Go team was talking about how the compiler is so fast that it blurs the distinction between interpreted and compiled. I’ve also read that the OCaml compiler is very fast. There’s also a discussion from the creator of D comparing how much slower and more rigor-mortis-afflicted C++ is (especially its build process). At 21:00 minutes they summarize Rust, Go, C++, and D.

Now I realize that some developers will precompile JavaScript to WASM in order to make load times faster, but this is probably justified because, I presume, JavaScript offers no additional security that WASM doesn’t provide. And this won’t preclude compiling some JavaScript to WASM client-side.

I think WASM will facilitate compiling various programming languages client-side, because there won’t be any need to configure the compiler differently for different computers.

Flawless portability is a huge factor in automation and user convenience. WASM is more important than people realize. But so far WASM doesn’t support GC nor goroutines natively.

(note I am very sleepy and can not give this the best reply due to being too groggy, may edit after I sleep)

keean commented

I don't think you would compile JS to WASM, as WASM has unmanaged memory. One of the big things to come out of Mozilla's Rust Browser project was that having two separate garbage collectors was inefficient. It turns out to be a lot better to pass all allocations through to the same GC. It may happen if WASM gets managed memory, allocated by the main JS GC. Otherwise it will be better for performance to just use normal JS, minimised and gzipped for fast download.

@shelby3 wrote

Afaik users hate dealing with binaries. I hate dealing with binaries on my Linux system because of dependencies (especially those on Linux that need to be compiled); for example, versioning crap meant I was never able to figure out how to get GCC configured properly. I will probably have to build a new Ubuntu system to get it to work. So the point is that binaries also suck for users. Automated compilation with dependencies that always work is what users love.

Instead, users should deal with source code? What about enterprises which do not want to allow others to see their code?

@shelby3 wrote:

No afaics, there’s no way to make a type which is a typeclass. The typeclass bounds always go in the requires (or in Haskell before the =>) clause and never where the types appear.

It depends how you see type classes. You can see one as a predicate on a type parameter p.
But there are other ways to see type classes:

1.) See typeclass C as a struct, then a module m (which is then a typeclass instance) has a typeclass as type:

fun(m:C)=m

2.) See a typeclass as a set of types (e.g. C = {Int, Float, ...}); then a typeclass could be a type of a type T.

fun(x:T:C)=x

3.) See a typeclass as a union of all its types (e.g. C = Int | Float | ...); then a typeclass could be a type of a value x.

fun(x:C)=x

Even in the case of the three alternatives, you could also define predicates on type parameters (via =>) if you want, i.e. (=>) : typeclass -> type -> Bool is a predicate which could be put into a where clause:

fun(x:T) where C=>T

Edit:
Note: the types I mention here (Int, Float, ...) "instantiate" the typeclass C. Whether "instantiate" means a type assertion (is-a relationship) depends on how you see typeclasses.

Edit 2:
As you might see in 2.) and 3.), typeclasses could be described as a row polymorphic record (= struct without order?) (C = {Thing1, Thing2 | t}); typeclass subtyping can then be naturally defined via row polymorphism.

Please refresh, made some edits...

keean commented

@sighoya Using structural typing and row polymorphism we could give a type-class a type, which would basically be;

forall a, b, r . {f : a -> b, g : b -> a | r}

This is actually a row polymorphic type that is a partial specification of a module signature. We could then declare a generic function:

f(?classname{f : a -> b | r}, a) -> b

Of course this is useless because it will match any unary function in any module in scope.

To make this work we probably need a name-kind (which we probably need to implement structural record types anyway) which I have given as "classname" above.

@keean wrote:

I don't think you would compile JS to WASM, as WASM has unmanaged memory.

I’ve read others imply that adding GC to WASM is on the todo list. Perhaps I also remember reading something about that on the official Github.

One of the big things to come out of Mozilla's Rust Browser project was that having two separate garbage collectors was inefficient. It turns out to be a lot better to pass all allocations through to the same GC. It may happen if WASM gets managed memory, allocated by the main JS GC.

They must add it to WASM else it can’t be an optimal target for GC languages.


@sighoya wrote:

Instead, users should deal with source code? What is with enterprises which do not want allow to see their code?

Do users deal with JavaScript code when they access a website. Is open source not killing closed source. Can compiled code not be reverse engineered/decompiled. IMO, these are not questions.

But there other ways to see type classes:

I also made an analogous argument (to the one you’re presenting) when this was discussed between @keean and @skaller, but I have focused in on what I believe to be the key (generative essence) distinction which is repeated below…


@keean wrote:

Using structural typing and row polymorphism we could give a type-class a type, which would basically be;

Or analogously as I wrote before, we can view ML functors as giving a type to a signature and we know we can emulate typeclasses with Modular Type Classes.

But am I correct that with Haskell’s typeclasses we can do the HM-like unification orthogonal to the resolution of the typeclass bounds? Because if I am, then that’s pointing to a conflation with the typing system if we emulate typeclasses with modules? Why would that be undesirable? My thought is that typeclasses as orthogonal to the type system simplifies the type system. Separation-of-concerns. By putting more of our needs on typeclasses and keeping the type system simpler, then we’re more likely to have less soundness and corner cases issues with the type system.


@NodixBlockchain wrote:

One thing that make js safer is no local fs access. For video games or multi media its not very practical.

Files can be emulated with persistent storage in HTML5 for caching/speed. The network is the file system. Note that per Nielsen’s law, network bandwidth (and even less so latency) doesn’t scale with Moore’s Law though. In general latency does not scale with Moore’s law, not even locally for memory (which was the basis of memory-hard hash functions). Does anyone know if network latency is scaling faster or slower than localized latency?

Still have to see good games in the browser with advantage over native binaries.

You’re still failing to comprehend what I wrote. The point is that games could be compiled automatically on the client and a performance-tilted, game-oriented sandbox could be created. That’s where we are headed. Mark this as a prediction.

For VR the leading engine is still unreal which is c++ with native binaries.

It will be disintermediated. I might even end up being the one who facilitates it with my decentralized ledger work.

With my system module binaries can be moved between Windows and linux ;-) the module binaries can be used on most intel linux and windows without recompilation.

And to be frank, I emphatically reject your approach. That is why I’m not interested in your machine-code-level framework and most of the points you push on the issues thread. It’s not a personalization. Just don’t agree with your technological focus.

I used many non bare metal systems too, java, as3,c#, vb, PHP, python etc.

Which all mostly suck. I mean they aren’t useless, but for example C# is another kitchen-sink language like C++. Everything is thrown in there, with no holistic pruning to the Principle of Least Power and separation-of-concerns. Note Haskell and probably Multicore OCaml don’t mostly suck (both of which have GC), but I think we can perhaps improve on those. Also Go doesn’t suck at scaling M:N (but as the D creator pointed out in the video I linked above, it sucks for modeling invariants due to the weak type system).

As for security, Erik Meijer reminds us that JavaScript is not a safe language:

Another thing is JavaScript arrays which are not really arrays, if you go out of bounds of an array, you add the elements.

Also somewhere in these issue threads I had pointed out that we not only need to fix JavaScript’s concatenation operator (which overloads the supposed to be commutative + with implicit conversions) but for security we need distinctly separate start and end quote marks (and make these difficult to type so they never occur in normal prose) so these can be properly nested to avoid injection attacks with eval (or even if compiled).

A friend works in a Company where they develop a server platform in java + web, they struggle a lot with performance, compared to adobe solution made more with native code.

That’s not surprising.

I still tend to think the going all To managed code and sandbox is more currently not that great. It was a big promise 10/15 years ago with managed code, GC, etc but i think there are still walls for ressources and performance for many applications.

The chances of you being correct and Eric S. Raymond (the progenitor of the open source movement) being incorrect, are pretty close to ~0 in my opinion. C.f. the WD-40 #35 thread.

The first wave of experiments for managed languages were not that good. That shouldn’t be misconstrued as the death and failure of managed languages. ESR is correct that they’re the future. That’s why I’m here discussing PL design to try to fulfill ESR’s implicit call-to-action w.r.t. to the future of programming languages. That is why for example, I recently invented the concept of a cordoned nursery with explicit reference counting otherwise, as hopefully a future experiment on eliminating the performance reduction of a write-barrier with a fast automatic nursery GC.

The future bifurcates into something like Rust with improved provable safety but very low-level and complicated, yet hopefully an improvement on C++ (note the typeclasses instead of templates). Or something like what I am designing in collaboration with everyone here (tentatively named Zero) to hopefully improve upon the concepts in Haskell, Multicore OCaml, and Go, providing hopefully a compelling balance between convenience, readability, correctness, and performance.

So I see you as useful in the collaboration because (at least) you play the role of the devil’s advocate, given you’re religiously against managed code. @keean afaics leans at times to more of a Rust-like solution (although he’s a connoisseur of programming languages of all types); whereas, I lean more towards GC but attempting to meet you guys mid-way with my cordoned nursery invention.

Many will still be looking for solutions that are more lightweight and don't need 3GB of framework and a week of setup, and better performance than, let's say, a java/c# servlet + web front end. And I don't think wasm will entirely solve this.

WASM is headed there. Rome wasn’t built in one day.

No 1 week to configure WASM. It’s already configured in your browser automatically. Economies-of-scale. Economics and game theory matters.

Most of the very Secure things still run with native code. Like database server, and all entreprise software for edition are still using mostly native code.

They have the economies-of-scale to be part of the managed platform itself. They will be coded with formal methods and linear types a la Rust.

@NodixBlockchain, I already wrote hardware process-level cordoned security is not capabilities security. Managed code can provide the latter. Security has varied perspectives. From the perspective of the application developer, yes maybe native can provide more security than relying on some poorly designed and buggy VM, but from the users’ and system protection perspective where the developer is not trusted or can make mistakes unwittingly, then managed code is an extra line of defense.

@NodixBlockchain wrote:

https://research.google.com/pubs/archive/34913.pdf

This paper show how its possible To run untrusted native code in sand boxed environnement.

  1. That paper describes a managed code system, e.g. it’s preventing certain APIs such as reading and writing files. Just because it’s using hardware faults to help implement its software protection mechanism doesn’t mean it isn’t a form of managed code.

  2. It’s not fine-grained. It is not protecting against capabilities based security threats from for example overwriting the end of an array and then reading from it unwittingly from within the same process. It’s not enforcing the same-origin policy of the browser, etc..

@keean and @shelby3
What do you think about Named Type Class Instances where each typeclass instance has to be "opened" (with "using" for instance) first in order to dispatch on it, and if you want, you could specify an alternative dispatch by providing the name of the instance explicitly.

This should happen all at compile time, so no dependent typing at runtime like for modules.

keean commented

@sighoya I think that's a reasonable solution, but it may be better to go for modules and functions with implicit parameters. I'm not sure.

@NodixBlockchain wrote:

My approach on this is different, as i prefer To have function code using
explicit type, but parameters like variant whose value can be converted To
the type that the function uses.

It's why i would prefer to have function code made with explicit type, with
parameters that can be converted by the function depending on what type it
uses for its computation.

It sounds a bit like "explicit conversion/casting". You have a function which expects a Float as argument and you pass an Integer, Rational or a Complex Number instead, which gets converted with some custom algorithm through typeclass instantiation.

Even though you use static typeclass instantiation to achieve this, you get runtime costs to convert values to a custom type. Mostly it is better to fit the type of the computation function to the other desired types directly instead of relying on casting/conversion algorithms.

@NodixBlockchain wrote:

Then eventually defining interfaces using type class to specify which type
can be converted to the type used by the function.

For composed types, i would easily go for a system of traits like rust, to
have interfaces who can take any class that has the required members as
parameters.

Trait is another name for typeclass, though most implemented trait systems do not achieve the power of Haskell's typeclass system.

I see you want to use typeclasses but how do you solve:

1.) the global uniqueness problem
2.) to have multiple instances of the same type

@NodixBlockchain wrote

Like this the message loop code can remain 100% generic and works with a
system of registering modules at runtime to handle certain type of messages
based on the abstract type.

If you use modules you have to deal with dependent typing or did you mean something different?

keean commented

@NodixBlockchain sounds like implicit casting to me, which I think is a bad idea. The implicit conversion in C++ is one of the things that make overloading and specialisation complex and hard to understand. Personally I prefer to have explicit conversions even between integers and floats.

@keean wrote:

@sighoya wrote:

What do you think about Named Type Class Instances [Named Instances for Haskell Type Classes by Wolfram Kahl & Jan Scheffczyk] where each typeclass instance has to be "opened" (with "using" for instance) first in order to dispatch on it, and if you want, you could specify an alternative dispatch by providing the name of the instance explicitly.

I think that's a reasonable solution, but it may be better to go for modules and functions with implicit parameters. I'm not sure.

  1. W.r.t. the desire to be able to have different Orderings, perhaps a more coherent design would be to allow typeclasses to inherit from other typeclasses, so that ReverseOrdering, ForwardOrdering, and No38thPrimeOrdering can all inherit from (aka implement) Ordering. So if you want to dictate a more specific Ordering, then wrap the call to the function that takes an Ordering in a function which takes the specific kind of Ordering you want. This is my new idea (as of today), and I don’t know if there is prior art? Doesn’t Haskell already have some form of inheritance of typeclasses or instances? The advantage of this over using-style implicits is that functions compose but using does not. Also keeping the selection in function signatures is more coherent. Also only allowing selection which conforms to inheritance maintains an abstract algebra for the base typeclass. However, there might be a problem with different specializations of Ordering becoming mixed in an algorithm causing weird results? But that will happen also with any naming of instances.

  2. W.r.t. to the desire to name which instances are exported and force eager detection of overlapping instances (an eagerness that was avoided in general in Haskell by simonpj), note that forces (at least in the presence of type families which are unsound if not eagerly unique) a transitive naming which is quite anti-modular. See the last bullet point in my prior comments about this.

@sighoya wrote:

This should happen all at compile time, so no dependent typing at runtime like for modules.

I think to some extent this dependent-typing issue is overblown or its necessity not coherently explained in one concise explanation. As we discussed upthread, typeclass bounds can also be employed to create runtime dispatch with existential types (aka Rust’s trait objects). And in theory modular abstract types could be statically resolved when restricted from not being employed first-class and only on function bounds and/or could in theory be statically traced to function bounds if the code is pure (aka referentially transparent).

Open type families are anti-modular. There is a link to where @rossberg was making this point to @keean. IOW, type families of associated types are top-down dictated at function bounds, but since they are input types (i.e. top-down), they are anti-modular. Whereas, abstract types being output types (output by the instance of the modular) are modular. We get performance with static resolution but we get unsoundness due to modularity with static resolution if we can’t guarantee that all instances are non-overlapping. Monomorphisation is in tension with modularity. We have some design choices with tradeoffs. If we want static resolution of associated types of typeclasses in a modular context, then we need to adopt some mechanism for eagerly insuring that instances are non-overlapping.

META: I really, really wish Github had a feature where we could view an issue thread with some users’ posts filtered/removed from the view. At least as an option when wanting to focus on those posters of more interest (because for one reason Github will insert “Load more…” in the middle of the thread when the number of posts exceeds a couple hundred). I’m against censorship because all discussion is useful in some context, yet highly in favor of decentralized curation/moderation as the context varies for each user and situation of use. And as for the users here who stubbornly refuse to delete their “please refresh” posts thus polluting the thread, it’s not very indicative of being a meticulous coder with attention to detail.

keean commented

@shelby3 interface inheritance is already a thing with type classes, it's just a constraint on the typeclass, so:

class (Ordering a) => ReverseOrdering a where
   ...

In Haskell means ReverseOrdering inherits Ordering for all types 'a'.

You can also have instance constraints;

instance (Ordering (Array a)) => ReverseOrdering (Array a) where
   ...

Which means only the ReverseOrdering implementation for Arrays requires an Ordering implementation for Arrays, rather than requiring it for all types.

However whilst this allows you to define Reverse and Forward orderings that inherit the interface from Ordering, they would have to have some different function to call them (as they extend ordering, and they inherit the interface not the implementation). A different typeclass cannot overload the same function. Also as type classes are not first class there is no way to pass different ones into a "sort" function, they are always implicit.

What is the difference between abstract type and associated type?
Googling a bit led me to the same concept?

@keean wrote:

However whilst this allows you to define Reverse and Forward orderings that inherit the interface from Ordering, they would have to have some different function to call them (as they extend ordering, and they inherit the interface not the implementation). A different typeclass cannot overload the same function.

This explanation could be read as slightly ambiguous. Does a ReverseOrdering function bound refuse to accept an Ordering instance? Does an Ordering function bound accept a ReverseOrdering instance? If yes, then per my idea we can create a function sortRev which wraps sort and requires a ReverseOrdering, then calls sort.

Also as type classes are not first class there is no way to pass different ones into a "sort" function, they are always implicit.

As I explained, they can’t be first-class without violating the abstract algebra and requiring runtime dispatch.

My idea is that if you want to pass ReverseOrdering to sort, then wrap sort in a function. What is the disadvantage of that? We have to decide if we want first-class modularity or not. If we do, then we must have runtime dispatch. Should we have both?


@sighoya wrote:

What is the difference between abstract type and associated type? Googling a bit lead me to the same concept?

An abstract type is an output type. You do not feed it in from the top-down like a type parameter. The instance feeds it to the first-class use of the modular signature. That is why you need runtime dispatch but it enables modularity and violates an abstract algebra. An associated type is like a top-down specified type parameter of typeclasses that is not used in instance selection, i.e. the instances cooperate to provide non-overlapping (i.e. consistent) types for themselves and these can be resolved statically. This enforces an abstract algebra (at least for the base typeclasses?). See the explanation by @rossberg that I linked to in my prior post, which has examples.

keean commented

Let's say that 'Ordering' defines the '<' operator, ReverseOrdering cannot overload this, so we have to use a different operator like '>' for reverse ordering. If we write sort so that it uses the '<' operator for comparison it is automatically implied that it requires the 'Ordering' typeclass.

So typeclasses control overloading, and there must be a unique mapping from any overloaded function to a single typeclass. Otherwise typeclass resolution breaks down.

Remember 'first class's means we can pass it as an argument to a function. So the fact that type-classes are not first-class means we have no way to pass them to functions. The only way to use a type class is implicitly, IE you just use the '<' operator and the compiler infers that this needs the 'Ordering' typeclass.
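
A tiny Haskell illustration of that implicitness (the function name is hypothetical): no typeclass is named in the body; the bare use of < is what commits the function to the Ord constraint.

-- the signature could be omitted; GHC would still infer `Ord a => [a] -> a`
smallest :: Ord a => [a] -> a
smallest = foldr1 (\x y -> if x < y then x else y)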

Let's say that Ordering defines the < operator, ReverseOrdering cannot overload this, so we have to use a different operator like > for reverse ordering.

Oic, then it’s not providing for my idea. And probably the reason for that is because:

However, there might be a problem with different specializations of Ordering becoming mixed in an algorithm causing weird results? But that will happen also with any naming of instances.

This enforces an abstract algebra (at least for the base typeclasses?)

Your solution for sort orderings modularity with Haskell was to use the newtype hack?
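
For readers who haven’t seen the hack being referred to, a minimal sketch (the Reverse name is hypothetical; base’s Data.Ord.Down plays the same role): wrap the element type so that instance selection, rather than an extra parameter, changes the ordering seen by the same sort.

import Data.List (sort)

newtype Reverse a = Reverse a deriving (Eq, Show)

-- the wrapper's Ord instance flips comparisons
instance Ord a => Ord (Reverse a) where
  compare (Reverse x) (Reverse y) = compare y x

descending :: Ord a => [a] -> [a]
descending = map (\(Reverse x) -> x) . sort . map Reverse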


So typeclasses control overloading, and there must be a unique mapping from any overloaded function to a single typeclass. Otherwise typeclass resolution breaks down.

Afaics, that is only an issue if the requires … (aka afaik (…) => in Haskell) function bound is implicit. The more fundamental reason for the restriction is probably what I wrote above.

Remember ‘first class’s means we can pass it as an argument to a function. So the fact that type-classes are not first-class means we have no way to pass them to functions. The only way to use a type class is implicitly, IE you just use the < operator and the compiler infers that this needs the Ordering typeclass.

How could you possibly think I do not know that given that is essentially what I have already written and implied in my posts today. Perhaps you’re just clarifying for other readers.

keean commented

If you want to control the ordering from outside the sort function then you have to pass something in to do so. You could pass the comparison function as an argument, you could pass an ordering module (with first class modules) or you could redefine an effect handler, which would get passed as an implicit parameter.

This means that scoped type-classes could be a solution, but I'm not sure I like it.

If you want to control the ordering from outside the sort function then you have to pass something in to do so.

My idea of wrapping in a function also would work if supported, because typeclass bound resolution is hierarchical. But as you implied, this would require explicit requires bounds on functions to differentiate between instance method names, and it would have the capability of corrupting the abstract algebra (but all injection of varying instances will be able to do that).

You could pass the comparison function as an argument

That breaks the generic ability of instances to implement the ordering function, which is the entire point of typeclasses. IOW, that kills genericity.

you could pass an ordering module (with first class modules)

Which breaks inferencing that makes typeclasses so elegant and breaks monomorphisation of associated types (have to employ abstract types). Whereas, afaics my idea provides essentially the same functionality without breaking inferencing and monomorphisation.

I mean without breaking the inferencing of which instance is selected, i.e. instance inference is preserved with my idea. My idea does break inferences of the requires function bounds, but I actually prefer the annotation.

you could redefine an effect handler, which would get passed as an implicit parameter
This means that scoped type-classes could be a solution, but I'm not sure I like it.

Implicits (using) don’t functionally compose (and probably also create other weirdness such as the experience with Scala’s implicits). They also don’t prevent overlapping instances (thus violating the abstract algebra consistency and unsoundness with type families) unless the transitive eager anti-modularity is employed. I had written about this in one of my posts today.

keean commented

@shelby3

explicit requires bounds on functions to differentiate between instance method names

I tried to explain that you cannot do that with type-classes you can only check that the explicit bound is the correct one for the functions used.

You cannot define two type-classes with the same function.

Of course we can imagine that you could implement this by passing a different dictionary, but the mechanics of constraint resolution would be different. I'm not saying it cant work, but I am saying what you have would no longer be a type-class. I am also not saying it definitely would work either :-)

keean commented

@NodixBlockchain implicit conversion makes overloading and specialisation more complex and difficult to understand.

JSON is just a string, it needs to be converted to values by a parser, that parser could do any conversion you wanted, you don't need to embed implicit conversions into the language semantics.

I tried to explain that you cannot do that with type-classes you can only check that the explicit bound is the correct one for the functions used.

I’m explaining my idea, which is a modification. It’s perplexing to me why you can’t realize that from what I wrote. Maybe you’re in a rush multitasking, or it’s late in the evening (10pm) near your bedtime.

I think I clearly explained that all the ideas for how to vary the sort ordering are a threat to the consistency of the abstract algebra. Either we forsake such functionality, resort to some newtype hackery, or we dig down into the possibilities (which is what I am doing).

Of course we can imagine that you could implement this by passing a different dictionary, but the mechanics of constraint resolution would be different.

Incorrect on the implication of a runtime dictionary if that was your implied intent. I already explained it would be monomorphised (except in those cases that existing typeclass bounds can’t be monomorphised such as for HRT callbacks and existential types).

The monomorphic instance selection constraint resolution might be different. Pondering that and whether that would break the consistency of the abstract algebra.

I'm not saying it cant work, but I am saying what you have would no longer be a type-class. I am also not saying it definitely would work either :-)

I think you should slow down and think first before insinuating that someone’s new idea is incorrect in some way. I am busy pondering about the cases where it is still exactly a typeclass (in that it preserves the consistency of the abstract algebra) and the cases where it does not.

After discussing this with @keean in private messaging, I realized I had omitted some key details about my idea, which caused him (and probably other readers) to have a misunderstanding about what I was proposing.

@keean was thinking that if there are typeclasses ForwardOrdering, ReverseOrdering, No38thPrimeOrdering that inherit from and can implement the same < method of the Ordering, then a sort function which contains a call to an < method would have ambiguous resolution, because instances implementing all of those aforementioned would also implement Ordering.

I clarified that my idea would disallow structured inference of the typeclass bounds and instead the sort function would have to be annotated explicitly with requires Ordering and sortRev explicitly annotated with requires ReverseOrdering.

Additionally I stipulated that my idea requires that any call (aka application) to sort that is not contained within a function with an explicit typeclass bound that inherits from Ordering, would only be able to supply implementation instances of Ordering and not instances of ForwardOrdering, ReverseOrdering, and No38thPrimeOrdering. Thus a sortRev with an explicit requires ReverseOrdering typeclass bound can call sort with a ReverseOrdering instance.

So if you call sort not inside a function with requires ForwardOrdering, then you get an Ordering instance selection implementation. But if you call sort inside a function with requires ForwardOrdering, then you get a ForwardOrdering instance selection implementation.
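
Here is a sketch in today’s Haskell of how far the idea can and cannot be expressed (Ord', RevOrd, lt, sort', sortRev are hypothetical names): the interface inheritance and the wrapper function already type-check, but the re-binding of the base method inside the requires RevOrd scope is exactly the part that Haskell’s coherence rule forbids, i.e. the delta being proposed.

class Ord' a where
  lt :: a -> a -> Bool

-- the "subtypeclass": it inherits the interface of Ord'
class Ord' a => RevOrd a

-- analogue of an explicit `requires Ordering`
sort' :: Ord' a => [a] -> [a]
sort' = foldr ins []
  where ins x []     = [x]
        ins x (y:ys) = if lt x y then x : y : ys else y : ins x ys

-- analogue of an explicit `requires ReverseOrdering`; legal since RevOrd implies Ord'
sortRev :: RevOrd a => [a] -> [a]
sortRev = sort'
-- In Haskell today sortRev still uses the single Ord' instance for `a`;
-- under the proposal the call to sort' inside a RevOrd-bounded function
-- would instead select the RevOrd implementation of lt.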

Thus I think this resolves @keean’s criticisms. I await his analysis of this idea to see if he or I can think of any flaws. I am pondering whether existential types with a typeclass bound of Ordering could be a flaw in my idea? Will sleep and comment more on that after I awaken.


EDIT: in private @keean wanted something similar to my idea but he did not want to declare which typeclasses inherit from Ordering. But the problem then would become that even if sort requires ForwardOrdering, a sortRev that requires ReverseOrdering could still call sort. Which is broken semantics and broken modularity.

@keean said he didn’t like classes of typeclasses because he felt it would form a hierarchy and I said OOP is a different concept. He cited this:

http://raganwald.com/2014/03/31/class-hierarchies-dont-do-that.html#the-semantic-problem-with-hierarchies

But that is an argument about objects. Typeclasses are not objects, they’re interfaces and only at function bounds (and I already said my idea will not work correctly with existential types aka Rust’s trait objects). Also subclassed objects have rigid hierarchies that can’t be modified after the object instance is created. Whereas, the hierarchies I have proposed for typeclasses are open and can be mixed, matched, and created at any point in the program for preexisting object instances.

http://raganwald.com/2014/03/31/class-hierarchies-dont-do-that.html#superclasses-are-not-encapsulated

To the extent that typeclasses do not have that problem, my idea does not add that problem.

The rectangle.setWidth/setHeight versus its subclass square.setWidthHeight problem is also present with typeclasses.

An additive semigroup is both a ring and a Monoid.

What part of the following statement I made was not clear or is inapplicable/incorrect?

Also subclassed objects have rigid hierarchies that can’t be modified after the object instance is created. Whereas, the hierarchies I have proposed for typeclasses are open and can be mixed, matched, and created at any point in the program for preexisting object instances.

@shelby3 wrote:

After discussing this with @keean in private messaging, I realized I had omitted some key details about my idea, which caused him (and probably other readers) to have a misunderstanding about what I was proposing.

@keean was thinking that if there are typeclasses ForwardOrdering, ReverseOrdering, No38thPrimeOrdering that inherit from and can implement the same < method of the Ordering, then a sort function which contains a call to an < method would have ambiguous resolution, because all of those aforementioned would be implementations of Ordering.

Your original idea, roughly summarized here, was to include inheritance semantics from OOP in type classes, which is not possible because type classes are more powerful in allowing multiple instances in the typeclass tree; even if you want to choose the most specialized version of sort, it could be ambiguous because the deepest leaf layer in the typeclass hierarchy could also contain the same type multiple times.

@shelby3 wrote:

I clarified that my idea would disallow structured inference of the typeclass bounds and instead the sort function would have to be annotated explicitly with requires Ordering and sortRev explicitly annotated with requires ReverseOrdering.

Additionally I stipulated that my idea requires that any call to sort that is not contained within a function with an explicit typeclass bound that inherits from Ordering would only be able to supply implementation instances of Ordering and not instances of ForwardOrdering, ReverseOrdering, No38thPrimeOrdering. Thus a sortRev with an explicit requires ReverseOrdering typeclass bound can call sort with a ReverseOrdering instance.

If you have sortRev for reverse ordering and sort for normal ordering then you eliminate the ambiguities by different name resolution so you don't need to use explicit requires.

But maybe you mean that you have some sort function which by default is only applicable to Ordering instances but can in some circumstances be applied to ReverseOrdering instances with an explicit requires.

This, in my eyes, is too complicated and reminds me that requires is just another form of using in the sense of typeclass specialization. So you could do the same with named instances instead of complex typeclass inheritance hierarchies.

But maybe I do not understand you...

@sighoya wrote:

Your original idea, roughly summarized here, was to include inheritance semantics from OOP in type classes, which is not possible because type classes are more powerful in allowing multiple instances in the typeclass tree; even if you want to choose the most specialized version of sort, it could be ambiguous because the deepest leaf layer in the typeclass hierarchy could also contain the same type multiple times.

Apology for creating misunderstanding with insufficient explanations of my idea up to now, but yours above as quoted is an incorrect summary of my original idea. Where I used the word ‘inheritance’, I only meant that for example that instances that implement a ReverseOrdering implement the methods of an Ordering. This change requires that we no longer can do structural typing on the < method in functions to infer the typeclass bound of a function (and instead must use an explicit requires …) so that instances of ReverseOrdering don’t overlap instances of Ordering (as I clarified in my prior post). I did not propose to change anything about the late binding advantage of typeclasses. ReverseOrdering is another typeclass, not an implementation instance. The implementation instances of ReverseOrdering implement the methods of an Ordering. My idea is a function definition scoping mechanism at the definition of a function’s typeclass bound which impacts the late binding of instance selection at the call (function application) site.

If you have sortRev for reverse ordering and sort for normal ordering then you eliminate the ambiguities by different name resolution so you don't need to use explicit requires.

We must use an explicit requires ReverseOrdering on sortRev and requires Ordering on sort, because otherwise instances of both of the said typeclasses would implement the < method calls in those functions making instance resolution ambiguous.

This, in my eyes, is too complicated and reminds me that requires is just another form of using in the sense of typeclass specialization.

The requires is annotating what inference would otherwise do when not ambiguous. Haskell’s typeclass design prevents ambiguity, thus it can, I presume, infer the requires structurally. My idea sacrifices inference of the typeclass bound at the function definition site in exchange for the ability to enable the desired flexibility.

Afaics, your using suggestion (i.e. the Named Type Classes) doesn’t compose functionally. The lexical scope of the using is not tied to a function definition, but rather to the lexical position of a function application (i.e. call site). And using is tied to named implementations of typeclasses; whereas, my idea is not selecting implementations (by name) but rather selecting from typeclasses (which are “subtypeclasses”, not OOP subclasses) of typeclasses. Thus afaics using legalizes the destruction of modularity by using incompatible named implementations; whereas,[I’m not quite clear if that paper isn’t allowing partial orders via type parametrised instances which the paper names “modules”, in which case it is similar to my idea and I may have conflated with the using in the Modular Type Classes paper] so far I cannot see how my idea would enable incompatible implementations as long as eager detection of overlapping implementations is employed per one of the ideas I already suggested for attaining modularity with typeclasses. Since my idea is making the subtypeclass selection at a function definition, that’s turtles all the way down the function application hierarchy (contained within the said function definition) so afaics there’s no way to introduce an incompatible implementation down that hierarchy. I’m leveraging the fact that function application composes and is hierarchical containment. The using paradigm doesn’t have the same attributes.

Existentially Quantified Types

The one way I can envision leaking incompatible implementations or losing some of the degrees-of-freedom in my idea is with existentially quantified types (aka Rust’s trait objects), because the existentially quantified type is first-class and can be passed into a function as an argument, yet the binding of type class requirements to objects (e.g. a collection of existentially quantified objects) is at the function call site and can thus change if there’s no one global abstract algebra for instances. So for example the existentially quantified type is the value type of a collection then in one function scope ReverseOrdering objects could be added to the Ordering existential type and in another function scope ForwardOrdering could be added to the same first-class collection instance for an Ordering existential value type — or the existentially quantified type is bound to one of the said subclasstypes so can’t interopt with the other said function scope. [Although Ordering isn’t even applicable to an existential type because how can you order a dynamically heterogeneous type, yet the point remains for non-multiparameter type class such as Monoid.] Perhaps we should just disallow subtypeclasses of typeclass bounds on existentially quantified types. Yet subtypeclasses bounds could be allowed on existentially quantified types no problem (which is apparently what the Named Type Classes paper adopted), which thus punts to said lack of interoperability.

@keean could you help me appraise Backpack creator Edward Z. Yang's (@ezyang) thesis criticism of typeclasses as being inadequate for modularity as follows quoted from CHAPTER 1. INTRODUCTION pg 2:

Unfortunately, type classes are ill-suited for certain cases of separate modular development:

  • From a code perspective, type class parametric code is often harder to use than monomorphic code. For an inexperienced Haskeller, the proliferation of constraints and type parameters in the type signatures of functions can make an otherwise straightforward API impenetrable:

    (=~) :: (RegexMaker Regex CompOption ExecOption source,
             RegexContext Regex source1 target)
         => source1 -> source -> target    -- from regex-posix-0.95.2
    

Is his point that we’d prefer to have the required interfaces declared at module scope so that the requirements aren’t repeated on each function in the module which requires them? But when functions don’t share the same interface requirements, we would have to unbundle them into separate modules which would be noisier. Am I missing his point? I must complain he sure didn’t explain it very thoroughly, although I realize his audience are experts who are already familiar with the issues.

Furthermore, type classes work best when exactly a single type parameter is involved in instance resolution. If there aren’t any type parameters, you must introduce a proxy type to drive instance resolution; if there are multiple parameters [5], you often need to resolve ambiguity by introducing functional dependencies [12] or replacing parameters with associated types [24].

Okay, I understand the point about a proxy type being needed to select an implementation of an interface which is not operating on an input parameter argument of the function. The dummy proxy types are then acting like applying a functor because they can select an implementation of the interface. But what’s the complaint? It’s just another way of achieving the same functionality.
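
For concreteness, a small Haskell sketch of the proxy-type pattern the thesis is alluding to (HasName, Dog, Cat are hypothetical names): the class method mentions no argument of type a, so a Proxy is threaded through solely to drive instance selection.

import Data.Proxy (Proxy (..))

class HasName a where
  name :: Proxy a -> String

data Dog = Dog
data Cat = Cat

instance HasName Dog where name _ = "dog"
instance HasName Cat where name _ = "cat"

-- usage: name (Proxy :: Proxy Dog)   -- ==> "dog"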

As for the issue with multiparameter typeclasses, associated types with type families, and functional dependencies, that was covered in the Pokemon example I cited upthread. Afaics, the issue of associated types versus abstract types, as I explained upthread, really boils down to whether we want to allow conflicting overlapping instances and break the concept of an abstract algebra. Because (per the analogy to the last paragraph in my prior post) once we allow modules as first class we can’t guarantee non-overlapping typeclasses unless we employ the stratification employed in Modular Type Classes, wherein typeclasses and implementations can only be declared and applied in separate strata. The alternative is my suggestions for eager detection of overlapping instances and my recent idea for subtypeclasses, which afaics brings us as close as we can get to full flexibility of modules without breaking the abstract algebra. Otherwise, note that typeclasses insure that for each data type there can only exist one implementation (per typeclass) simultaneously. But modules without some stratification of Modular Type Classes let you break that assumption and multiple implementations for a signature and a specific data type can exist simultaneously. Modules can inject multiple implementations due to the values being first-class. Whereas even with my idea for multiple typeclass variants, the function hierarchy of typeclass bounds prevents an algorithm from employing incompatible instances, but modules don’t enforce that invariant because the values are first-class, not just on function bounds.


@keean wrote in private:

There is no difference between this and modules, the caller can choose which compare is used by sort

The same algorithm within a function hierarchy cannot use different implementations with typeclasses. But modules break that invariant, because they’re first-class values that can be passed as arguments to functions. Even two arguments with the same signature on the same function could receive two different implementations for the same data type, which cannot be the case with my idea for typeclasses.
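To make that concrete, here is a small Haskell sketch using plain records of functions to stand in for first-class modules (the EqDict name and the case-folding example are invented):

import Data.Char (toLower)

-- A record of functions standing in for a first-class module value.
data EqDict a = EqDict { eq :: a -> a -> Bool }

caseSensitive, caseInsensitive :: EqDict Char
caseSensitive   = EqDict (==)
caseInsensitive = EqDict (\x y -> toLower x == toLower y)

-- Nothing forces callers to pass compatible implementations for the same data
-- type; a typeclass bound (Eq Char) could only ever resolve to one canonical instance.
member :: EqDict a -> a -> [a] -> Bool
member d x = any (eq d x)

main :: IO ()
main = print (member caseSensitive 'a' "ABC", member caseInsensitive 'a' "ABC")  -- (False,True)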

Actually that depends on something you haven't said. Whether you allow multiple type-classes of the same typeclass-class to be required by a function.

I guess I was thinking it was clear by implication that I would not allow breaking the abstract algebra. Afaics, with modules we have no way to prevent it.


So the decision of associated types vs. abstract types seems to really revolve around whether we want an abstract algebra or not? But apparently Backpack has added some modularity orthogonal to typeclasses, and I’m curious what benefits it adds and how it interoperates with typeclasses? If anyone could help summarize and decipher the Backpack system I would appreciate it.

I do think it would be nice, in addition to typeclasses, to also have parametrised module scopes for sharing the parametrisation of a group of functions, so they can be modularized as a group scope instead of at every call site. And I think I would like this to be (or at least have the option to be) applicative, so that the order of imports doesn’t matter. But I’m thinking it would be sufficient to parametrise them from the top-down with input parameters, instead of abstract types. The complaint against this is that we can’t encapsulate the type. But if we want to ensure an abstract algebra then we can’t allow encapsulation of the type, because we need to ensure from the top-down that there are no overlapping instances that would break soundness (either of the abstract algebra or of the associated types, although the latter would remain sound in an abstract-type paradigm).

  • Type classes work best when it is clear what methods they should support. However, for many interfaces, it is not obvious what set of methods should be put into an interface; for example, there are many functions which an interface for strings might support—which ones go in the type class? It is inconvenient to define a new type class (and new instances) just to add a few more methods, and thus this is rarely done in practice.

I thought typeclasses can be composed of other typeclasses? So to add methods you just create a new typeclass and inherit the methods (not the same as my subtypeclasses idea) from the preexisting one. This is mixin modularity. What is the problem with that? Why is it any more inconvenient than defining a new module signature and doing a mixin implementation? It seems to be precisely equivalent.
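For example, the standard Haskell way to “add methods” by composing a new typeclass on top of an existing one looks like this (the class and type names are invented):

-- Extending an interface by a superclass constraint: Container gets 'size'
-- from Sized without redeclaring it.
class Sized f where
  size :: f a -> Int

class Sized f => Container f where
  insert :: a -> f a -> f a

newtype Stack a = Stack [a]

instance Sized Stack where
  size (Stack xs) = length xs

instance Container Stack where
  insert x (Stack xs) = Stack (x : xs)

main :: IO ()
main = print (size (insert 'x' (Stack "ab")))  -- 3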

But if we have any form of modules in addition to typeclasses, then we need mixin capability for modules also. If modules aren’t first-class we can simplify mixins as Backpack does.

  • From a performance perspective, code which uses a type class is compiled without knowledge of how the type class’s methods are implemented. This can be quite costly in a language like Haskell, where inlining definitions is essential to achieving good performance. [20] This problem can be mitigated by specializing code to a particular type, but if this specialization occurs at use-site, the compiler ends up repeatedly reoptimizing the same function, blowing up the code size of a program.

Is he saying that the linker can’t merge duplicated specializations? And that by the programmer manually controlling reuse of specializations the programmer has to do the job that a linker should really do?

I am bewildered by this. EDIT: I think he is referring to the linker automatically merging such module reuse in Backpack, but perhaps he thinks this is not (always?) possible for typeclass specializations because the Haskell compiler is not doing global optimization across weak modules?
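For what it’s worth, GHC’s existing knob for this trade-off is the SPECIALIZE pragma; a sketch (the insertion sort is just a stand-in for any typeclass-constrained function):

-- Generic code is compiled against an unknown Ord dictionary, so comparisons
-- are indirect calls that cannot be inlined at the definition site.
isort :: Ord a => [a] -> [a]
isort = foldr ins []
  where ins x []                 = [x]
        ins x (y:ys) | x <= y    = x : y : ys
                     | otherwise = y : ins x ys

-- Request a monomorphic copy optimized for Int. If this instead happened at
-- every use site in every client module, the same function would be
-- re-specialized and re-optimized over and over, blowing up code size.
{-# SPECIALIZE isort :: [Int] -> [Int] #-}

main :: IO ()
main = print (isort [3, 1, 2 :: Int])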

@shelby3 wrote:

Apologies for creating misunderstanding with insufficient explanations of my idea up to now, but yours above as quoted is an incorrect summary of my original idea. Where I used the word ‘inheritance’, I only meant, for example, that instances that implement a ReverseOrdering implement the methods of an Ordering.

@shelby3 wrote:

We must use an explicit requires ReverseOrdering on sortRev and requires Ordering on sort, because otherwise instances of both of the said typeclasses would implement the < method calls in those functions making instance resolution ambiguous.

Ah, now I've captured it; I will try to exemplify:

class Ordering set
    (<)::set->set->bool
    (<)=...
class Ordering set => ReverseOrdering set
    (<)=...

sort::(Ordering set)=>set->set requires (Ordering set)  --calls inside (<) from Ordering
sortRev::(ReverseOrdering set)=>set->set requires (ReverseOrdering set) --calls inside (<) from ReverseOrdering

Good idea in my eyes. Even if you want to mix dispatching inside some function:

fun::(ReverseOrdering set)=>set->set --calls inside (<) from ReverseOrdering as explicit Upper Bound
       b1=(<) set set  --calls (<) from ReverseOrdering by convention
       b2=Ordering.(<)  set set  --calls (<) from Ordering by explicit prefixing

If you call (<) implicitly, e.g. in a repl, then I would call the most general version of (<) because there is one unique Top typeclass, i.e. the typeclass which defines the signature for (<), but no unique bottom typeclass, in contrast to OOP. This is what I meant earlier. With your idea you get Named Instances over the names of subtypeclasses.

But you don't overcome "using", because:
The idea of "using" was to overcome the global uniqueness problem. What if someone defines the same class as you in some other module which you import unqualified? Then you have to use some form of "using"/"open".

@sighoya wrote:

sortRev::(ReverseOrdering set)=>set->set requires (ReverseOrdering set)

No need for the extra requires because (ReverseOrdering set)=> is the equivalent of requires in Haskell:

sortRev::(ReverseOrdering set)=>set->set

Good idea in my eyes. Even if you want to mix dispatching inside some function:

fun::(ReverseOrdering set)=>set->set --calls inside (<) from ReverseOrdering as explicit Upper Bound
       b1=(<) set set  --calls (<) from ReverseOrdering by convention
       b2=Ordering.(<)  set set  --calls (<) from Ordering by explicit prefixing

Ah, that Ordering.(<) is not what I was proposing. We must be prevented from explicitly overriding the typeclass bounds at the call site, because that breaks the abstract algebra. Let me quote what I wrote in a prior post about how you can’t have conflicting typeclasses in the same function bound, because that enables breaking the abstract algebra, analogous to how modules break it with multiple first-class arguments having incompatible implementations:

I wrote:

@keean wrote:

Actually that depends on something you haven't said. Whether you allow multiple type-classes of the same typeclass-class to be required by a function.

I guess I was thinking it was clear by implication that I would not allow breaking the abstract algebra. Afaics, with modules we have no way to prevent it.


@sighoya wrote:

If you call (<) implicitly, e.g. in a repl, then I would call the most general version of (<) because there is one unique Top typeclass

I’m proposing you can never call a typeclass function implicitly. You must always nominally declare the bounds. Again refer to my example upthread, quoted as follows, of how structural namespace resolution instead of nominal resolution breaks the abstract algebra.

I wrote:

EDIT: in private @keean wanted something similar to my idea but he did not want to declare which typeclasses inherit from Ordering. But the problem then would be that if sort requires ForwardOrdering, then sortRev requires ReverseOrdering could still call sort. Which is broken semantics and broken modularity.


@sighoya wrote:

But you don't overcome "using", because:
The idea of "using" was to overcome the global uniqueness problem. What if someone defines the same class as you in some other module which you import unqualified? Then you have to use some form of "using"/"open".

Afaics, the using proposal breaks the abstract algebra because it enables requesting incompatible instances of typeclasses within the same algorithm. I already explained why my idea doesn’t suffer that problem.

Thus afaics the using proposal solves the global uniqueness problem by destroying global uniqueness (i.e. breaking the abstract algebra), same as modules do. I should be more specific and point out that my idea relaxes global uniqueness with “class of typeclasses” but afaics maintains it within the function hierarchy that implements an algorithm, thus afaics it doesn’t break the abstract algebra of any algorithm (algorithms only connect to each other via first-class values or functional composition, so if there is no first-class injection and the function bounds are sound, then the abstract algebra is sound for all algorithms). Whereas, afaics, modules and using destroy global uniqueness and don’t protect algorithms from injection of incompatible instances, thus breaking the abstract algebra for algorithms. The Modular Type Classes paper addresses the problem by creating an undesirable stratification between modules and instances (yet see below). @keean is trying to show that my idea suffers some disadvantages due to the hierarchy of “classes of typeclasses”, but I have yet to see a compelling specific example from him demonstrating his concern. Hopefully you read the edits I made to the upthread post reflecting our private discussion. The reason using breaks global uniqueness is also explained in the context of the Modular Type Classes paper in §3.1 Coherence in the presence of scoped instances on pg 4.

I had already mentioned upthread ideas for how to address the global uniqueness problem with typeclasses (as they are in Haskell or with my idea for improved modularity):

I wrote:

Can we enumerate the ways they aren’t modular so we have coherent summary here?

Afaik, the issues are:

  • Open type families ( ← blog post written by current maintainer of Backpack) are not modular. Refer back to discussion linked from the OP for where I cited the LtU discussion between @rossberg (co-creator of 1ML and MixML) and @keean on that topic. Perhaps we use abstract types instead (which are inherently open and modular) but does that mean ML functors and emulating typeclasses like in Modular Type Classes?

  • Overlapping instances with the decentralized (open) nature of modularity. However, I have proposed that any instance declared in the module where the typeclass (that it implements) was declared, or secondly any instance declared in the module where the datatype (that it is implementing) was declared, orphans all other instances. Those instances are always exported, so the compiler must eagerly check that they don’t overlap. If we also allow instances defined in their own module, but only if not overlapping (and/or we have a means to choose which of the overlapping instances is used globally, such as naming it explicitly in some metadata file), then we have to transitively name them, or export all instances and eagerly detect overlap. Otherwise name the instance you want to use and inject it by name, but this breaks recursive inferencing and the abstract algebra. Afaics, this seems to mostly solve this issue.

There’s a few issues being discussed in my above quote the linked material:

  1. Haskell’s compiler doesn’t normally eagerly check for overlapping typeclass implementation instances, where ‘eagerly’ in this context means not waiting for an overlapping ambiguity to arise during the normal resolution of instance selection for typeclass bounds, but instead checking up front that instances of a typeclass are not overlapping (a concrete cross-module sketch follows below this list). And note that per §3.2 Overlapping instances on pg 5 in Modular Type Classes, the definition of overlapping is apparently controversial within the Haskell ecosystem. The argument for needing eager checks for overlap is analogous to the one for needing to explicitly annotate type signatures on exported signatures (aka APIs), so that changes to the body of functions don’t propagate into changes in semantics or bugs far away in code that imports other code (even transitively via modules that import modules that import modules, etc.).

    The specification for Haskell is as explained in §1.1.1 Type classes on pg 2 of OCaml’s Modular Implicits:

    Haskell’s coherence in the presence of inferred type constraints relies on type class instances being canonical – the program contains at most one instance of a type class for each type. For example, a Haskell program can only contain at most one instance of Show Int, and attempting to define two such instances will result in a compiler error.

    But that’s not actually true of the actual implementation of the compiler, because Haskell doesn’t check eagerly and instead achieves non-ambiguity at instance resolution, which, as explained in §1.1.2 Implicits on pg 3, is also the case for Scala’s implicits:

    Unlike constraints in Haskell, Scala’s implicit parameters must always be added to a function explicitly. The need for a function to have an implicit parameter cannot be inferred from the function’s definition. Without such inference, Scala’s coherence can rely on the weaker property of non-ambiguity instead of canonicity. This means that you can define multiple implicit objects of type Showable[Int] in your program without causing an error. Instead, Scala issues an error if the resolution of an implicit parameter is ambiguous. For example, if two implicit objects of type Showable[Int] are in scope when show is applied to an Int then the compiler will report an ambiguity error.

  2. Another issue is that all typeclass implementation instances which can be used by any code and could potentially overlap need to be transitively exported from modules, else we can’t be sure there won’t be unsoundness, not only in the semantics due to breaking the abstract algebra but also potentially segfault unsoundness. Hence the desire to develop some rules which minimize which instances must be exported. Per the rules I somewhat summarized in the second bullet point of what I wrote as quoted above, if we default to orphaning instances which are not defined with the typeclass or datatype definition, then we are only required to export ragamuffin instances (although other instances may also be exported, they aren’t required to be). Those rules make it impossible to write an overlapping instance which is not detected by the compiler. But the rules don’t provide any solution for choosing which of the overlapping instances should win so as to remove the error.

  3. A potential solution for choosing among overlapping instances. Note this is a global-knowledge problem, meaning that adding/removing modules to a mix can change which overlapping instances are in contention. So any given module can't make a decision for itself, because other modules may import that former module and want a different choice. So the only solution is a global metadata file for a project which makes these choices. This is thus obviously orthogonal to a using proposal because you only do this once (and only once!) at the top-level scope. So in this way, we can say that the global uniqueness of typeclasses is anti-modular, but given that we build programs with defined boundaries of code, we can overcome the modularity problem. However, if our programs dynamically import plugin code (compiled and linked of course), then either the program would have to be entirely recompiled when the plugin changes the choices that had been used, or the plugins would be prevented from making such choices. @ezyang argued it would be better to abandon the concept of abstract algebras:

    I have a parting remark for the ML community, as type classes make their way in from Haskell: when you do get type classes in your language, don’t make the same mistake as the Haskell community and start using them to enforce invariants in APIs. This way leads to the global uniqueness of instances, and the loss of modularity may be too steep a price to pay.

    Yet afaics my recent idea provides a way to overcome the problem of doing Ordering in only one way. And apparently Derek Dreyer was contemplating an analogous concept. Also afaics, pyon makes an analogous proposal to my idea wherein we can create variants of sort (or in his case set). But because he enables it via ML functors and first-class modules, the guarantee of an abstract algebra is lost in general. Afaics my idea retains the abstract algebra. The utility/virtue of the abstract algebra is to be sure that we don’t end up with overlapping instances that the compiler did not detect, thus creating unsoundness. That unsoundness won’t segfault with modules, but it will create incorrect semantics.

Note all of the above three points about overlapping instances are orthogonal to whether we use my recent idea for “class of typeclasses” or traditional typeclasses.
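To make points 1 and 2 concrete, here is how the cross-module overlap scenario plays out in GHC as far as I understand it (module and type names invented; four files shown in one listing):

-- A.hs: defines the data type, deliberately with no Show instance
module A where
data T = T

-- B.hs: an orphan instance (neither the class nor the type lives here)
module B where
import A
instance Show T where show _ = "B's T"

-- C.hs: another orphan instance for exactly the same class and type
module C where
import A
instance Show T where show _ = "C's T"

-- Main.hs: B and C each compile fine in isolation (at most an orphan-instance
-- warning); only here, where both instances are in scope and one is demanded,
-- does the compiler report the duplicate/overlap, i.e. detection is not eager.
module Main where
import A
import B ()
import C ()
main :: IO ()
main = print T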

Revisiting what I wrote about Modular Type Classes earlier in this post:

The Modular Type Classes paper addresses the problem by creating an undesirable stratification between modules and instances (yet see below).

Note that forcing all using declarations into the top-level scope is essentially the same as the metadata file solution mentioned in #3 above. Yet afaics Modular Type Classes doesn’t support (unless it could be modified) my idea for doing Ordering in more than one way. And I don’t think we need both ML functors and typeclasses, although we may want some more limited form of modules with typeclasses. More on that in the ongoing discussion to follow.


Again per my recent post which no one has replied to yet, I am trying to understand how Backpack dealt with these issues and relate that back to our discussion. I would appreciate help for anyone who wants to read the Backpack paper and try to help me understand the details. I am very overloaded with tasks.

EDIT: more discussion of the above.

keean commented

One problem with forcing the annotation of type class constraints is that the number of constraints will get quite large, and when you edit a deeply nested function you may have to add constraints all the way back up to the main program. This effectively destroys locality of editing. Imagine wanting to add some debug logging to a deeply nested mathematical function: you are going to need to touch every function definition that calls this function to add the logging-io constraint, and every function definition that calls those functions.

It also makes it nearly impossible to have library functions with callbacks because you would need to add the constraints for your callback into the library.
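For example, in Haskell terms (using MonadIO as a stand-in for the logging-io constraint; the function names are invented):

import Control.Monad.IO.Class (MonadIO, liftIO)

-- Before adding logging, 'inner' was pure:  inner :: Int -> Int
-- After adding it, the constraint must be threaded through every caller,
-- changing each signature all the way up.
inner :: MonadIO m => Int -> m Int
inner x = do liftIO (putStrLn ("inner: " ++ show x)); pure (x * x)

middle :: MonadIO m => Int -> m Int   -- signature forced to change
middle x = inner (x + 1)

outer :: MonadIO m => Int -> m Int    -- and again, up to the main program
outer x = middle (2 * x)

main :: IO ()
main = outer 3 >>= print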

@keean wrote:

One problem with forcing the annotation of type class constraints is that the number of constraints will get quite large

Please note that for my idea/proposal no requires annotation is required. If there’s none, it will simply default to only allowing the base (of the “class of typeclasses”) typeclass instances, same as for traditional typeclasses, unless it’s overridden by being nested within a function with a narrowing requires bound.

That’s a valid point for non-exported functions, except that generally I like annotations because they provide documentation. And remember my proposal for putting annotations off to the right side of the function body definition (Haskell puts them on the line above). For anonymous lambda functions, and perhaps other local functions, I want inference.

Note, one change from my prior stance on this issue, is that I realize I either want decidable type inference with principal types regardless of whether I also want annotations on exported APIs, or no type inference. The reason was given where I explained that Scala’s type inference (where I was discussing HKT) is arbitrary and random (you’ll get different inferred types for seemingly inexplicably complex reasons, i.e. action-at-a-distance effects). This non-principal’ed type inference (c.f. §8 Future work on pg 40) (due I presume to implicit resolution involving first-class modules which are types, not just function bounds which are not types) is one reason I do not like OCaml’s Modular Implicits. This is also noted as a refutation of Robert Harper’s religion on modules.

I want to require explicit bounds annotated on exports for the reason I stated:

I wrote:

is analogous to the one for needing to explicitly annotate type signatures on exported signatures (aka APIs), so that changes to the body of functions don’t propagate into changes in semantics or bugs far away in code that imports other code (even transitively via modules that import modules that import modules, etc.).

So that when a change violates the annotated bounds, the conflict becomes a local error, not an error or even a subtle bug at a distance. Action-at-a-distance is the horror of debugging and of fragile modularity, both for imperative coding and for inappropriate use of type inferencing, which can possibly even contribute to the creation of a dreaded Heisenbug.
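The same argument in miniature, in plain Haskell (the step function is invented):

-- With no signature, the inferred type silently follows the body:
--   step x = x + 1        -- inferred:  Num a => a -> a
--   step x = x ++ "!"     -- now inferred:  [Char] -> [Char]
-- and the breakage surfaces at whichever distant call site no longer fits.
-- With an explicit signature, the same edit fails right here, locally:
step :: Int -> Int
step x = x + 1

main :: IO ()
main = print (step 41)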


@keean wrote:

and when you edit a deeply nested function you may have to add constraints all the way back up to the main program. This effectively destroys locality of editing.

Thanks for raising this issue! No, this is another example of a misunderstanding of what I’m proposing, because I didn’t mention (nor pay attention to) this detail. In my proposal the typeclass bounds don’t have to be annotated if they’re simply repeating a nested requires bound. The compiler can infer this, and afaics there are no negative impacts on the soundness of my idea. The sortRev requires ReverseOrdering => … is forced to be annotated because it is narrowing the constraint from an Ordering to a ReverseOrdering. But if sortPlus => … is accepting an Ordering and calling sort requires Ordering => …, then it’s not required to be annotated.

Note I agree we surely need an explicit requires if there’s any bare typeclass method called (i.e. applied) within the function definition along with a non-base “class of typeclass” requires bound on some function application within the body, as explained by §4.1 Inference, coherence and canonicity on pg 30 of OCaml’s Modular Implicits. Actually this case generalizes to whenever there are conflicting bounds in the same function definition, so the explicit requires is needed to provide a unified narrowing if possible. But should we allow diamond multiple inheritance in our subtypeclasses in order to join leaves of the tree into a unified narrowing? (I’m thinking no.)

It also makes it nearly impossible to have library functions with callbacks because you would need to add the constraints for your callback into the library.

I don’t think so, for the same reason above.

@shelby3
I see what you propose is good for functional composition (guarantee of abstract algebra?), but how do you handle unqualified imports of typeclass instances from other modules for some typeclass defined in your module? For such things, an additional stratification is good.

The §4 Canonicity on pg 30 and §4.2 Canonicity and abstraction on pg 31 of OCaml’s Modular Implicits explain what I’ve been saying: that canonicity (aka global uniqueness) is broken by modules.

The §4.3 Canonicity as a feature on pg 30 and §4.4 An alternative to canonicity as a feature on pg 32 explain why canonicity is useful as a feature (i.e. an abstract algebra) and offer a way to emulate it with ML modules by replacing function constraints with module constraints. Yet that requires the programmer to remember to enforce the canonicity of the algebra, instead of the canonicity (though not consistent semantics, unless somehow we could check axioms) being enforced by the compiler. And in general for ML modules, it forsakes the elegant (non-verbose) implicit resolution (even of convoluted nested typeclass bounds in implementations of typeclasses) that comes with the compiler-enforced canonicity of typeclasses. IOW, the ML module emulation of canonicity will suffer the same criticism Robert Harper made about typeclasses w.r.t. their ability to do Ordering in more than one way: it "is, in Greg Morrisett’s terminology, like steering the Queen Mary: you have to get this hulking mass pointed in the right direction so that[because] the inference mechanism [can’t] resolves things the way you want it to" (and afaics my new idea addresses/solves his criticism?). As we are learning in this thread, his criticism is not as absolutely true as he would have led everyone to believe; there are more design-axis factors than Robert admits. However, as explained in §1.2 Modular implicits on pg 3, OCaml’s Modular Implicits do nested implicit resolution, yet (as reiterated in §7.1 Type classes on pg 38) the function bounds can’t be inferred and must be explicitly annotated, which @keean has noted is anti-modular (and which my idea doesn’t suffer from). Also note that the OCaml emulation-of-canonicity example afaics requires Ord to be specialized on abstract type t, yet Ord is an input type, thus afaics breaking the vaunted encapsulation that @rossberg claimed to be so advantageous for ML modules and their abstract types. Additionally, the claim that Modular Implicits don’t have the stratification is really an obfuscation of the fact that the set of available implicits in scope has to be dictated top-down if canonicity is desired, same as for any other way of achieving canonicity, as we have discussed in detail. As admitted in §7.1 Type classes on pg 38, Modular Implicits can’t express generic implementation instances!

@ekmett also explained why enforced canonicity is important; and he specifically mentions the conflation of concerns that I also noted. @julesjacobs reiterated the point from the first link in this comment that the union method should not be unconflated from a Set, because whether the implementation is via relative ordering, hashing, or tries should be encapsulated, and coherence relies on the non-modular canonicity when using typeclasses. But as @ekmett alluded to in his response, we want to late bind the implementations for different value types of the Set. ML modules would enable us to package up a Set with a given encapsulated abstract value type t and its encapsulated implementation, but we should be able to do the same with typeclasses combined with some Backpack-like module system for controlling what is exported, except that the value type will be an input type parameter (or associated type of a type family) and not an encapsulated abstract output type. Afaics the point remains the compiler-enforced canonicity; what we forsake is encapsulation of abstract types, yet we gain back other advantages.
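This is easy to see with Data.Set from the containers package (a real example, not hypothetical):

import qualified Data.Set as Set
import Data.Ord (Down(..))

-- Set.union assumes both sets were built with the *same* Ord instance for the
-- element type; canonicity is what makes that assumption safe.
s1, s2 :: Set.Set Int
s1 = Set.fromList [3, 1, 2]
s2 = Set.fromList [2, 4]

-- Wanting a different ordering means introducing a new type with its own
-- single canonical instance (e.g. Down), never a second Ord Int.
r :: Set.Set (Down Int)
r = Set.fromList (map Down [3, 1, 2])

main :: IO ()
main = do
  print (Set.toList (Set.union s1 s2))   -- [1,2,3,4]
  print [ x | Down x <- Set.toList r ]   -- [3,2,1]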

That modular emulation of canonicity is similar to what both Dreyer and pyon were proposing as I cited in my earlier post as quoted below.

I wrote:

Yet afaics my recent idea provides a way to overcome the problem of doing Ordering in only one way. And apparently Derek Dreyer was contemplating an analogous concept. Also afaics, pyon makes an analogous proposal to my idea wherein we can create variants of sort (or in his case set). But because he enables it via ML functors and first-class modules, the guarantee of an abstract algebra is lost in general. Afaics my idea retains the abstract algebra. The utility/virtue of the abstract algebra is to be sure that we don’t end up with overlapping instances that the compiler did not detect, thus creating unsoundness. That unsoundness won’t segfault with modules, but it will create incorrect semantics.

@sighoya wrote:

I see what you propose is good for functional composition (guarantee of abstract algebra?), but how do you handle unqualified imports of typeclass instances from other modules for some typeclass defined in your module? For such things, an additional stratification is good.

This was discussed/explained in my prior post. Please read the 3 numbered items near the bottom of the linked post.

The stratification is effectively in the metadata file I mentioned which is analogous to the stratification in Modular Type Classes as mentioned near the bottom of that post.

keean commented

@shelby3 I'm starting to like your idea more, I like that it maintains an abstract algebra. There are questions about how type-classes relate, so for example, we want interface inheritance not implementation inheritance (I think) so if you 'overload' or 'extend' a typeclass you must provide implementations for the functions and you cannot change their type signatures. That would address some of my concerns, so:

typeclass Ordering ...

typeclass ForwardOrdering extends Ordering ...

typeclass ReverseOrdering extends Ordering ...

typeclass PurpleForwardOrdering extends ForwardOrdering ...
keean commented

@shelby3 if it is only interface inheritance not implementation inheritance then diamond inheritance is fine. You can extend multiple interfaces into the same interface, you are just saying it provides for all of them (again function types cannot be changed).

@shelby3 wrote:

So any given module can't make a decision for itself, because other modules may import that former module and want a different choice.

I thought about the same problem; maybe one can "revert" the opening of typeclasses/typeclass instances from imported modules in the current module by "closing" and then "opening" the desired typeclass/typeclass instances.
Another thought would be to ignore open/using statements in modules which are imported and then to specify oneself what is needed.

@shelby3 wrote

So the only solution is a global metadata file for a project which makes these choices. This is thus obviously orthogonal to a using proposal because you only do this once (and only once!) at the top-level scope.

What does such a metadata file look like? Is it a project file containing entries in which the default typeclass instances are specified for each module:
module1 : open module1.x, open module2.y...;
module2: open ... ;

Some points to think of:
1.)
A possible problem with subtypeclasses is its idiomatic meaning. One would think that ReverseOrdering is a more specialized sort of Ordering, but it is simply another Ordering.

2.)
Why does ReverseOrdering have the right to overdetermine Ordering but not vice versa:

sort set= sortRev set -- sortRev uses ReverseOrdering
sortRev set= sort set -- ReverseOrdering is more specific and overdetermines Ordering, so sort uses ReverseOrdering.

What I mean in general is that when you use subtypeclasses you implicitly get a direction.

keean commented

@sighoya

Why does ReverseOrdering have the right to overdetermine Ordering but not vice versa:

Because ReverseOrdering 'extends' Ordering. There is a subtyping like relationship, but typeclasses are not types, so it is not actually subtyping. This is a kind of interface inheritance so we would require implementations of 'ReverseOrdering' to implement all the functions of Ordering plus any additional ones in the ReverseOrdering typeclass.

We can allow any requirement, to be 'coersed' into any extension of that requirement, but not the other way around, and not across branches, so we can coerse 'Ordering' into 'ReverseOrdering' or 'ForwardOrdering' but we cannot coerse 'ReverseOrdering' into 'ForwardOrdering', nor 'ForwardOrdering' into 'Ordering'.

@keean wrote:

We can allow any requirement, to be ‘coersed’ into any extension of that requirement, but not the other way around, and not across branches, so we can coerse Ordering into ReverseOrdering or ForwardOrdering but we cannot coerse ReverseOrdering into ForwardOrdering, nor ForwardOrdering into Ordering.

Given the way you defined ‘coersed’, then that is correct. I normally think of ‘coersed’ with the juxtaposed meaning. I think of the narrower set of interfaces being coerced (American spelling of the word) into the larger set because the coercion is the denial of some members of the broader set from being admitted entrance. But I guess if you think about it definitionally in terms of sets, you’re correct that the broader set is being squeezed and thus coerced into the narrower one.

Yet also the sets here are not concentric nor even overlapping, but rather disjoint and depend on context. IOW the implementations that are candidates for Ordering are never candidates for ReverseOrdering or ForwardOrdering. Yet when we coerce an Ordering to a ReverseOrdering or ForwardOrdering via explicit requires nested function bound context, then none of the candidates for Ordering are allowed where an Ordering was otherwise acceptable for the nested function bound. It must be this way for attaining canonicity; otherwise we would have ambiguous instance implementation selection resolution, as we had discussed by example in private (and I think I also made a note of that in an upthread post).

Because ReverseOrdering ‘extends’ Ordering. There is a subtyping like relationship, but typeclasses are not types, so it is not actually subtyping.

That is why I have been careful to refer to the concept as subtypeclassing instead. Perhaps “nested typeclassing” is a more accurate description, because subsetting implies non-disjoint sets?

When we say typeclasses are not types, we mean that there’s no way to construct a first-class value that has the type of a typeclasses. This for example prevents typeclasses from being involved in subtyping covariance soundness invariants for subtypes (and contravariance for supertypes) of mutable containers of objects. Even for existential types, there’s no way to have the value type of a container be a union of existential types, rather it must be a single existential type.
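For readers following along, the single-existential case looks like this in Haskell (the Showable wrapper is invented); the union-of-existentials case is exactly what cannot be written:

{-# LANGUAGE ExistentialQuantification #-}

-- A single existential: every element may be a different concrete type, but
-- all of them are hidden behind the one interface fixed by the wrapper.
data Showable = forall a. Show a => Showable a

items :: [Showable]
items = [Showable (1 :: Int), Showable "two", Showable True]

showIt :: Showable -> String
showIt (Showable x) = show x

main :: IO ()
main = mapM_ (putStrLn . showIt) items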


@sighoya wrote:

  1. A possible problem with subtypeclasses is its idiomatic meaning. One would think that ReverseOrdering is a more specialized sort of Ordering, but it is simply another Ordering.

Per @keean’s and my elaboration above, ReverseOrdering is a more specialized Ordering because the coercion can only be applied in one direction, which is necessitated by the need to infer the base without annotation of requires (which @keean pointed out makes it more modular) and because we don’t want ReverseOrdering and ForwardOrdering to ever be coerced to each other. However, you’re also correct that they are just another Ordering and don’t necessarily subset the set of objects that implement an Ordering, which is because typeclasses are not types of values (values being objects in this sentence). And that is why I don’t think “nested typeclassing” has the same oft-maligned problems that subclassing has, such as the issue of early-bound, rigid ontologies.

@keean wrote:

if it is only interface inheritance not implementation inheritance then diamond inheritance is fine. You can extend multiple interfaces into the same interface, you are just saying it provides for all of them (again function types cannot be changed).

Ah, I think this breaks axioms though. How can I implement both a ReverseOrdering and a ForwardOrdering in the same implementation instance with a multiple inheritance of ForwardReverseOrdering?

The inheritance I proposed is not just structural interfaces; it is also semantic.

It’s @keean’s preference for and extensive explanations of typeclasses that caused me to dig deep, so we have to credit insights as derived from his intuition/experience, because otherwise I would have not likely gotten there alone.

So now I think I have another significant new insight!

Actually, w.r.t. generic parametrisation of the value type of containers and encapsulating the implementation details of a container such as Set, typeclasses excel at the former because they don’t require early binding (commitment) to a specific implementation detail conflated with an abstract output type.

I wrote:

As admitted in §7.1 Type classes on pg 38, Modular Implicits can’t express generic implementation instances!

@ekmett also explained why enforced canonicity is important; and he specifically mentions the conflation of concerns that I also noted. @julesjacobs argued that the union method should not be unconflated from a Set, because whether the implementation is via relative ordering, hashing, or tries should be encapsulated. But as @ekmett alluded to in his response, we want to late bind the implementations for different value types of the Set. ML modules would enable us to package up a Set with a given encapsulated abstract value type t and its encapsulated implementation, but we should be able to do the same with typeclasses combined with some Backpack-like module system for controlling what is exported, except that the value type will be an input type parameter (or associated type of a type family) and not an encapsulated abstract output type. Afaics the point remains the compiler-enforced canonicity; what we forsake is encapsulation of abstract types, yet we gain back other advantages.

So the new insight is that we want to retain input type parametrisation for the value types (i.e. the data), but we want to also parametrise the choice of typeclasses the module implements!! Wow!

So in this way, for strong modularity, we could declare a signature on a module that has the operations of a Set parametrised on any value type, but which is also parametrised on the typeclasses it requires for the implementation. And the easiest way to parametrise the typeclasses is to simply not list them on the signature of the module. IOW, a module signature leaves open the typeclasses that an instance of the module will require. So now we can say that typeclasses are more modular than ML modules. Ha!

(note afaics this is separate typechecking, not separate compilation, which is also the case for ML modules)

Duality of Modularity for ML modules vs. Typeclasses

Actually what we have here is a duality. ML modules are good at encapsulating from the bottom-up, but forsake canonicity and open top-down genericity. Whereas typeclasses are like a dual, in that they are top-down, thus forsaking open encapsulation, but gaining and requiring canonicity as well as top-down genericity. So while ML modules are open for encapsulation (which creates the dependent-typing issues we discussed), typeclasses are open for top-down genericity. And I think typeclasses have a more elegant implicits story due to canonicity.

Strong Modules + Typeclasses Proposal

My “nested class of typeclass” idea is orthogonal to the following.

Okay, so now I’m tying it all together, including my prior post; also re-read my upthread post about @ezyang’s thesis on Haskell’s Backpack module system.

Afaics, in addition to typeclasses we only need modules that provide strong modularity, without ML functors or ML abstract types (distinguish ML abstract types from Haskell’s ADTs, aka abstract data types). Strong modularity means a module is not limited to only import’ing (aka include’ing) module implementations (as weak modularity is); it can also require its future importer to supply a module implementation which matches a specific interface signature.
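For orientation, the Backpack shape of “require a module matching a signature” is roughly as follows (written from memory, so treat the exact syntax, especially the cabal mixins line, as approximate):

-- Str.hsig: a "hole", i.e. this library requires some implementation of Str
signature Str where
  data Str
  empty  :: Str
  append :: Str -> Str -> Str

-- Lib.hs: typechecked against the signature alone, before any implementation exists
module Lib where
import Str
double :: Str -> Str
double s = append s s

-- A client package then fills the hole by pointing Str at a concrete module,
-- approximately like this in its .cabal file:
--   mixins: this-lib requires (Str as Str.String)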

Also both features should support recursive linking1, wherein a module that imports or requires the interface of another module may be reciprocally imported or required (even at some nesting of imports or signatures) by that other module. Note the double-vision problem is really not an issue because there’s no translucent sealing due to abstract types, as explained for Backpack in §4. Related Work and Technical Discussion. Apparently the description of how Backpack deals with such issues is complicated by the fact that it’s elaborating to Haskell’s weak modules. If we instead implemented the features natively in a compiler, I think the specification would be simpler to describe (and simpler to prove sound).

Contrary to Backpack’s §5. Future Work, I don’t think we need to name typeclass implementation instances that are in interfaces; instead which instances are exported is specified solely by the instance B.Eq A.T declaration (from this it can be determined whether the instance implements a typeclass Eq and/or a type T declared in the same module as itself, which pertains to the upthread-mentioned rules on the required exporting of ragamuffin instances; thus ragamuffin instances have to be exported). The top-down metadata file for resolving conflicting overlaps (i.e. violations of canonicity) is an orthogonal concern to modules. Per my prior post, only typeclasses (not their implementation instances) can be explicitly exported and thus appear in interfaces, not typeclass bounds on functions.

So I'm thinking module interfaces are named. The interface inherently enumerates what is exported by a module. Modules can import other modules, and modules can list “interface type” parameters which are quantified by an interface (or union of interfaces). The importer of a module must supply an implementation, or one of its own compatible type parameters, for every type parameter of the module being imported.

Backpack is similar to what I am proposing. I have noted above the differences that came to mind. Another difference is that Backpack doesn’t have input “interface type” parameters for the signatures (aka “holes”) a module requires. Instead it assigns these to output “interface type” names, and an implementation is assigned by assigning to the same name. The advantage is that these (analogous to abstract types versus type parameters) don’t need to be assigned on every import, only on the include which resolves them to an implementation. What I don’t yet grok is why there’s a separate concept of a package and a module, necessitating the import Socket? Does that have something to do with packaging several modules in the same package for package managers like Hackage and Cabal?

W.r.t. the versioning packages, the automatic semantic versioning of Elm Packages is an interesting idea.

Have I missed anything?

I hope we are almost finished with this decision. I really need to wrap this PL design process up and move forward.

1 The argument for allowing such cyclic dependencies is that separation-of-concerns requires them; the argument against is that they can slow down compile times for the interactive edit-then-test development pattern.