jfecher/ante

Tuples

skaller opened this issue · 4 comments

Real tuples seem useful and cannot easily be emulated by pairs. The problem is that with a tuple a,b,c is not equal to a,(b,c) but this is the case with pairs. This can be fixed with a unit value such as (), but then to get a tuple you must add it to the end every time: a,b,c,() and you also get a distinct one value tuple a,().

The main disadvantage of real tuples is that analyses require iteration operators, as opposed to list = pairs+unit form, which can be done with recursion. In my system I use something more complicated: a second, alternative head + tuple form, where tuple must be a pair at least, since I identify a one argument tuple with the argument (which is standard mathematics but less common in programming languages). This means polymorphic recursion over tuples is terminated by a pair rather than a unit.

Anyhow, just having pairs has it own properties, they just might be surprising, if you thought they worked like tuples (because they don').

The problem is that with a tuple a,b,c is not equal to a,(b,c) but this is the case with pairs

This seems to be the crux of your reasoning for why pairs would be less useful than tuples. To me though, it doesn't seem like an important property worth preserving since I cannot think of a situation where I've ever relied on this property of tuples. In fact, having these two be equivalent can lead to nicer syntax in some cases, removing the need for extra parentheses.

It seems you're aware that tuples can be emulated to some degree by pairs, and that this encoding can be either terminated with a unit type equivalent, or not. Ante uses the no unit type terminator approach currently, which as you note can have some surprising behavior if one isn't aware of it. Namely retrieving the nth element of a pair is different than retrieving the last element. I considered using the unit type encoding to prevent this, but it has its own issues and there is some interesting cases using just pairs allows, as shown on the website.

I may be missing something about your "head + tuple" form terminated with a pair. To me, this sounds equivalent to the pairs that already exist in ante. Where you have a (head, tuple) where tuple is either more nested pairs or terminates with a single non-pair element.

Just to clarify, my head + tuple form is an alternative representation of a tuple with at least 3 elements:
1,,(2,3) == (1,2,3)
where the double comma acts like a list cons operator. The form 1,,2 is an error. The best way to explain this is with some code: given a typeclass Eq[T] with equality operator == I define tuple equality, where * is a non-associative n-ary tuple type, and ** is a binary right associative tuple type where the RHS must be a tuple with at least 2 components:

// Class Eq: Equality
instance [T,U with Eq[T], Eq[U]] Eq[T ** U] {
  fun == : (T ** U) * (T ** U) -> bool =
  | (ah ,, at) , (bh ,, bt) => ah == bh and at == bt;
  ;
}

instance[t,u with Eq[t],Eq[u]] Eq[t*u] {
  fun == : (t * u) * (t * u) -> bool =
  | (x1,y1),(x2,y2) => x1==x2 and y1 == y2
  ;
}

instance[t with Eq[t]] Eq[t*t] {
  fun == : (t * t) * (t * t) -> bool =
  | (x1,y1),(x2,y2) => x1==x2 and y1 == y2
  ;
}

The first case handles tuples of 3 or more components. The second case handles tuples of 2 components of different types. There is no need for the case of 1 component, since that is identical to the component (the projection is the identity), the case for the unit tuple ()==() gets handled somewhere.

The reason for the third case is that in Felix, a tuple with all components the same type is an array which has a distinct normal form:
base ^ index
so that for example int * int is the same type as int ^ 2 and this case is more specialised then the two component case and also more specialised than array length 2 equality operator, which uses a loop and so isn't very efficient for arrays length 2.

Note it follows arrays length 1 are identical to the component so int^1 is just int, and int^0 is 1, the canonical unit type (since, arrays are just tuples). The index type in the array is not an integer, it can be any compact linear type. A compact linear type is 0 aka void, 1 aka unit, any sum of n units written n, and any compact sum or compact product of a compact type. For example int ^ (2 \* 3) is a matrix.

In general, no matter what form you use for a type (or value), it is reduced to a specific normal form during monomorphisation. So the code above uses polymorphic recursion elaborated during typeclass monomorphisation. The unification engine has to be able to unify both forms, that is v1 * v2 unifies with c ^ v3 by setting v1 and v2 to c, and v3 to 2.

Compared to

impl Eq (a, b) given Eq a, Eq b with
    (==) (a1, a2) (b1, b2) = a1 == b1 and a2 == b2

This seems more complex for little gain

Obviously, as a designer, you have to make tradeoffs and choices. And with a new language, you can easily change stuff around because you haven't locked in a design, and have no users. Felix has no users but it is almost 30 years old.

The point of showing you my code was simply to show that with n-ary products, you do not necessarily lose the ability to recursively define reducing operations like equality or conversion to a string. In addition, you get arrays for free. With only pairs, you probably need a separate data structure for arrays. OTOH, whilst my type system can enforce a limited number of constraints on array lengths, your planned refinement types are much more powerful. Cool!