tlc-pack/relax

[Discuss][Types][UX] Strong typing vs weak typing

slyubomirsky opened this issue · 3 comments

Per the discussion in the August 16 community meeting, there are unresolved questions about type-checking behavior in Relax and the interface it should present to users.

Namely, Relax permits some subtyping: Object is the parent type of all types, Tensor without a rank is a parent type of ranked tensor types, etc. How should we handle cases where the provided type is more general than the expected type?

Example 1: Suppose f has the annotated return type Object.

y: Tensor((m, n), dtype) = f(x)

Example 2: Consider the below functions with type signatures and the call.

def g() -> Object: …
def f(x: Tensor((m, n), dtype)): …

y = f(g())

We might consider two ways (feel free to propose others) for the compiler to respond to these situations:

  1. The compiler could insert dynamic checks of the type of the returned value and its shape, thus converting the types implicitly. In example 1, the compiler would insert a check that the value returned by f is in fact a tensor of shape (m, n) and dtype dtype, since the annotated return type Object is more general. Similarly, in example 2, f expects a Tensor argument, so the compiler would insert dynamic checks on the value returned by g.
  2. The compiler would reject these examples and ask that the user explicitly cast the values.
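To make the contrast concrete, here is a small sketch in plain Python (standing in for Relax semantics; the classes and the helper `check_tensor` are hypothetical names for illustration, not part of any Relax API). Under weak typing (option 1), the compiler would silently insert something like the checked assignment at the bottom; under strong typing (option 2), the program would be rejected unless the user wrote an equivalent cast explicitly in the source.

```python
# Illustration only: plain Python standing in for Relax semantics.
# `Object`, `Tensor`, and `check_tensor` are hypothetical stand-ins.

class Object:
    """Stand-in for Relax's most general type."""

class Tensor(Object):
    """Stand-in for a ranked, shaped tensor type."""
    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = dtype

def check_tensor(value, shape, dtype):
    """The kind of dynamic check the compiler would insert under weak
    typing: verify the run-time type, shape, and dtype, failing loudly
    on a mismatch instead of proceeding with a wrong value."""
    if not isinstance(value, Tensor):
        raise TypeError(f"expected Tensor, got {type(value).__name__}")
    if value.shape != shape or value.dtype != dtype:
        raise ValueError(
            f"expected shape {shape} and dtype {dtype}, "
            f"got {value.shape} and {value.dtype}"
        )
    return value

def f():
    # Annotated return type is Object; it happens to return a Tensor.
    return Tensor((2, 3), "float32")

# Weak typing: the compiler rewrites `y: Tensor((2, 3), "float32") = f()`
# into a checked assignment like this, without the user seeing it.
y = check_tensor(f(), (2, 3), "float32")

# Strong typing: the same call (or a dedicated Cast expression) must
# appear explicitly in the user's source, or compilation fails.
```

The trade-off the pros/cons lists below describe is visible here: the check itself is identical in both regimes; the question is only whether the user writes it or the compiler hides it.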

We did not reach a strong consensus at the meeting (slim majority in favor of strong typing in a straw poll) because there are several pros and cons to the approaches.

Implicitly inserting dynamic checks (weak typing):

  • Pros: Flexible, concise, more like using Python
  • Cons: Completely silent (users may not know whether they've made a mistake until run time), increased compiler complexity

Requiring explicit casts (strong typing):

  • Pros: Simple, easy to reason about, alerts users to potential mistakes
  • Cons: More verbose, could annoy users, deviates from Python

In principle, we do not need to treat the assignment and function call examples the same way, though I would argue that a uniform approach would be preferable, as it would make the language easier to reason about.

Note that if we were to adopt strong typing, we would have to add a Cast expression to the language, though match_shape is analogous for shapes.
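Purely as a sketch of what that might look like (hypothetical surface syntax, written by analogy with match_shape; not a proposal for the actual form a Cast expression would take), example 1 under strong typing could become a two-step refinement:

```
t = cast(f(x), Tensor(_, dtype))            # explicit type cast, checked at run time
y: Tensor((m, n), dtype) = match_shape(t, (m, n))  # explicit shape refinement
```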

(Left my personal opinion out of the top post so as not to abuse the bully pulpit.)

Personally, I would favor strong typing because it is more straightforward and gives users feedback that could help them correct mistakes. We already deviate from Python in a few ways: we require argument types to be annotated, and we require explicit match_shape bindings for reasoning about tensor shapes, so requiring explicit casts would not be a large departure in principle.

Can we clarify, by the way, what the rule for shape-checking is so we can be consistent? If you have an assignment like x: Tensor((m, n), dtype) = some_expr, does the compiler give an error and tell you that you must use match_shape if it cannot conclude that some_expr has the same (m, n)? When is match_shape mandatory?

Closing because the discussion seems to have been settled and the draft spec reflects strong typing.