/class09

Notes and code for Class 9

Primary LanguageOCaml

Class 9

Type Systems

The compiler or interpreter of a programming language defines a mapping of the values that a program computes onto the representation of these values in the underlying hardware (e.g. how floating point numbers are mapped to sequences of bits). A type describes a set of values that share the same mapping to the low-level representation and the same operations that perform computations on those values.

More generally, one can take different view points of what types are:

  • Denotational view:

    • a type is a set T of values (e.g. the Scala type Int denotes the set of all integer values)
    • a value has type T if it belongs to the set
    • an expression has type T if it always evaluates to a value in T
  • Constructive view:

    • a type is either a built-in primitive type (int, real, bool, char, etc.) or
    • a composite type constructed from simpler types using a type-constructor (record, array, set, etc.)
  • Abstraction-based view:

    • Type is an interface consisting of a set of operations.

Types serve several purposes in programming languages:

  • Detecting Errors: an important purpose of types is for detecting run-time errors related to the usage of operations on values they are not meant to be used on. For example, consider the following OCaml program:

    let f x = 1 + x in
    f "hello"

    This program is rejected by the type checker of the compiler because f expects an argument of type int but we call it with the string "hello".

    If this program were not rejected by the compiler, then at the point when 1 + "hello" is executed the program would likely crash or at the very least compute a meaningless result value because + expects two integers as argument and not an integer and a string because the low-level representation of integers and strings are incompatible.

    By using static type checking, we can detect these kinds of errors already at compile-time. In general, the earlier errors are detected, the easier and cheaper it is to fix them.

  • Documentation: Type annotations are also useful when reading programs. For instance, type declarations in function headers document constraints on the inputs that the functions assumes, respectively constraints on the return value that it guarantees:

    def f(x: Int): Int = x + 1
  • Abstraction: Finally, types are often at the heart of module and class systems that help to structure large programs into smaller components. In particular, types help to abstract away implementation details of these components. In turn, this helps to enforce a disciplined programming methodology that ensures that changes to the internal implementation details of one module do not affect other modules in the program.

  • Enabling Compiler Optimization: In general, the more the compiler knows about the behavior of a program, the more opportunities it will have to optimize the code. Types provide a way for the programmer to communicate information about the behavior of a program to the compiler, which can then be exploited for optimizations (e.g. by choosing more efficient memory representations for values or custom machine level instructions for operations on certain types of values).

The type system of a programming language is a mechanism for defining types in the language and associating them with language constructs (expressions, functions, classes, modules, etc.) as well as a set of rules for answering questions such as

  • Type compatibility: where can a value of a given type be used?

  • Type checking: how do you verify that a value observed at run-time is compatible with a given type, respectively, how do you verify that an expression will only evaluate to values compatible with a given type?

  • Type inference: how do you determine the type of an expression from the types of its parts?

  • Type equivalence: when do two types represent the same set of values?

Type Checking

Type checking is the process of ensuring that a program obeys the type system's type compatibility rules. A violation of the rules is called a type error or type clash. A program that is free of type errors is considered to be well-typed.

Languages differ greatly in the way they implement type checking. These approaches can be loosely categorized into

  • strong vs. weak type systems
  • static vs. dynamic type systems

A strongly typed language does not allow expressions to be used in a way inconsistent with their types (no loopholes). A strongly typed language thus guarantees that all type errors are detected (either at compile-time or at run-time). Examples of strongly typed languages include Java, Scala, OCaml, Python, Lisp, and Scheme.

On the other hand, a weakly typed language allows many ways to bypass the type system (e.g., by using pointer arithmetic or unchecked casts). In such languages, type errors can go undetected, causing e.g. data corruption and other unpredictable behavior when the program is executed. C and C++ are poster children for weakly typed languages. Their motto is: "Trust the programmer".

The other high-level classification is to distinguish between static and dynamic type system. In a static type system, the compiler ensures that the type rules are obeyed at compile-time whereas in dynamic type systems, type checking is performed at run-time while the program executes.

Advantages of static typing over dynamic typing:

  • more efficient: code runs faster because no run-time type checks are needed; compiler has more opportunities for optimization because it has more information about what the program does.

  • better error checking: type errors are detected earlier; compiler can guarantee that no type errors will occur (if language is also strongly typed).

  • better documentation: easier to understand and maintain code.

Advantages of dynamic typing over static typing:

  • less annotation overhead: programmer needs to write less code to achieve the same task because no type declarations and annotations are needed. This can be mitigated by automated type inference in static type systems.

  • more flexibility: a static type checker may reject a program that is correct but does not follow the specific static typing discipline of the language. This can be mitigated by supporting more expressive types, which, however, comes at the cost of increasing the annotation overhead and the mental gymnastics that the programmer has to perform to get the code through the static type checker.

It is often considered easier to write code in dynamically typed languages. However, as the size and complexity of a software system increases, the advantages of dynamic typing turn into disadvantages because type errors become harder to detect and debug. Hence, statically typed languages are often preferred for large and complex projects. Also, for performance critical code, dynamic typing is often too costly (because it often involves expensive run-time type checks).

The distinction between weak/strong and static/dynamic is not always clear cut. For instance, Java is mostly statically typed but certain type checks are performed at run-time due to deliberate loopholes in the static type checker. Some so-called gradually typed languages (e.g. Dart, TypeScript) even allow the programmer to choose between static and dynamic type checking for different parts of a single program.

Similarly, the distinction between weak and strong type systems is not always completely clear and some people use slightly different definitions of what strong vs. weak typing means. For instance, the notion of type compatibility in JavaScript is so permissive that the language is often considered to be weakly typed, even though, according to the definition given above, JavaScript is strongly typed.

Type Inference

Type inference concerns the problem of statically inferring the type of an expression from the types of its parts. All static type systems perform some amount of type inference. For instance, consider the following expression

1 + 2

Here, the static type checker can infer that the result of value must be an integer value because the two operands are integer values, and hence + is interpreted as integer addition.

The situation is more complicated if variables are involved:

x + y

In many languages, the operator + is overloaded and can mean different things depending on the types of its operands. For instance, in Scala and Java if x is a string, + is interpreted as string concatenation. On the other hand, if x is a Double it is interpreted as floating point addition, etc. This is one of the reasons why Scala and Java require type annotations for the parameters of functions, as the compiler needs this information to be able to resolve which overloaded variant of an operator is meant in each case.

In languages without operator overloading, the types of expressions like x + y can still be inferred automatically even if the types of the variables x and y are not explicitly specified. For instance, if + is not overloaded and always means integer addition, then the expression can only be well-typed if both x and y are integer values. One can thus infer the types of variables from how they are used in the code. Enabling this kind of more general type inference requires a carefully designed type system. The languages in the ML family, which includes OCaml, support this feature.

Type equivalence

Type equivalence concerns the question when two types are considered to be equal (i.e. represent the same set of values). One distinguishes two basic ways of how to define type equivalence: nominal typing and structural typing:

  • Nominal typing: two types are the same only if they have the same name (each type definition introduces a new type)

  • strict: aliases (i.e. declaring a type equal to another type) are distinct

  • loose: aliases are equivalent

  • Structural typing: two types are equivalent if they have the same structure. That is, they are structurally identical when viewed as terms constructed from primitive types and type constructors.

Most languages use a mixture of nominal and structural typing. For instance, in Scala, object types use nominal typing but generic types use structural typing.

Type compatibility

Type compatibility concerns the questions when a particular value (or expression) of type t can be used in a context that expects a different type s. From a denotational view of types this corresponds to the question whether the values described by t are a subset of the values described by s. This question is closely related to the notion of subtyping in object-oriented languages.

However, more generally, type compatibility may also concern cases where the representations of the values of the two types t and s differ, but there is a natural conversion from t to s. For instance, many languages allow integer values to be used in floating point operations. E.g. consider the following Scala expression

1.3 + 1

Here, the + operator will be interpreted as an operation on Double values because at least one of the operands is a double value. Hence, both operands are expected to be of type Double. However, the second operand is 1, which is of type Int. The representations of Int and Double values differ. Nevertheless, the expression is well-typed according to Scala's typing rules: the type Int is compatible with the type Double.

To account for the different representation of Int and Double values, the compiler will generate code that performs an implicit conversion of the Int value 1 to the Double value 1.0 (in this case, the conversion can be done statically at compile-time but this is not possible in general if e.g. 1 is replaced by an Int variable). This kind of implicit conversion code is referred to as a type coercion.

Determining if and when type coercions should be applied is a major design aspect that distinguishes the type systems of different programming languages.

We will start by studying these concepts related to types and type systems in more detail in the context of OCaml and we will later revisit typing-related issues in the context of object-oriented languages such as Scala.

OCaml's Type System

The type systems of ML-style languages to which OCaml belongs provide strong static type checking while eliminating most of the overhead that is normally associated with static type systems. In particular, for the most part, the programmer does not need to provide any explicit type annotations. Instead, the type checker of the compiler can automatically infer these types by analyzing the syntactic structure of the program. To this day, the languages in the ML family occupy a unique sweet spot in the design space of type systems by providing the benefits of static typing while retaining some of the most important benefits of dynamically typed languages.

OCaml's type system is derived from the so-called Hindley-Milner type system (also Damas-Hindley-Milner) or HM for short. HM is the foundation of the type systems used by all languages in the ML-family.

Polytypes and Monotypes

One distinguishing feature of HM is that it supports polymorphic types, or polytypes for short. A type expression t is a polytype if it contains type variables. We write type variables as letters prefixed by a single quote (e.g. 'a - pronounced "alpha").

Here are some examples of polytypes:

  • 'a -> 'b: Describes all functions that can take values of some arbitrary type 'a and map them to some other arbitrary type 'b.
  • 'a -> 'a: Describes all function that take values of some arbitrary type 'a and map them to another value of that type.
  • 'a -> int: Describes all function that take a value of some arbitrary type 'a and map them to an int.
  • 'a list: Describes all homogeneous lists storing values of some arbitrary type 'a.

Type that do not contain any type variables such as int -> int, int -> bool, etc. are called monotypes or ground types.

Type Compatibility and Unification

The first question we have to answer is how different types relate. A value v of type t is compatible with type s if v can be used in a context that expects a value of type t. In OCaml, this is true whenever the type t is more general than the type s. This definition may seem counter-intuitive at first. However, it captures the idea that a type t that is more general than another type s denotes a smaller set of values than s and, hence, every value of type t is also a value of type s.

For instance, the type 'a -> 'b is more general than both 'a -> 'a and int -> bool. On the other hand, the types 'a -> int and 'a -> bool are incompatible (i.e. neither is more general than the other) because int and bool are two distinct types.

Before we precisely define how to determine whether one type is more general than another type, let's see how this notion relates to compatibility. Consider the types t = 'a -> int and s = int -> int, then t is more general than s. Now take the expression f 3 + 1. This expression expects f to be a function of type int -> int because we are calling f with value 3 which is of type int and we are using the result of the call in an addition operation which also expects operands of type int. Now, if the actual type of f is 'a -> int, then this tells us that we can call f with any value we like, including int values. Moreover, the result is always going to be an int. Thus f also has type int -> int and it is safe to use f in the expression f 3 + 1.

We can formalize this idea of when one type is more general than another type using the notion of unification. However, we first need to introduce a few technical terms to understand this in detail.

A substitution σ for a type t is a mapping of the type variables occurring in t to types. Example: a possible substitution for the function type 'a -> 'b is the mapping σ = {'a ↦ int, 'b ↦ int}. Another one is σ'={'a ↦ 'a, 'b ↦ 'b}.

Applying a substitution σ to a type t, written tσ, results in the type t where all occurrences of type variables have been replaced according to σ. For instance, if t = 'a -> 'b and σ = {'a -> int, 'b -> int}, then tσ = int -> int.

Two types s and t are said to be unifiable if there exists a substitution σ for the type variables occurring in both s and t such that sσ = tσ. The substitution σ is called a unifier for s and t. Example: the (unique) unifier for s = 'a -> int and t = string -> 'b is σ = {'a ↦ string, 'b ↦ int}. On the other hand, there exists no unifier for t = 'a -> int and s = string -> 'a because string and int are distinct types.

A type t is more general than a type s if there exists a unifier σ for t and s such that tσ = s. Example: the type 'a -> 'b is more general than the type int -> 'b. On the other hand, int -> 'b is not more general than 'a -> 'b because there exists no substitution σ such that int -> 'bσ = 'a -> 'b (remember substitutions can only replace type variables like 'a and 'b but not primitive monotypes like int.

Type Inference

One remarkable feature of OCaml's type system is the ability to algorithmically infer the type t of any expression e and at the same time check whether e is well-typed. This yields all the benefits of a strong static type system without requiring the programmer to provide explicit type annotations in the program.

The algorithm is guided by the syntax of expressions, taking advantage of the fact that the constant literals, inbuilt operators, and value constructors from which expressions are built impose constraints on the types that the values of these expressions may have. Technically, the algorithm works by deriving a system of type equality constraints from every expression and then solving this constraint system by computing a most general unifier that satisfies all the type equalities simultaneously.

At a high-level, the algorithm can be summarized as follows. Given a program expression e whose type is to be inferred, perform the following steps:

  1. Associate a fresh type variable with each subexpression occurring in e. These type variables should be thought of as placeholders for the actual types that we are going to infer.

  2. Associate a fresh type variable with each free program variable occurring in e. These type variables are placeholders for the types of those program variables.

  3. Generate a set of type equality constraints from the syntactic structure of e. These typing constraints relate the introduced type variables with each other and impose restrictions on the types that they stand for.

  4. Solve the generated typing constraints. If a solution of the constraints exists (a unifier), the expression is well-typed and we can read off the types of all subexpressions (including e itself) from the computed solution. Otherwise, if no solution exists, e has a type error.

Rather than describing this algorithm formally, we explain it through a series of examples. A complete description of the algorithm can be found here as well as in Robin Milner's original article. A good textbook-treatment can be found in Ch. 22 of Types and Programming Languages. by Benjamin Pierce.

As a first example, let us consider the following OCaml expression e:

x + f 1

We perform the steps of the algorithm described above.

Step 1: associate a type variable with every subexpression of e:

x : 'a
f : 'b
1 : 'c
f 1 : 'd
x + f 1 : 'e

Step 2: associate a type variable with every free variable occurring in e:

x: 'x f: 'f

Step 3: generate typing constraints by analyzing the syntactic structure of the expression e.

From the top-level operator + in the expression x + f 1 we can infer that the types of the subexpressions x and f 1 must both be int because + assumes integer arguments in OCaml (floating point addition is given by the operator +.). Moreover, we conclude that the whole expression e must have type int, since + again produces an integer value. Hence, we must have:

'a = int
'd = int
'e = int

Similarly, from the subexpression f 1 whose top-level operation is a function application, we can infer that the type of f must be a function type whose argument type is the type of 1 and whose result type is the type of f 1. That is, we must have:

'b = 'c -> 'd

Finally, we look at the leaf expressions x, f, and 1. For x and f we obtain the constraints relating the type variables associated with those leaf expressions with the type variables associated with the corresponding program variables:

'x = 'a
'f = 'b

The leaf expression 1 is an integer constant whose type is int. Thus we have:

'c = int

Thus altogether, we have gathered the following constraints:

'a = int
'd = int
'e = int
'b = 'c -> 'd
'c = int
'x = 'a
'f = 'b

Step 4: solve the typing constraints.

Solving the constraints amounts to computing a unifier σ (a mapping from type variables to types) such that when σ is applied to both sides of each typing constraint makes the two sides syntactically equal. If such a unifier does not exist, then this means that at least one of the typing constraints cannot be satisfied, and there must be a type error in the expression e. Otherwise, once we have computed the unifier σ, we can read off the types of all the subexpressions of e directly from σ.

The problem of finding a unifier for a set of typing constraints is called a unification problem. We can compute a unifier σ from the set of constraints using a simple iterative algorithm. The algorithm starts with a trivial candidate solution σ₀ = ∅ and then processes the equality constraints one at a time in any order, extending σ₀ to an actual unifier of the whole set of constraints. If at any point, we encounter a constraint that we cannot solve (e.g. bool = int), then we abort and report a type error.

For our example, we process the constraints in the order given above, starting with the constraint

'a = int

For each constraint, we first apply our current candidate unifier, here σ₀, to both sides of the equality and then solve the resulting new equality constraint. In our first iteration of this algorithm, σ₀ is the empty substitution, so applying σ₀ to the constraint does not change the constraint. Thus, we solve

'a = int

Solving a single type equality constraint t1 = t2 for arbitrary type expressions t1 and t2 is done like this:

Case 1. If t1 and t2 are the exact same type expressions, then there is nothing to be done. We simply proceed to the next equality constraint using our current candidate unifier.

Case 2. If one of the two types t1 and t2 is a type variable, say t1 = 'a, then we first check whether 'a occurs in t2. If it does, then there is no solution to the typing constraints and we abort with a type error. The is referred to as the "occurs check" (more on this case later). Otherwise, we extend the current candidate unifier with the mapping 'a ↦ t2 and additionally replace any occurrence of 'a on the right side of a mapping in the current candidate unifier with t2.

Case 3. If t1 or t2 are both types constructed from the same type constructor, e.g., suppose t1 is of the form t11 -> t12 and t2 is of the form t21 -> t22, then we simply replace the constraint t1 = t2 with new constraints equating the corresponding component types on both sides: t11 = t21 and t12 = t22. Then we proceed with the new set of constraints and our current candidate unifier.

Case 4. In all other cases, we must have a type mismatch. That is, t1 and t2 must be both types constructed from some type constructor and those type constructors must be different. Hence, the two types cannot be made equal no matter what unifier we apply to them. Note that we can view primitive types like int and bool as type constructors that take no arguments. So a type mismatch like int = bool is also covered by this case.

In our example, the second case applies. Hence we extend σ₀ to

σ₁ = { 'a ↦ int }

and proceed to the next constraint. The next two constraints

'd = int
'e = int

are similar to the first one and we just update the candidate unifier to:

σ₃ = { 'a ↦ int, 'd ↦ int, 'e ↦ int }

The next constraint is

'b = 'c -> 'd

we apply σ₃ on both sides and obtain

'b = 'c -> int

and then update our candidate solution to

σ₄ = { 'a ↦ int, 'd ↦ int, 'e ↦ int, 'b ↦ 'c -> int }

After processing the remaining three constraints

'c = int
'x = 'a
'f = 'b

we obtain the following solution

σ = { 'a ↦ int, 'd ↦ int, 'e ↦ int, 'b ↦ int -> int, 'c ↦ int, 'x ↦ int, 'f ↦ int -> int }

Note that if we apply σ to all the original constraints derived from e, then we observe that σ is indeed a correct solution of the constraint system:

int = int
int = int
int = int
int -> int = int -> int
int = int
int = int
int = int

Moreover, we can now read off the inferred types for all subexpressions in e directly from σ. For instance, we have σ('e) = int, indicating that the type of expression e itself is int. Similarly, the inferred types for the variables x in e is σ('x)=int, and the type of variable f is σ('f)=int -> int. Thus, we had e.g. a larger expression that defines a function g in terms of the expression e like this:

let g x f = x + f 1

then we can immediately conclude that the type of g must be

g: int -> (int -> int) -> int

Detecting Type Errors

In the previous example, the type inference succeeded and we were able to show that the expression is well-typed. Let's look at an example where this is not the case. Consider the expression

if x then x + 1 else y

We proceed as in the previous example and first introduce type variables for all subexpressions:

x: 'a
x: 'b
1: 'c
x + 1: 'd
y: 'e
if x then x + 1 else y: 'f

Note that we introduce two type variables for x one for each occurrence of x in the expression.

Next, we associate a type variable with each program variable appearing in the expression:

x: 'x
y: 'y

Then we generate the typing constraints from all the subexpressions. From the first occurrence of x we infer

'a = 'x

From x + 1 and its subexpressions we infer:

'b = 'x
'b = int
'c = int
'd = int

From y we infer

'e = 'y

and from if x then x + 1 else y we infer

'a = bool
'f = 'd
'f = 'e

The first of the last three constraints enforces that the condition being checked, here x, must have type bool. The other two constraints enforce that the two branches of the conditional have the same type, which is also the type of the entire conditional expression (since the result value of the conditional is determined by the result value of the branches).

Thus, in summary, we have the constraints:

'a = 'x
'b = 'x
'b = int
'c = int
'd = int
'a = bool
'f = 'd
'f = 'e

Now we solve the constraints. Processing the first 5 constraints is fine and produces the candidate solution

σ₅ = { 'a ↦ int, 'b ↦ int, 'x ↦ int, 'c ↦ int, 'd ↦ int }

However, when we process the next constraint

'a = bool

and apply σ₅ to both sides, we get:

int = bool

which is unsolvable. Thus at this point we abort and report a type error.

If we look back at the original expression

if x then x + 1 else y

then we note that this expression uses x both in a context that expects it to be an int value and in an context that expects it to be of type bool. Since it can't be both at the same time, we have a type error. This is exactly what the type inference deduced for us.

Inferring Polymorphic Types

One of the great features of ML-style type inference is that solving the unification problem may produce a unifier that leaves some of the type variables unconstrained. Such solutions yield polymorphic types.

As an example consider again our polymorphic function flip that takes a curried function f and produces a new function that behaves like f but takes its arguments in reverse order:

let flip f x y = f y x

To infer the types we proceed as above and solve the type inference problem for the body expression

f y x

of the function.

Start with assigning type variables to all subexpressions:

f: 'a
y: 'b
x: 'c
f y: 'd
f x y: 'e

then assign type variables to all program variables:

f: 'f
x: 'x
y: 'y

Generate the constraints: from f, y, and f we infer

'a = 'f
'b = 'y
'c = 'x

From f y we infer

'a = 'b -> 'd

and from f y x we infer

'd = 'c -> 'e

Thus, in summary we have:

'a = 'f
'b = 'y
'c = 'x
'a = 'b -> 'd
'd = 'c -> 'e

Solving the constraints yields the unifier:

σ = { 'a ↦ 'y -> 'x -> 'e, 'b ↦ 'y, 'c ↦ 'x, 'f ↦ 'y -> 'x -> 'e, 'd ↦ 'x -> 'e }

Now from the definition of flip we know that the type of flip is

flip: 'f -> 'x -> 'y -> 'e

After applying σ to this type we get:

flip: ('y -> 'x -> 'e) -> 'x -> 'y -> 'e

This type tells us that flip takes a function of type 'y -> 'x -> 'e and returns a new function of type 'x -> 'y -> 'e, which captures that the arguments in the returned function are reversed. The type is parametric in the type variables 'x, 'y, and 'e. That is, we can safely apply flip to any function that is compatible with the type 'y -> 'x -> 'e, including e.g.

int -> int -> int
int -> bool -> bool
bool -> int -> int
(int -> bool) -> int -> int
...

Also, note that since the specific names of the type variables appearing in the inferred types do not matter, we can consistently rename them once the type inference is complete. For instance, if we rename 'y ↦ 'a, x ↦ 'b, and 'e ↦ 'c, then we get the type:

flip: ('a -> 'b -> 'c) -> 'b -> 'a -> 'c

This is exactly the type that the OCaml compiler infers for flip.

Limitations of Type System

HM guarantees at compile-time that a program will not go wrong when it is executed if it passes the type checker. Here, "going wrong" means that the program may get stuck, trying to use a value in a context it was not supposed to be used (e.g. try to treat an integer value as if it were a function that can be called). We thus get very strong correctness guarantees from the compiler. However, as with any static type system, these guarantees come at a certain cost, namely that the type checker will reject some programs even though they will not fail when they are executed. That is, the type checker will reject all bad programs but also some good ones.

One crucial restriction of the type system has to do with polymorphism. Only variables that are introduced by let bindings can have polymorphic types. For instance the following Ocaml expression cannot be typed:

let g f = (f true, f 0) in
let id x = x in
g id

The problem here is that the function f in g is not introduced by a let and therefore can't have a polymorphic type itself. Only g's type can be polymorphic. That is, once g has been applied to a particular function f, the type of that f is treated monomorphically. In contrast, the following expression is OK:

let f x = x in
(f true, f 0)

Note that we here provide a specific polymorphic implementation of the function f which we bind with a let, rather than parameterizing over the implementation of f as in g above.

Due to the nature of this restriction one also refers to the form of polymorphism supported by HM as let-polymorphism. Let-polymorphism is less expressive than the kind of polymorphism supported by other static type systems, specifically generics in languages like Java, C#, and Scala. This problem also does not arise in dynamically typed languages like JavaScript. For instance, the corresponding JavaScript program will execute just fine:

function g(f) {
  return [f(true), f(0)];
}
function id(x) { return x; }
g(id);

The restriction imposed by let-polymorphism is critical for making static type inference possible. In particular, Java, C#, and Scala do not support general static type inference because their type systems are more expressive than HM. This is an example of a trade-off between static correctness guarantees, automation, and expressivity in the design of a type system where different languages make different design choices.

Another limitation of HM has to do with how the type inference algorithm infers the types of recursive functions. Specifically, this relates to the "occurs check" during the computation of the unifier in Case 2 of the algorithm to solve type equations.

To illustrate the issue, consider the following expression

let rec print_more x =
  let _ = print_endline x in
  print_more

This function can be viewed as an output channel that we can simply feed more and more values which are then just printed on standard output:

print_more "Hello" "World" "how" "are" "you"

While the function print_more should behave just fine when it executes (and e.g. a corresponding JavaScript version indeed works as expected), it is rejected by the OCaml type checker:

# let rec print_more x =
    let _ = print_endline x in
    print_more;;
line 3, characters 4-14:
Error: This expression has type string -> 'a
       but an expression was expected of type 'a
       The type variable 'a occurs inside string -> 'a

To understand how this problem arises, consider the following slightly simplified version of print_more:

let rec hungry x = hungry

If we apply our type inference algorithm, we generate the following type variables for the parameters of the recursive definition:

hungry: 'h
x: 'x

and the following typing constraints from the actual definition:

'h = 'x -> 'a
'a = 'h

After solving the first constraint we obtain the candidate solution

σ₁ = { 'h ↦ 'x -> 'a }

which when applied to the second constraint yields

'a = 'x -> 'a

At this point we cannot extend σ₁ with a mapping for 'a because the occurs check fails: 'a occurs in the type 'x -> 'a on the right-hand side of the equation.

The equation

'a = 'x -> 'a

tells us that the return type of hungry is the same type as the type of hungry itself. However, such recursive types cannot be expressed in OCaml's type system directly, for otherwise, the type inference problem would again become unsolvable. There is a workaround for this restriction: OCaml allows recursive types to be introduced explicitly. However, this workaround slightly complicates the implementation of the function hungry.

Type Inhabitation

The type inference problem takes an expression as input and returns the type that makes that expression well-typed (if such a type exists). We can also look at the inverse problem. That is, given a type, the goal is to infer an expression that has that type. This problem is referred to as the type inhabitation problem. Type inhabitation has many interesting applications (e.g. it is used by some IDEs to implement automatic code completion like IntelliSense).

As an example, consider the following algebraic data type:

type ('a, 'b) either =
  | Left of 'a
  | Right of 'b

The type ('a, 'b) either is parametric in two types: 'a and 'b. It's values take the form Left a where a is a value of type 'a and Right b where b is a value of type 'b. That is, ('a, 'b) either can be viewed as the disjoint union of the types 'a and 'b.

Now let us consider the following type signature of a purported function f:

f: ('a, 'b) either * ('a, 'c) either -> ('a, 'b * 'c) either

Our goal is to infer a definition of a function f that has this type.

We can approach this problem systematically by analyzing the structure of the type expression. First note that f is a function that takes a parameter of type ('a, 'b) either * ('a, 'b) either and returns a value of type ('a, 'b * 'c) either. Since the former is a product type, it denotes pairs of values. Hence, we can further split this parameter into two parts denoting the first and second component of the pair. That is, the basic shape of the definition of f that we seek must be of the form:

let f (x, y) = e

for some expression e such that

x: ('a, 'b) either 
y: ('a, 'c) either
e: ('a, 'b * 'c) either

Now to derive the expression e, let us analyze the shape of the values that inhabit the type ('a, 'b * 'c) either. These take the form Left a for some value a of type 'a, respectively, Right (b, c) for some values b: 'b and c: 'c. We thus seek an expression e that constructs such values from the parameters x and y of f.

Now, the required type ('a, 'b) either has values of the shape Left a1 and Left b for values a1: 'a and b: 'b. Likewise, y has values Left a2 and Right c for a2: 'a and c: 'c. To construct values of the shape Left a or Right (b, c) we just need to consider all the possible combinations in which the values of the types of x and y can occur. This can be done with pattern matching, which leads to the following definition of the function f:

let f (x, y) =
  match x, y with
  | Left a1, Left a2 -> Left a1 (** Left a2 would also work *)
  | Left a1, Right c -> Left a1
  | Right b, Left a2 -> Left a2
  | Right b, Right c -> Right (b, c)

Note that the match cases enumerate all the possible combinations of the values that x and y can take. Moreover, in each case, we return a value of type ('a, 'b * 'c) either.

The above definition can be further simplified as follows:

let f (x, y) =
  match x, y with
  | Left a, _ | _, Left a -> Left a
  | Right b, Right c -> Right (b, c)

Note that the definition of f could be inferred using only deduction based on the information that the type signatures provide. There was no guessing involved. The goal of this exercise is to train your intuition in how to use these polymorphic types.

One observation that might help understanding this problem better is to look at it from a different perspective by exploiting something that is known as Curry-Howard Correspondence. Roughly speaking, the Curry-Howard Correspondence establishes an equivalence between inferring an expression for a given type and the problem of proving the validity of a logical formula obtained from that type.

Specifically, the polymorphic types that we are dealing with in this problem can be interpreted as formulas in propositional logic where

  • Type variables like 'a and 'b correspond to proposition variables that take on the truth values true and false.

  • Function types 'a -> 'b correspond to logical implications (written 'a => 'b). Here, 'a => 'b states that if 'a is true then so is 'b.

  • Product types 'a * 'b correspond to logical conjunctions (written 'a && 'b). Here 'a && 'b is true iff both 'a and 'b is true.

  • Either types ('a, 'b) either correspond to logical disjunctions (written 'a || 'b). Here 'a || 'b is true iff one of 'a or 'b is true.

Hence, the type

('a, 'b) either * ('a, 'c) either -> ('a, 'b * 'c) either

that we considered above can be viewed as the propositional formula

('a || 'b) && ('a || 'c) => 'a || 'b && 'c

which states that if 'a || 'b is true and 'a || 'c is true then 'a is true or both 'b and 'c are true. In other words, logical conjunction distributes over logical disjunction. This formula is valid, i.e., it evaluates to true regardless of what truth values we assign to 'a, 'b, and 'c.

This property can be proven using logical deduction. To this end, we start by assuming that the left-hand side of the implication is true, which because it is a conjunction means that both ('a || 'b) and ('a || 'c) must be true. We now do case analysis on what we know about these two formulas. From the fact that ('a || 'b) is true we know that either 'a is true or 'b is 'true:

  • Case 1 ('a' is true): then it immediately follows that ('a || 'b && 'c) is also true.

  • Case 2 ('b is true): then we case split on ('a || 'c):

    • Case 2.1 ('a is true): again it immediately follows that ('a || 'b && 'c) is also true.

    • Case 2.2 ('c is true): then 'b && 'c is true and hence also ('a || 'b && 'c).

Notice the similarity between the structure of this proof and the structure of the definition of the function f. Assuming the premise ('a || 'b) && ('a || 'c) of the implication and splitting it into ('a || 'b) and ('a || 'c) corresponds to assuming that we are given the parameter of the function f and then split it into its components x and y. Likewise, the case analysis on (a || 'b) and ('a || 'c) corresponds to the case analysis in the match expression.

That is, when solving the problem of inferring a definition for the function f that has the given type, we can solve this problem in logic by proving the validity of the formula corresponding to f's purported type and then translate that proof to a corresponding OCaml definition.

g: ('a, 'b) either * ('a -> 'c) * ('b -> 'c) -> 'c

can be viewed as the propositional formula

('a || 'b) && ('a => 'c) && ('b => 'c) => 'c

which states that if

(1) 'a or 'b is true, and

(2) 'a implies 'c, and

(3) 'b implies 'c

then 'c is true

Again, this formula is valid, which can be seen as follows: because 'a or 'b is true according to (1), we can case split on the two cases:

  • Case 1 ('a is true): Then it follows from (2) that 'c is true.

  • Case 2 ('b is true). Then, it follows from (3) that 'c is true.

By constructing an OCaml term that has type 'c from values that represent the premises (1), (2), and (3), you are effectively constructing a mathematical proof as described above. In this proof, pattern matching on a value of type ('a, 'b) either corresponds to case splitting on a valid disjunction ('a || 'b). Similarly, applying a function ('a -> 'c) to an argument of type 'a corresponds to, modus ponens, i.e. the logical deduction step where the validity of the conclusion 'c of a valid implication ('a => 'c) is inferred from the validity of its premise 'a. So using the above proof, you can directly obtain the definition of the function g that has the given type.