MiniML

Joshua P. Jacob, Olavi Aikas

In this project, we consider a (very small) subset of the ML language, and consider the evaluation and the typing of programs. Taking this project should, as a side effect, help understand better how OCaml programs actually evaluate.

The base types and the structure of the project

Before starting to work, it is important that you understand well the basic data types that are supplied to you. The archive (miniml.zip, provided here) contains:

The definition of the abstract syntax tree (AST) in ast.ml;
A lexer (in lexer.mll) that allows to read tokens from an input stream;
A parser (in parser.mly) that can process a full program text, and produce an AST;
An entry point file (main.ml) that currently does not do much, but that you may enrich as you progress in the project;
A very basic makefile, which should either let you automate compilation completely (if you have make) or at least show you how to compile and link everything together (if you do not have make or prefer to use another build system).

The main thing to look at first is ast.ml, as it contains the definition of the abstract syntax of programs. All your code will use these definitions, thus, please start by reading this file carefully. Definitions are documented with comments that should be self explanatory. The main thing to keep in mind is that the programs that are considered here form a small subset of the OCaml language itself, so you should be able to map each construction to the right OCaml program.

Essentially, the MiniML fragment supports booleans, integers, and pair types. It allows to define local names (let .. in ..), non-recursive and recursive functions, and basic arithmetic operations. The access to the elements of a pair is done using built-in fst and snd functions. A program typically consists of a series of declarations of constants and functions, and a final expression that produces the value of the program.

The lexer and parser definitions use a different syntax, but you may want to look at parser.mly as it simply defines a grammar. Intuitively, the syntax of programs is close to that of OCaml programs, but does not cover it in its full generality. For instance, the definition of the function that maps an integer to its successor and its binding to a name writes down let f = fun (x: int) -> x + 1. Moreover, the examples directory contains a few examples, that should give you an idea of what programs look like. You are expected to add more examples whenever you consider it appropriate.

The main.ml file will be a starting point. You will need to fill it with calls to functions for printing, evaluation and typing.

Definition of a pretty-printer

A good way to familiarize yourself with the abstract syntax tree of a language (and with its concrete syntax too) is to write a pretty-printer, which is a function or a set of function that outputs a program. Furthermore, this pretty-printer will be used for displaying results, for error report, etc…

Read the documentation of the Printf module in the OCaml standard library; it contains a series of functions which are most useful to quickly write pretty-printers. The Printf.fprintf function should be most useful.
Create a file printer.ml for implementation and a file printer.mli for interface (you will maintain consistency between both files).
For each type in ast.ml, write a pretty-printing function for it, which inputs an out_channel, a value of that type, and performs the output. More precisely, for a type t, you are expected to write a function p_t: out_channel -> t -> unit. You should try to remain close to the concrete syntax of OCaml programs.
Call p_expr from the main function.
Test your pretty-printer on examples.

Interpreter: evaluation of programs.

We now consider the evaluation of programs.

Before we evaluate programs, we need to define some data-types. Indeed, we will need an environment to bind names to values. Furthermore, values may be either base values (integers or booleans), pairs, or functions (defined by the list of their arguments, their body, and the environment they should evaluate in). We start with these definitions:

Create files eval.ml and eval.mli. We leave it up to you to decide what to add in eval.mli
Look at the documentation of the Map module, and propose a way to construct a module that defines finite domain functions from strings to any type (hint: this definition will take a handful of lines at most).
Define types for values and for environments.
Write printing functions (along the same lines as in the previous part).

We are now ready to start the interpreter. For now, we do not consider programs with recursion.

Write functions eval_unary and eval_binary for the evaluation of unary and binary operators (we leave the choice of their types).
Create a function eval_expr that inputs an environment and an expression, and that evaluates it.
Update the main.ml file and test your interpreter. You should add more test cases than the examples that are provided.

Typing

As you have probably noticed, it is possible to write programs that fail to evaluate due to inconsistent types. OCaml would reject such programs. An example is given in the file wrong.ml.

The typing has some similarity with the evaluation. Indeed, we also need to remember what type we give to a name, thus we still need to define a notion of typing environment.

Create files type.ml and type.mli.
Define the type of typing environments.

At first, we assume that programs do not contain recursive functions.

Write a function type_unary that takes a unary operator and a type and produces either the return type or an exception if the application of the operator to a value of that type is impossible.
Write a function type_binary in a similar way.
Write the main typing function type_expr. This function should input a typing environment and an expression. It should either return its type (when the expression can be typed) or raise an exception when there is a type error. You may need to write the typing rules on paper before implementing them.
Call the typer from the main function, and check that it rejects all the programs that you found to cause the evaluation to crash due to type problems.

We now consider recursion. This part is more difficult and comes as a last question, thus we do not provide very precise guidelines. Roughly speaking, recursion brings two issues to typing. First, we need to manage the typing environment differently to be consistent with the existence of recursive calls. Second, the type of the result of a recursive function may not be checkable immediately. As an example, consider let rec f = (x: int) -> if x < 0 then -x else f (-x).

Propose solutions to these two issues.
Fix your typer and test it.