In this project I implement an interpreter for a ML dialect SimPL.
The implementation is pretty straightforward given the specification and the skeleton, but there are some points that are worth mentioning.
For type checker I refers to this blog for most of my implementation, which demonstrates the development of Rémy's algorithm for OCaml type inference.
The T-LetPoly rule is simple but rather inefficient for implementation, so explicit polymorphic types should be introduced, which brings the procedure of type generalization and instantiation.
The instantiation is just replacing all type quantifiers with fresh type variables, but the generalization should be taken care of since type variables still in the context are not supposed to be generalized since generalization here will break the constraints.
The naive solution is scanning the environment, but Rémy came up with a more efficient way introducing generalization levels, relaxing environment traversing to integer comparison. The original idea comes from the similarity between memory management and type generalization (ownership of memory region/type variable).
Actually we are going to implement a special case of type class, supporting ad hoc polymorphism for equality check.
A common technique to implement type classes is dictionary passing style, where each type records what classes it belongs to.
But the original definition with equality type have a bit problem. The T-NIL rule:
According to that rule, the type of fold
and map
.
So I relax this constraint, requiring it to be equality type only when it is compared to other expressions. It is also the usual way how type classes are implemented.
I also implement some extra features which are common in functional programming languages.
The classical sum type.
Syntax:
Type definition:
Typing rules:
Value definition:
Evaluation rules:
Example:
(* new.case.spl *)
let f = fn b => fn x => fn y =>
if b then inl x else inr y
in
(f true () 1, f false () 1)
end
(* : (unit + int) * (unit + int)
==> pair@inl@unit@inr@1 *)
After relaxing the type constraint on
Syntax:
Typing rule:
Evaluation rules:
Example:
(* new.fold.spl *)
let foldl = rec foldl =>
fn f => fn x => fn l =>
case l of
nil => x
| h :: t => foldl f (f x h) t
in
foldl (fn x => fn y => x + y) 0 (1::2::3::4::nil)
end
(* : int
==> 10 *)
Mutual recursion in ML dialects usually have two forms. The first form is mutual recursion bindings, like let rec ... and ...
in OCaml. The second form is in the original Y combinator style where a recursion is an expression rather than a binding. I see that SimPL prefers the latter so I implement mutual recursion in that form.
For syntax, I refer to similar terms in Coq like fix <ident> := ... with <ident> := ... for <ident>
, where multiple values are defined and one of them is returned. The other values should be considered and designed as helpers since they can not be accessed outside, which is also a limitation of mutual recursion in the second form.
The original recursion syntax is modified to:
I sacrifices the for <ident>
part for a uniform syntax and make the parser generator happy. The first value is returned. But in the following part I will use
The typing rule is modified to:
Value definition:
Here
Evaluation rules:
Example:
(* new.mutualrec.spl *)
let iseven =
rec e => fn x => if iszero x then true else o (x-1)
with o => fn x => if iszero (pred x) then true else e (x-1)
in
iseven 6
end
(* : bool
==> true *)
I implement a classical mark-and-sweep garbage collector for reference cells.
public int allocate(Env e, Value v) {
if (mem.size() >= gcThreshold) {
gc(e);
gcThreshold = 2 * mem.size();
}
int p = pointerGen;
pointerGen += 1;
mem.put(p, v);
return p;
}
Each time the interpreter allocates memory cell for value, if the current active cells exceeds the threshold then the garbage collector will scan the environment and collect the unreachable cells. The next stage threshold will be the double of number of active cells after garbage collection.
Make arbitrary part of the expression can be lazily evaluated is a little complex. Implementing it in a naive way have some disadvantages:
- Explicit forces across the interpreter, which will result in a verbose code.
- Performance penalty. Forced thunks is hard to be shared when arbitrary part of the expression can be lazily evaluated, resulting in multiple evaluations of the same expression.
One way to solve the above problems is using graph reduction technique. But it requires a complete refactor of the interpreter architecture. For simplicity without a great loss of performance and functionality, lazy bindings and functions evaluating arguments lazily (which is actually a lazy binding) can be introduced.
The syntax:
The typing rules are similar to those of
Value definition:
Evaluation rules:
In the last rule
In real implementation, while the environments originally behaves like immutable linked lists, it is actually a persistent data structure where two environments share the most common parts of name-value mapping, which enables me to implement local environment updating easily without changing previous code.
Example1: the lazy binding will not be evaluated if not accessed.
(* new.lazy1.spl *)
let const = fn x => lazyfn y => x in
lazy a = hd nil in
const () (hd nil)
end
end
(* : unit
==> unit *)
Example2: the lazy binding will be evaluated at most once.
(* new.lazy2.spl *)
let x = ref 0 in
lazy y = (x := !x + 1) in
y;y;!x
end
end
(* : int
==> 1 *)
Sieve of Eratosthenes:
(* new.primes.spl *)
let filter = rec filter =>
fn f => fn l =>
case l of
nil => nil
| h :: t =>
let rest = filter f t in
if (f h) then (h :: rest) else rest
end
in
let fromTo = rec fromTo =>
fn m => fn n =>
if (m > n) then nil else (m :: (fromTo (succ m) n))
in
let sieve = rec sieve =>
fn l =>
case l of
nil => nil
| p :: rest =>
p :: (sieve (filter (fn n => n % p <> 0) rest))
in
let primes = fn n =>
sieve (fromTo 2 n)
in
primes 1000
end end end end
(* : int list
==> list@168 *)
(* prime list in [2,1000] *)
N-queens:
(* new.8queen.spl *)
let foldr = rec foldr =>
fn f => fn x => fn l =>
case l of
nil => x
| h :: t => f h (foldr f x t)
in
let map = rec map =>
fn f => fn l =>
case l of
nil => nil
| h :: t => (f h) :: (map f t)
in
let filter = rec filter =>
fn f => fn l =>
case l of
nil => nil
| h :: t =>
let rest = filter f t in
if (f h) then (h :: rest) else rest
end
in
let append = rec append =>
fn l1 => fn l2 =>
case l1 of
nil => l2
| h :: t => h :: (append t l2)
in
let concat = foldr append nil in
let concatMap = fn f => fn l => concat (map f l) in
let fromTo = rec fromTo =>
fn m => fn n =>
if (m > n) then nil else (m :: (fromTo (succ m) n))
in
let queens = fn n =>
let loop =
rec loop => fn boards => fn counter =>
if (counter = n)
then boards
else (loop (concatMap expand boards) (succ counter))
with expand => fn board =>
map (fn x => x :: board) (filter (fn x => safe x board 1) (fromTo 1 n))
with safe => fn x => fn l => fn n =>
case l of
nil => true
| c :: y => (x <> c) andalso (x <> (c + n)) andalso (x <> (c - n))
andalso (safe x y (succ n))
in
loop (nil::nil) 0
end
in
queens 8
end end end end end end end end
(* : int list list
==> list@92 *)
(* all solutions for 8-queen problem
92 solutions
each is a int list of positions at each row
*)