/Caramel

A modern syntax for the λ-calculus.

Primary LanguageHaskellMIT LicenseMIT

Caramel

      🍮 A modern syntax for the oldest programming language in the world. 🍮

Caramel is a set of bidirectional, Haskell-inspired syntax-sugars that are expanded to, and contracted from, λ-Calculus terms. Caramel is not a new programming language - it is a new syntax for an old language, enabling it to be written in a much saner way. The implementation aims to be simple and terse and currently stands at around 350 lines of commented Haskell code.

Example

The following Caramel (.mel) file implements QuickSort on the Lambda Calculus:

-- example_qsort.mel

qsort          = (match cons nil)
    nil        = []
    cons x xs  = (flatten [smalls,[x],bigs])
        smalls = (qsort (filter (< x) xs))
        bigs   = (qsort (filter (> x) xs))

qsort_example = (fix qsort [5,6,1,4,3,7,2])

You can run it by typing mel qsort_example on the Prelude directory. It looks for the qsort_example definition through all .mel files on the dir, expands it to a pure Lambda Calculus term, evaluates (to beta normal form), and then prints the result on the Caramel syntax:

$ cd Prelude
$ mel qsort_example
[1,2,3,4,5,6,7]

If you want to see the actual Lambda Calculus code, just append .lam to the name of the term you want to evaluate - it will output the unsugared view of the term, using De Bruijn Index:

$ mel qsort_example.lam
λλ((1 λλ(1 0)) 
  ((1 λλ(1 (1 0))) 
  ((1 λλ(1 (1 (1 0))))
  ((1 λλ(1 (1 (1 (1 0))))) 
  ((1 λλ(1 (1 (1 (1 (1 0)))))) 
  ((1 λλ(1 (1 (1 (1 (1 (1 0))))))) 
  ((1 λλ(1 (1 (1 (1 (1 (1 (1 0)))))))) 
    0 )))))))
# (This is how [1,2,3,4,5,6,7] is usually represented on the Lambda Calculus.)
# (Identation added manually.)

More examples can be found on the Prelude directory, with the "example" prefix. I suggest looking at this one first. See example_qsort.mel for more on the QuickSort example. Mind most of the remaining of files there were written before Caramel and need some refactoring to better show Caramel's syntax.

Transmogrifiers

Transmogrifiers convert Lambda Calculus programs to popular programming languages, allowing Caramel code to be used in almost any environment. To invoke a transmogrifier, use mel your_term.<ext>, where <ext> is the extension of the target language's files. This example translates the λ-calculus number 4 to different languages:

# JavaScript
$ mel 4.js
(function(a){return (function(b){return a(a(a(a(b))))})})

# Python
$ mel 4.py
(lambda a: (lambda b: a(a(a(a(b))))))

# Ruby
$ mel 4.rb
(->(a){(->(b){a.(a.(a.(a.(b))))})})

# Lua
$ mel 4.lua
(function (a) return (function (b) return a(a(a(a(b)))) end) end)

# Scheme
$ mel 4.scm
(lambda(a)(lambda(b)(a (a (a (a b))))))

# Haskell
$ mel 4.hs
(let (#) = unsafeCoerce in (\a->(\b->(a#(a#(a#(a#b)))))))

# John Tromp's Binary Lambda Calculus
$ mel 4.blc
000001100110011001100

Here is how you can call the fib definition on Prelude from JavaScript:

// Utils to convert between native Numbers and Lambda-encoded numbers.
var lambda = require("IO/lambda.js");

// Obtained using `mel fib.js`
var fib = lambda.natFn(function(a){return a((function(b){return b((function(c){return (function(d){return (function(e){return e(d)((function(f){return (function(g){return c(f)(d(f)(g))})}))})})}))}))((function(b){return b((function(c){return (function(d){return d})}))((function(c){return (function(d){return c(d)})}))}))((function(b){return (function(c){return b})}))})

// Outputs 55
console.log(fib(10))

See IO directory for more information in how to use Caramel functions inside different environments.

Featured syntax-sugars

Lambdas

Lambda Calculus's lambdas are anonymous, single argument functions. Caramel's lambda syntax allows creating anonymous functions with multiple named variables separated from the body by an arrow.

Example:

(a b c -> (a (b c)))

Expands to/from the λ-calculus as:

λλλ(2 (1 0))

(Remember numbers on the λ-calculus expressions are bruijn-indexed variables, not natural numbers.)

Application

Lambda Calculus applications substitutes bound variables of a lambda abstraction by the applied term. Caramel's application syntax uses parenthesis and allows you to omit redundant ones. Using f, x and y as placeholders:

(f x y z)

Expands to/from:

(((f x) y) z)

Let and variable assignment

Let expressions are syntax for the assignment of variables. As opposed to Haskell, the let keyword isn't used, but ;-separated assignments enclosed by curly brackets, followed by the expression on which those definitions are visible. They also allow named function definitions exactly like Haskell. They are recursive, so you don't have to care about the order you write them. Self-referential terms aren't made recursive - they instead gain an extra bound variable in order to be used with fixed-point combinators. Let expressions are expanded to lambda applications.

{double   = (mul 2); 
 square x = (mul x x); 
 (double (square 3))}

Is the same as:

((double -> ((square -> (double (square 3))) (x -> (mul x x)))) (mul 2))

Where and Layout syntax

Properly idented newlines after a term are interpreted as local definitions and nicely expanded to "let" expressions. This enables a layout syntax very similar to Haskell's where clauses, although without the where keyword.

(double (square 3)) 
    double   = (mul 2)
    square x = (mul x x)

This is the same as the let expression above. See this example for more info.

Top-level definitions

Top-level definitions are just expanded to the implicit where (and, thus, let) syntax. Example:

-- some top level variables
a = 3
b = 2

-- an example program
example_tld = (mul a b)

By calling mel example_tld, the above program is expanded as follows:

example_tld
    a = 3
    b = 2
    example_tld = (mul a b)
...
{a = 3; b = 2; example_tld = (mul a b); example_tld}
...
((a -> ((b -> ((example_tld -> example_tld) (mul a b))) 2)) 3)
...
(λ(λ(λ(λ0 ((2 λλ(1 (1 (1 0)))) λλ(1 (1 0)))) λλ(1 (1 0))) λλ(1 (1 (1 0)))) λλλ(2 (1 0)))
...
λλ(1 (1 (1 (1 (1 (1 0))))))

Which is printed as 6, the result of 3 * 2.

Natural Numbers

The simplest implementation of natural numbers on the Lambda Calculus is the church encoding. Caramell's Nat syntax is just the number literal in ASCII, which is expanded to church-encoded numbers, allowing you to input them on the lambda calculus without dealing with the encoding yourself. It is just an input method - church numbers can be converted to any format at compile time, in case you don't want them.

3

Expands to/from the Lambda Calculus as:

λλ(1 (1 (1 0)))

Lists

The simplest implementation of lists on the Lambda Calculus is, too, the church encoding. Similarly to natural numbers, Camamell's Lst syntax is the same as Haskell and allows you to input lists on the Lambda Calculus without having to deal with the church encoding. After inputing, those can always be converted to any format you like, such as the Scott Encoding. Using, a, b and c as placeholders:

[a, b, c]

Is the same as:

(cons nil -> (cons a (cons b (cons c nil))))

And expands to/from the Lambda Calculus as:

λλ(1 a (1 b (1 c 0)))

Tuples

Very similarly to natural numbers and lists, tuples have a natural implementation based on the Church encoding. Caramel's syntax for tuples is very similar to the application syntax, in that it uses parenthesis. The key difference is the presence of commas - with commas, it is a tuple, without them, it is an application. If it has only one element (e.g, (7)), is is an 1-tuple, not a redundant paren.

(a, b, c)

Is the same as:

(t -> (t a b c))

And expands to/from the Lambda Calculus as:

λ(((0 a) b) c)

Chars

Chars can be encoded as 8-tuples of booleans. For example, the char 'a', which is 97 in ASCII and 01100001 in binary, can be represented as tuple (t -> (t F T T F F F F T)). The syntax is the usual:

'a'

Which is expanded to/from the Lambda Calculus as:

λ((((((((0 F) T) T) F) F) F) F) T)

Where F = λλ0 (false) and T = λλ1 (true).

Strings

Strings are just lists of chars.

"abcd"

The program above is expanded as:

['a', 'b', 'c', 'd']
...
(cons nil -> (cons 'a' (cons 'b' (cons 'c' (cons 'd' nil)))))
...
λλ  ((T λ((((((((0 F) T) T) F) F) F) F) T))
    ((T λ((((((((0 F) T) T) F) F) F) T) F))
    ((T λ((((((((0 F) T) T) F) F) F) T) T)) 
    ((T λ((((((((0 F) T) T) F) F) T) F) F)) 
        0 ))))

Where F = λλ0 (false) and T = λλ1 (true).

Words

Words are 32-bit unsigned inters. The encoding is the same as chars, except the function of 2 arguments returns a 32-tuple instead of an 8-tuple. The syntax is a hash symbol (#) followed by a number literal. It is a minor shortcut since one could already write a lambda calculus function called # which would convert a church number to a word, and just write (# 123) instead of #123. 123 in binary is 00000000 00000000 00000000 01111011, so...

#123

Expands to/from the Lambda Calculus as:

λ((((((((((((((((((((((((((((((((0
    F) F) F) F) F) F) F) F) 
    F) F) F) F) F) F) F) F) 
    F) F) F) F) F) F) F) F) 
    F) T) T) T) T) F) T) T)

Where F = λλ0 (false) and T = λλ1 (true).

Comments

Anything between double hyphen (--) and a newline is ignored by the compiler.

-- Hello I'm a comment
a = 7 -- I'm too

ADTs

First-class Algebraic DataTypes (ADTs) are experimental and the most complex feature here. The point is that defining functions for a datatype on the λ-calculus is hard when you have to deal with church/scott encodings yourself. Combining ADTs with high-order derivers, many functions such as (on the List case) cons, nil (constructors), head, tail (getters), map, zipWith (some algorithms), as well as lenses, monads and so on, can be derived automatically. While this already works (for Scott encodings only), it is very likely that my design is imperfect and there are better solutions. Example:

In Haskell, we have:

data Bool   = True | False deriving Show
data List a = Cons { head :: a, tail :: List a } | Nil deriving Show
data Tree a = Node a (List (Tree a)) deriving Show
type TLB    = Tree (List Bool)

In Caramel, we have:

Bool   = #{True {} | False {}}
List a = #{Cons {head : a, tail : *} | Nil {}}
Tree a = #{Tree {tag : a, children : (List *)}}
TLB    = (Tree (List Bool))

Notice recursion uses the special * character, repeated n times, which works like bruijn indexed variables, refering to the nth enclosing complete ADT. Polymorphic types are just functions returning ADTs. Also, the syntax does not (as a design principle) create any top level definition. We can get many free functions for those datatypes using high-order derivers such as Ctor, Show, Match, Getter, Fold.

Bool  = #{True | False}
True  = (Ctor 0 Bool)
False = (Ctor 1 Bool) 
show  = (Show Bool)
if    = (Match Bool)
... and so on

Note this has nothing to do with types or typechecking.

Consult Prelude/derivers.mel for the available derivers.