/tlp

Experimenting with making a programming language

Tulip

This is a compiled, statically typed, stack-based toy language inspired by Porth. The goal of this project is for me to explore the basics of compiler and language development.

Ultimately, I'm aiming to make the compiler self-hosted.

NOTE: THIS LANGUAGE IS NO LONGER UNDER ACTIVE DEVELOPMENT

  • See about.md for a breakdown of the language and why I'm leaving it here.

Examples

Below are a few examples to demonstrate some of the basic concepts of the language.

Hello world:

use "std.tlp"

"Hello World\n" puts

Print numbers 0 to 99:

0 while dup 100 < do
    dup putu
    1 +
end drop

Quick Start

Until the compiler is self-hosted, use the Python 3 compiler.

Requirements:

  • Python 3.7+
  • NASM 2.13+

Compiling and running a Tulip program

Compile the fib.tlp example program. This generates an executable file named output.

python3 tulip.py examples/fib.tlp

Run the program with

./output

Running the tests

I've got a number of tests (there's still a bunch missing) to make sure that the compiler is working properly. The ci.py program runs each test and makes sure the output matches the expected value.

python3 ci.py
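
The idea of comparing a program's output against an expected value can be sketched in a few lines of Python. This is only an illustration, not the contents of ci.py: the run_case helper is hypothetical, and a stand-in command is used in place of a compiled Tulip program.

```python
import subprocess
import sys


def run_case(command, expected_stdout):
    """Run one command and report whether its stdout matches the expectation."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout == expected_stdout


# Hypothetical test case: a Python one-liner stands in for a compiled program.
command = [sys.executable, "-c", "print('Hello World')"]
passed = run_case(command, "Hello World\n")
print("PASS" if passed else "FAIL")
```

A real runner would loop over every test program, compile it first, and collect the failures instead of printing a single result.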

Language Overview

This is a brief overview of the features currently in the language. I'll try to keep this up to date as new features are introduced.

Literals

Integers

A sequence of digits is treated as an unsigned integer and pushed onto the stack.

10 20 +

Booleans

true and false are parsed as booleans and are represented with 1 and 0 respectively.

Booleans are treated separately from integers, so the following code would not compile:

1 true +
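
The rule can be modelled with a tiny type checker that tracks type names instead of values. This is a sketch under simplifying assumptions, not how tulip.py is written: check_types is a hypothetical helper and only literals and + are handled.

```python
def check_types(tokens):
    """Return the type stack left by a token sequence, or raise TypeError."""
    stack = []
    for tok in tokens:
        if tok.isdigit():
            stack.append("int")
        elif tok in ("true", "false"):
            stack.append("bool")
        elif tok == "+":
            if len(stack) < 2:
                raise TypeError("'+' needs two operands")
            b, a = stack.pop(), stack.pop()
            if (a, b) != ("int", "int"):
                raise TypeError(f"'+' expects int int, got {a} {b}")
            stack.append("int")
        else:
            raise ValueError(f"unknown token {tok!r}")
    return stack


print(check_types("10 20 +".split()))  # ['int']
try:
    check_types("1 true +".split())
except TypeError as err:
    print("rejected:", err)  # rejected: '+' expects int int, got int bool
```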

Strings

A string must be enclosed in double quotes ("). A string is a structure within Tulip that has both a size (int) and a pointer to the data (ptr).

// This is the internal representation of a Str
struct Str
    int // size
    ptr // data
end

When a string token is encountered, the Str structure is pushed onto the stack. As will be discussed later, structures can be treated as a single element.

When a string literal is compiled, a null terminator is placed at the end for convenience when working with the operating system. String operations do not rely on this null terminator; they use the size of the string instead. The size of the string does not include the null terminator.
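
The relationship between the size and the stored bytes can be illustrated with a small Python model. The StrLiteral class and compile_string function are hypothetical names, and UTF-8 encoding is an assumption; the bytes field merely stands in for the ptr.

```python
from dataclasses import dataclass


@dataclass
class StrLiteral:
    size: int    # number of bytes, excluding the null terminator
    data: bytes  # stands in for the ptr; includes the trailing b"\0"


def compile_string(text):
    """Model a string literal: the terminator is stored but not counted."""
    encoded = text.encode("utf-8")  # the encoding is an assumption
    return StrLiteral(size=len(encoded), data=encoded + b"\0")


s = compile_string("Hello World\n")
print(s.size)       # 12
print(len(s.data))  # 13
```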

Intrinsics

These are the built-in operations of the language.

Stack Manipulation

| Operation | Signature | Description |
|-----------|-----------|-------------|
| dup | T -> T T | Duplicates the top element on the stack. |
| swap | A B -> B A | Swaps the order of the top two elements. |
| drop | T -> | Consumes the top element from the stack. |
| putu | int -> | Consumes and prints the top integer on the stack. |
| push | T -> R[T] | Consumes the top element of the stack and pushes it onto the return stack. |
| pop | R[T] -> T | Consumes the top element of the return stack and pushes it onto the stack. |
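
The stack effects of these operations can be followed with a small Python model. This is a sketch for illustration only: the Machine class is hypothetical, and putu records values in a list instead of printing them.

```python
class Machine:
    """Toy model of the data stack and the return stack."""

    def __init__(self, *values):
        self.stack = list(values)  # data stack, top is the last item
        self.rstack = []           # return stack, R[...] in the signatures
        self.printed = []          # what putu would have printed

    def dup(self):
        self.stack.append(self.stack[-1])

    def swap(self):
        self.stack[-2], self.stack[-1] = self.stack[-1], self.stack[-2]

    def drop(self):
        self.stack.pop()

    def putu(self):
        self.printed.append(self.stack.pop())

    def push(self):
        self.rstack.append(self.stack.pop())

    def pop(self):
        self.stack.append(self.rstack.pop())


m = Machine(1, 2)
m.swap()  # stack: 2 1
m.push()  # stack: 2      return stack: 1
m.dup()   # stack: 2 2    return stack: 1
m.pop()   # stack: 2 2 1  return stack: (empty)
m.drop()  # stack: 2 2
m.putu()  # records 2, stack: 2
print(m.stack, m.rstack, m.printed)  # [2] [] [2]
```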

Comparison Operators

Not all comparison operators have been implemented yet.

| Operation | Signature | Description |
|-----------|-----------|-------------|
| == | a: int b: int -> bool | Pushes a == b onto the stack |
| <= | a: int b: int -> bool | Pushes a <= b onto the stack |
| < | a: int b: int -> bool | Pushes a < b onto the stack |
| > | a: int b: int -> bool | Pushes a > b onto the stack |

Syscalls

| Operation | Signature | Description |
|-----------|-----------|-------------|
| syscall<n> | T1, T2, ... Tn id: int -> int | Performs the syscall with the corresponding id, with up to n arguments. 0 <= n <= 6 |

Group/Struct Operations

Tulip supports the creation of structures as well as anonymous structures or groups.

| Operation | Signature | Description |
|-----------|-----------|-------------|
| <n> group | T1, T2, ... TN -> Group<n> | Groups the top n elements into one element |
| group.<n> | Group<n> -> T | Consumes the group and pushes the nth element onto the stack |
| cast(<name>) | T1, T2, ... TN -> struct | Groups the top elements of the stack into a struct |
| <name>.<n> | struct -> T | Consumes the struct and pushes the nth element onto the stack |
| split | struct -> T1, T2, ... TN | Breaks the struct/group into its constituent parts |

Structs and groups are treated as if they were a single element. For example, the swap operation swaps the entire struct/group with the element below it, while preserving the order of elements within the struct/group.

The following program prints 1, then 3, and finally 2.

1 2 3   // Stack: 1 2 3 
2 group // Stack: 1 [2 3] 
swap    // Stack: [2 3] 1
putu    // Output: `1`
split   // Stack: 2 3
putu    // Output: `3`
putu    // Output: `2`
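
The same program can be traced with a short Python model in which a group is stored as a single tuple on the stack. This is an illustrative sketch, not the compiler's implementation: the run function is hypothetical, and it handles `<n> group` by popping the count that was just pushed.

```python
def run(source):
    """Interpret a tiny subset: integers, group, swap, split and putu."""
    stack, output = [], []
    for tok in source.split():
        if tok.isdigit():
            stack.append(int(tok))
        elif tok == "group":
            n = stack.pop()              # the <n> literal just before `group`
            members = tuple(stack[-n:])  # top n elements, order preserved
            del stack[-n:]
            stack.append(members)        # the group is one stack element
        elif tok == "swap":
            stack[-2], stack[-1] = stack[-1], stack[-2]
        elif tok == "split":
            stack.extend(stack.pop())    # unpack the group in order
        elif tok == "putu":
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown token {tok!r}")
    return stack, output


stack, output = run("1 2 3 2 group swap putu split putu putu")
print(output)  # [1, 3, 2]
```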

Function Pointers

| Operation | Signature | Description |
|-----------|-----------|-------------|
| &<name> | -> fn(T1, T2, ...) -> [R1, R2, ...] | Pushes the pointer to function <name> onto the stack. |

Control Flow

If Conditions

Type checking requires that every branch of the if statement (there are at least two) leaves the same types on the stack. For instance, if one branch pushes an int onto the stack and another pushes two ints, the program will not compile.

if <condition> do
    <branch body>
else <condition> do
    <branch body>
else 
    <branch body>
end
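
The branch rule can be sketched in Python by running every branch on a copy of the incoming type stack and comparing the results. The check_if helper is hypothetical, and each branch is modelled as a function from a list of type names to a new list.

```python
def check_if(stack_before, branches):
    """Return the common result stack, or raise TypeError on a mismatch."""
    results = [branch(list(stack_before)) for branch in branches]
    for result in results[1:]:
        if result != results[0]:
            raise TypeError(f"branches disagree: {results[0]} vs {result}")
    return results[0]


# Both branches push one int: accepted.
same = check_if([], [lambda s: s + ["int"], lambda s: s + ["int"]])

# One branch pushes one int, the other pushes two: rejected.
try:
    check_if([], [lambda s: s + ["int"], lambda s: s + ["int", "int"]])
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True

print(same, mismatch_rejected)  # ['int'] True
```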

While loops

Type checking requires that the types on the stack are the same before and after the loop. You cannot, for example, push an int onto the stack with each iteration of the loop.

while <condition> do
    <body>
end
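
The loop rule amounts to comparing the type stack before and after one pass through the body. In the Python sketch below, check_while is a hypothetical helper, and it assumes the condition leaves exactly one bool that do consumes, so only the body is checked.

```python
def check_while(stack_before, body):
    """Type-check a loop body given as a function on a list of type names."""
    stack_after = body(list(stack_before))
    if stack_after != list(stack_before):
        raise TypeError(f"loop changes the stack: {stack_before} -> {stack_after}")
    return stack_after


# `dup putu 1 +` takes an int and leaves an int, so the stack is stable.
ok = check_while(["int"], lambda s: s[:-1] + ["int"])

# A body that pushes an extra int on every iteration is rejected.
try:
    check_while(["int"], lambda s: s + ["int"])
    rejected = False
except TypeError:
    rejected = True

print(ok, rejected)  # ['int'] True
```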

Functions

fn <name> <Input Types> (-> <Output types>) do
    <function body>
end

// Eg. Function that takes an int and returns an int
fn foo int -> int do
    // ...
end

// Eg. Function that takes a bool and returns nothing
fn bar bool do

end

Generics Support

Tulip has some basic support for generics. By default, structs and functions have to use concrete types, and unknown types are rejected. To make a struct or function generic, prefix the definition with a with block.

Note: Generic functions aren't type checked until a concrete instance is created. This limitation is tracked in the project's issue tracker.

with T
struct pair
  T T
end

with T
fn consumes_t T do
  // ...
end

Generic structs are created in the same way as a normal struct with cast(<name>).

with T
struct foo
  T
end

// creates foo<int>
1 cast(foo) 

Generic functions cannot be called directly and require a with-do block to turn the generic type into a concrete type.

with T
fn generic_put T do
    cast(int) putu
end 

1 with int do generic_put
true with bool do generic_put
3 cast(ptr) with ptr do generic_put
// This wouldn't compile
// "Hello World\n" with Str do generic_put
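
Turning a generic signature into a concrete one can be pictured as a substitution of type names. The Python sketch below is only a model of that idea; instantiate is a hypothetical helper and says nothing about how tulip.py represents signatures.

```python
def instantiate(type_params, signature, concrete_types):
    """Replace each generic parameter in a signature with a concrete type."""
    if len(type_params) != len(concrete_types):
        raise TypeError("wrong number of type arguments")
    binding = dict(zip(type_params, concrete_types))
    return [binding.get(name, name) for name in signature]


# `with T fn generic_put T do` called through `with int do`
print(instantiate(["T"], ["T"], ["int"]))        # ['int']
# `with T struct pair T T end` specialised to bool
print(instantiate(["T"], ["T", "T"], ["bool"]))  # ['bool', 'bool']
```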

You can put generic types in signatures. To specialize a generic struct to a type, use a with -> block.

with T
struct Foo
  T
end

with T
fn takes_foo
  with T -> Foo
do
  // ...
end

Structs and functions can take generic function pointers. The type is declared with another with block.

with T
struct foo
  with T &fn T -> int T end
end

with A B
fn foo
  with A B &fn A -> B end
  with B &fn B end
do
  // ...
end

Constant Expression

Tulip supports a very limited number of operations as constant expressions.

const <name> <expr> end

Reserving Memory

You can reserve fixed amounts of memory (such as for an array) with reserve blocks.

reserve <name> <int> end

Types

There are only four types in Tulip by default: int, bool, ptr, and Str.

Including From Multiple Files

You can include other files with use statements. Paths can be absolute or relative to the tulip.py compiler.

use "std.tlp"
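
The path rule can be sketched in Python as follows. This is an assumption-laden illustration: resolve_use and the /opt/tlp directory are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def resolve_use(path, compiler_dir):
    """Resolve a use path: absolute paths are kept, others are joined to the compiler's directory."""
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        return candidate
    return PurePosixPath(compiler_dir) / candidate


print(resolve_use("std.tlp", "/opt/tlp"))           # /opt/tlp/std.tlp
print(resolve_use("/usr/lib/std.tlp", "/opt/tlp"))  # /usr/lib/std.tlp
```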