This project aims to define a programming language (XLSL), and a compiler which converts XLSL source code into Excel formulae (with no VBA).
This is an example of a simple Excel ray tracer written in XLSL.
The standard data types in XLSL are:
number
text
logical
These correspond to built-in data types in Excel
Functions are defined in a C-like way.
number addTen(number x) {
return x + 10;
}
Functions can also be recursive using the self
keyword
number factorial(number n) {
return n <= 1 ? 1 : n * self(n-1);
}
Custom data types can be defined using the struct
keyword. A struct consists of several member declarations.
struct Vector3 {
number x;
number y;
number z;
}
Structs can be instantiated as follows:
Vector3 position = Vector3 { 1, 7, 2 };
The arguments to a struct instantiation can be literals, function calls, or any other expression whose type is the same as that of the corresponding member of the struct;
Unfortunately, XLSL does not yet natively support functions which return instances of a struct (although passing an instance of a struct as an argument to a function works fine). Instead, structs can define serialize
and deserialize
methods which convert to and from text
as follows:
struct Vector3 {
number x;
number y;
number z;
serialize {
return Std.Typing.numberToText(x) + "," + Std.Typing.numberToText(y) + "," + Std.Typing.numberToText(x);
}
deserialize {
number x = Std.Typing.textToNumber(Std.Split.head(serialized, ","));
number y = Std.Typing.textToNumber(Std.Split.nth(serialized, ",", 1));
number z = Std.Typing.textToNumber(Std.Split.nth(serialized, ",", 2));
return Vector3 { x, y, z };
}
}
Note that in the serialize
method, the fields of the struct are referred to by their names, and in the deserialize
method, the text
to deserialized can be referenced via the implicitly defined serialized
variable.
These methods can be used as follows:
text scale(Vector3 vector, number scale) {
Vector3 result = Vector3 {
scale*vector.x,
scale*vector.y,
scale*vector.z
};
return serialize(result);
}
void main() {
Vector3 p = Vector3 { 1, 2, 3 };
Vector3 q = deserialize<Vector3>(scale(p, 2));
}
Namespaces can be defined using the namespace
keyword. Functions and structs can be defined within a namespace. Expressions which try to access these structs and functions from outside the namespace must include the namespace identifier. An example is:
namespace Math {
number sqrt(number x) {
// Something here
}
}
number y = Math.sqrt(5);
Namespaces can be nested. An example is shown below of a nested namespace with a function foo
, and how you would reference foo
from different areas of the program.
namespace A {
namespace B {
void foo() {
// Something here
}
// foo();
// or B.foo();
// or A.B.foo();
}
// B.foo();
// or A.B.foo();
namespace C {
// B.foo();
// or A.B.foo();
}
namespace B {
// foo();
// or B.foo();
// or A.B.foo();
}
}
// A.B.foo();
Other XLSL scripts can be imported at the top of the program using the following syntax:
import "stdlib.xlsl" as Std;
The specified path can be relative to the current XLSL file, or can be an absolute path to a file. All of the structs and functions available in the global scope of the specified file will be placed into a namespace in the current program (in this case, the namespace would be called Std
).
XLSL does not have if
, else
, while
, or for
loops. In fact, the only conditional operation is the ternary operator. Much like in other languages, this can be used as an expression in XLSL as follows:
condition ? expressionIfTrue : expressionIfFalse
The only way to create loops in XLSL is to use the ternary operator in conjunction with recursive functions.
text
literals can be single-quoted or double-quoted strings:
text x = "Lorem ipsum";
number
literals can be any decimal number:
number x = 1.729;
logical
literals are either true
or false
:
logical x = true;
Excel has a lot of built-in functions which you wouldn't want to rewrite from scratch. These can be called like normal XLSL
functions by prepending their names with a dollar sign ($
):
number factorial(number x) {
return $FACT(x);
}
It's important to put this in its own function because the XLSL compiler cannot determine its return type, and so must rely on the return type of the enclosing XLSL funciton.
After cloning the repo, run
make
to build the compiler.
Instead of using the compiler (build/compile
) directly, it is recommended that you use the wrapper script render.py
.
This script takes an XLSL source code file and outputs a .xlsx
Excel spreadsheet file.
The formula in the specified output cells of the spreadsheet will be the compiled version of the main
function in the XLSL source code file provided. Any inputs required for the main
function will be taken from a set of cells specified from the command line.
Example usage:
./render.py ./examples/greeting.xlsl -i B1:B2 -l left D1
You can specify multiple output cells on the command line, and the same formula will be put into each of them. The behaviour of the cells can be customised by using Direct Formula Access with the $ROW()
and $COLUMN()
excel formulae. This is what makes XLSL a "shader" language — each output cell is executing the same formula, based on its coordinates it can produce a different result.
A template spreadsheet can also be specified on the command line, and the render script will use this file as a starting point. This allows you to customise the formatting of your spreadsheet in a way that won't be lost every time you recompile the XLSL code.
./render.py ./examples/raytrace.xlsl A1:CB45 -i CE47:CE60 -l left -t ./examples/templates/raytrace_template.xlsx
There are lots of improvements I'd like to make to the XLSL compiler in the future. If you feel so inclined, feel free to submit a pull request!
-
XLSL is very unoptimized. The formulae it outputs are nowhere near the simplest/shortest which would achieve the same result. In the future I'd like to improve this.
-
The lexer and parser errors are very unhelpful. Often they are just
syntax error
with no further explanation. I'd like to improve error reporting. -
There are several bugs in the lexer and parser I'd like to address:
-
Order of operations is somewhat broken. While
1+2*3
correctly gets parsed, the expression(1+2)*3
results in a syntax error. -
Nested structs access is broken. Suppose you had a struct instance
foo
with a member namedbar
which in turn was a struct instance with a member namedbaz
. The expressionfoo.bar.baz
is not handled correctly. Similarly,MyStruct x = foo.bar; x.baz
fails, but seemingly for a different reason. Further investigation is required. -
Variables cannot be used as conditions in a ternary operator. The expression
someLogicalVar ? 1 : 0
results in a syntax error. Instead, the programmer must use(someLogicalVar == true) ? 1 : 0
which isn't ideal.
-
-
There's a limit on function complexity. Due to Excel's limit of 8192 characters per formula, if an individual XLSL function compiles to a formula that is too long, while the compiler throws no warnings or errors, Excel will refuse to open the resulting spreadsheet. This is not necessarily an awful thing, as the XLSL programmer can simply split their function into multiple smaller functions, but it would be nice for the compiler to throw a warning.
-
Import file paths aren't unescaped properly. This could cause a problem when a referenced file path contains a character which must be escaped in a string literal.