#Kscope#
My personal stab at the Kaleidoscope LLVM tutorial found here:
http://llvm.org/docs/tutorial/index.html
As part of the learning experience, this naturally drifted a bit from the tutorial itself. Here's a rundown of what's different:
tutorial:

    def foo(x y) x + y;

kscope:

    def foo(x, y) x + y;

Might bring this stuff back now that the flex/bison combo is working well enough. It's really not a feature that serious languages should have, but then again, this isn't a serious language.
Part of the reason for all this is to (re)learn C++ and what's new in C++11, so there are a handful of commits like f4b2108 that use new features such as:
- move semantics for classes with pointer data
- for-each style loops rather than plain-old indexed ones
- unique_ptr "smart pointer" type for pointer member data
- there was a strongly-typed enum for the tokens at one point, but switching to bison took care of that
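
For anyone unfamiliar with those features, here's a rough sketch of how they fit together. It is not code from this repo: `Node`, `Wrapper`, `join_names` and `cxx11_demo` are made-up names for illustration, and it sticks to plain C++11 (so no `std::make_unique`).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type, not taken from the kscope sources.
struct Node {
    std::string name;
    explicit Node(std::string n) : name(std::move(n)) {}
};

// Owns its child through unique_ptr, so no hand-written destructor is needed.
class Wrapper {
    std::unique_ptr<Node> child_;

public:
    explicit Wrapper(std::unique_ptr<Node> child) : child_(std::move(child)) {}

    // unique_ptr is move-only, so Wrapper is movable but not copyable.
    Wrapper(Wrapper &&) = default;
    Wrapper &operator=(Wrapper &&) = default;

    bool empty() const { return child_ == nullptr; }
    const std::string &name() const { return child_->name; }
};

// Range-based for instead of an indexed loop.
std::string join_names(const std::vector<Wrapper> &items) {
    std::string out;
    for (const auto &item : items) {
        if (!out.empty()) out += ",";
        out += item.name();
    }
    return out;
}

std::string cxx11_demo() {
    Wrapper a(std::unique_ptr<Node>(new Node("x")));
    Wrapper b(std::move(a));  // ownership of the Node moves from a to b
    assert(a.empty());        // a moved-from unique_ptr is guaranteed to be null

    std::vector<Wrapper> items;
    items.push_back(std::move(b));
    items.push_back(Wrapper(std::unique_ptr<Node>(new Node("y"))));
    return join_names(items);
}
```

Calling `cxx11_demo()` from a `main` returns the two node names joined by a comma.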
Next up will be finding a way to use lambdas in a nice way, since they're rad as hell.
Wherever possible, classes are in their own respective .cc files with matching headers. For example, there's a header-and-implementation pair for each of the abstract syntax tree nodes under `src/ast`.
The IR codegen parts of the system have been entirely decoupled from the AST. Each node class has a codegen method like before, but an IRRenderer class is passed in, and it's this class that provides all of the LLVM-specific context and helper methods for generating LLVM IR.
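
To give a feel for the shape of that decoupling, here's a minimal sketch with no LLVM dependency. Everything in it is an assumption for illustration: `TextRenderer` stands in for the real IRRenderer and just writes pseudo-instructions to a string, and the node classes (`ASTNode`, `VariableNode`, `BinaryNode`) are simplified stand-ins rather than the actual classes under src/ast.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Stand-in renderer: the nodes only ever talk to this interface, so all
// backend-specific state lives here instead of in the AST.
class TextRenderer {
    std::ostringstream out_;
    int next_tmp_ = 0;

public:
    // Emits one pseudo-instruction and returns the name of its result.
    std::string binary(char op, const std::string &lhs, const std::string &rhs) {
        std::string tmp = "%t" + std::to_string(next_tmp_++);
        out_ << tmp << " = " << lhs << ' ' << op << ' ' << rhs << '\n';
        return tmp;
    }

    std::string text() const { return out_.str(); }
};

struct ASTNode {
    virtual ~ASTNode() {}
    // Each node renders itself by calling back into the renderer.
    virtual std::string codegen(TextRenderer &renderer) const = 0;
};

struct VariableNode : ASTNode {
    std::string name;
    explicit VariableNode(std::string n) : name(std::move(n)) {}
    std::string codegen(TextRenderer &) const override { return name; }
};

struct BinaryNode : ASTNode {
    char op;
    std::unique_ptr<ASTNode> lhs, rhs;

    BinaryNode(char o, std::unique_ptr<ASTNode> l, std::unique_ptr<ASTNode> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);  // both operands are required
    }

    std::string codegen(TextRenderer &renderer) const override {
        std::string left = lhs->codegen(renderer);
        std::string right = rhs->codegen(renderer);
        return renderer.binary(op, left, right);
    }
};

// Builds the tree for "x + y" and renders it with the stand-in renderer.
std::string render_sum() {
    BinaryNode sum('+',
                   std::unique_ptr<ASTNode>(new VariableNode("x")),
                   std::unique_ptr<ASTNode>(new VariableNode("y")));
    TextRenderer renderer;
    sum.codegen(renderer);
    return renderer.text();
}
```

The point of the design is that swapping in a different renderer wouldn't require touching the node classes, only the class that gets passed to `codegen`.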
At some point in the future I'll probably write separate renderers for other formats, like having the AST pretty-printed on the screen or something.
Instead of an ad-hoc parser that uses getchar() all over the place and is tightly coupled to everything else, there's now a separate parsing subsystem with a lexer.ll file for token/lexer state and grammar.yy with a distilled, purestrain Backus-Naur Form grammar. The parser and lexer are abstracted into an STree (syntax tree) class that can parse an input stream and set its root node to the result of the parsing.
This simplifies the main.cpp REPL stuff immensely: std::cin is given to the STree in chunks, and the root node has its IR rendered with an IRRenderer. In the future it will probably be updated to abstract the REPL stuff somewhere else and allow for file-reading depending on how kscope is invoked.
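
The loop itself ends up looking roughly like the sketch below. This is only the outline of the idea, under heavy assumptions: `ToyTree` is a made-up stand-in for STree, the flex/bison parser is replaced by a trivial splitter that treats everything up to the next `;` as one statement, and "rendering" is just echoing the statement back.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for STree: parse() consumes one chunk of the stream and
// records the result as the root.
class ToyTree {
    std::string root_;

public:
    // Reads up to the next ';', skipping chunks that are only whitespace.
    // Returns false once the stream has no further statement.
    bool parse(std::istream &in) {
        const char *ws = " \t\r\n";
        std::string chunk;
        while (std::getline(in, chunk, ';')) {
            std::string::size_type first = chunk.find_first_not_of(ws);
            if (first == std::string::npos) continue;  // empty statement
            std::string::size_type last = chunk.find_last_not_of(ws);
            assert(last != std::string::npos);  // holds because first was found
            root_ = chunk.substr(first, last - first + 1);
            return true;
        }
        return false;
    }

    const std::string &root() const { return root_; }
};

// Feeds the stream to the tree one statement at a time and "renders" each
// root. Returns the number of statements handled.
int repl(std::istream &in, std::ostream &out) {
    ToyTree tree;
    int count = 0;
    while (tree.parse(in)) {
        out << "parsed: " << tree.root() << '\n';
        ++count;
    }
    return count;
}
```

Because `repl` takes a generic `std::istream`, passing `std::cin` gives an interactive session and passing a `std::ifstream` would cover the file-reading case.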
PascalCase and camelCase are used for class names, but that's about it. Everything else adheres to an underscore style. I guess this isn't normally how C++ is done, but underscore style is superior, so there.
###Requirements:
- cmake
- LLVM
- bison >= 3.0
- flex
NOTE: Installing LLVM via Homebrew didn't quite do it for me; it doesn't seem to install the proper cmake helper stuff, so you gotta install from source. It's a fairly painless procedure though:
http://llvm.org/docs/CMake.html
Source for 3.3 (the version I've been using) is here: http://llvm.org/releases/download.html#3.3

###to build:

    mkdir build && cd build
    cmake ..
    make