vrtbl/passerine

Simplifying the Compilation pipeline with Typed Tagless Final Interpreters

slightknack opened this issue · 4 comments

For those who don't know what a TTFI is, there's an excellent paper here.

Because Passerine does a number of compilation passes, it has different syntax trees for each stage. This works well enough, but it means we have what I feel like is a lot of redundant overlap between various layers.

Here's my proposal: We use Rust's trait system to define a TTFI representing Passerine's syntax tree. Note that this does not mean the language is tree-walk interpreted; we're not using the evaluation strategy present at the beginning of that paper. So what does this look like?

Basically, we'll have a trait called SyntaxTree or the like that implements a TTFI for the core language:

  • Symbols
  • Data
  • Blocks
  • Assignments
  • Lambdas
  • Function Calls
  • FFI

This is all that's really needed to compile Passerine atm. As time goes on, this could be extended (i.e. with type information) or reduced (Blocks might be redundant or something).

Then, for each step up on the pipeline latter, we implement traits for each feature. As an example, the desguar step removes macros and function application from the syntax tree. This could be defined as a trait called Desugar, with the following trait functions:

  • Macros
  • Function application

With Desugar in place, we can then define DesugaredSyntaxTree to just be a supertrait of SyntaxTree and Desugar:

trait DesugaredSyntaxTree: SyntaxTree + Desugar {}

And so on, for each layer on top of the core, we implement a TTFI that extends the base, and a reduction step to lower the Syntax tree at this point. The paper goes into a lot more depth, and we'll have to figure out how each syntactic element maps to this representation, but I think it would significantly simplify the compiler implementation (and perhaps speed up the pipeline too).

I think this is a concrete extension of the ideas brought up in #20, so I'm closing that issue and would like to ask that further discussion on the topic of TTFI continue here :D

Thanks to redstone#7251 of the community Discord server for bringing this construction to my attention!

Here's what I think the base TTFI could look like:

/// Representation of the base syntax tree
pub trait ST {
    type Pattern;
    type Expression;
    
    fn symbol(unique_symbol: UniqueSymbol) -> Self::Expression;
    fn data(data: Data) -> Self::Expression;
    fn block(expressions: Vec<Self::Expression>) -> Self::Expression;
    fn assign(pattern: Self::Pattern, expression: Self::Expression) -> Self::Expression;
    fn lambda(pattern: Self::Pattern, expression: Self::Expression) -> Self::Expression;
    fn call(fun: Self::Expression, arg: Self::Expression) -> Self::Expression;
    fn ffi(name: String, expression: Self::Expression) -> Self::Expression;
}

Something interesting to think about is the representation of pattern trees. Patterns are a bit like a separate language that mirror the meaning of expressions. Do we implement TTFI for them as well?

And we need to decide how to represent Expression. Is it a big enum, like we already have? if so, do we implement a syntax tree enum for each stage of the compiler or just a single big one? Or, do we ditch syntax trees altogether and try something else entirely.

We'd also have to figure out how to turn the reduce step into a trait, which shouldn't be too complicated

This, while being cool, seems impractical for now, especially since Rust doesn't support higher kinded types. I think in the future, we might be able to set up a trait representing compilation transformations, but I'm closing this issue for now.