/soup-lang

Primary LanguageHaskell

SOUP, the SOUp Processor

Soup is an experimental programming language with type-dependent, dynamic syntax. What does this mean? Well, dynamic syntax means that the syntax of the language can change on the fly, even during the parsing of a single file, and in general can be manipulated by library creators as they see fit. Sometimes this is called adaptive parsing, or parsing of adaptive grammars. You can do this any number of ways, but the problem with most of these ways is that, intuitively, all these new syntactic rules being added collide with one another and make everything very confusing to reason about. Typically all identifiers (variables) are treated the same, and any syntactic rule has to apply to all of them equally. If you want to create some functions that have Haskell-style calling syntax, like f x y, these have to apply just as well to matrices, for example, which could collide with the matrix multiplication syntax A B x that some other imported library defined. My approach to solving this is to make syntactic rules explicitly type-dependent, leveraging the fact that with adaptive parsing, we can give each identifier a different syntactic rule, allowing us to treat them differently at the parser level. Now we can have both Haskell-style functions and sensible matrix multiplication without any collisions, as long as we give functions and matrices different types.

Specifics

In Soup, a parser rule is a function that tries to match itself on a string. For example, you might have a rule that parses integer literals that looks at its input string and takes characters off of it as long as they are digits. In my implementation, I use the continuation-passing style of parsing, meaning that parser rules take a string to match on as well as a "continuation", basically the function to execute after the parser finishes. A continuation takes what's left of the string after the matching, as well as a value depending on what was matched. Our integer literal parser will convert the string of digits into an integer and pass that forward to the continuation as the parsed value. If a parser fails, it returns the empty list rather than passing a value forward to the continuation. This is all fairly standard.

More interestingly, in Soup a parser-level "type" is simply a list of parser rules. (Well really a stack of lists of parser rules because of scoping, but let's just call it a list of parser rules for now.) For example, we might have an integer type that initially contains just our rule from before that parses integer literals. Now when we go to define a new variable x, what will happen (at parse time!) is that a new variable will be named, basically a compile-time pointer, and a new rule will be prepended to this integer type that matches just the string x and forwards this pointer to the continuation.

Now the whole point of having this parser-level notion of a type is so that other higher-level rules can ask for objects that are of one type or another. For example, we might want to define some new bit of syntax that negates an integer with a minus sign. To do this, we would create a new type called negated-integer (because we want to be able to parse -x and x differently, for example if we want to introduce syntax for subtraction), and add a rule to it that first matched a minus sign in its input string, and then asked for some integer to be parsed. This last step, where our rule wants an integer, is the heart of Soup. There's a builtin function called parse that takes a type to parse (well not exactly, but let's go with it for now) and a continuation, and will go through all of the rules in that type, passing them the continuation you provided, while essentially forking execution so that any side-effects (apart from IO) from failed branches don't pollute those from successful branches. Typically the type passed to the parse function will be a pointer, so that the rule can also apply to objects of that type that haven't been defined yet (and therefore don't yet have parsers in the type list).

Examples

There's one very long example in this repository called test.soup that builds up from very basic primitives to define a few types (including a generic function type!) and some custom syntax on top of them. I'll make some more succinct examples and put them here after I add some more powerful primitives.

Installation

Soup should not be used for anything apart from fun at this point. That being said, it should run fine on any modern GHC installation, and shouldn't need any third-party packages. I usually test things in GHCi by loading Soup/Run.hs and running runFile "test.soup".