henrystanley/Quark

The future of Quark


Wow, this is an impressive language prototype.

Are there any plans for future development and enhancements (way) beyond what's in Doc/TODO.md? Is Quark being used somewhere? Any applications written in Quark to take a look at?

I started designing a concatenative language syntax with the goal of overcoming some well-known readability & understandability issues with concatenative languages (regardless of whether they are prefix or postfix), and then one day I googled "declarative concatenative language" just out of curiosity.

And voilà - someone had already raised one of the questions my endeavor tried to solve (actually in basically the same way as proposed in the reddit post). And the answer was a reference to Quark, which goes even beyond my expectations with a more universal concept - pattern matching 😉.

If there is any interest, I'd like to discuss some ideas regarding syntax etc. in Quark, including e.g. (in no particular order):

  1. Reducing the number of nested brackets. I don't know yet how, but maybe by some simple rule allowing omission of the outermost/other brackets where there is no ambiguity. Or maybe enclose the non-pattern-matching combinators (a b | b a <combi1 combi2>) instead of the outermost brackets of the pattern match ([ a b | b a ] combi1 combi2). Thoughts?

    It also seems that if the 5 :power: 2 plus: 7 minus: 1 syntax from (2) gets implemented, but without the [] $mycombi def sugar (in other words, one would always need to write the leading : for combinators no matter what), then the pattern-match brackets could be dropped completely. But I didn't test this with anything complex, so it's just a guess.

  2. Introducing syntactic sugar similar to Smalltalk's splitting of method names at : to vastly increase readability & understandability. E.g. if: ["cond" "ition" =] then: ["true" print] else: ["false" print] would internally compile to ["cond" "ition" =]["true" print]["false" print] if:then:else:. And 5 :power: 2 plus: 7 minus: 1 would compile to 5 2 7 1 :power:plus:minus:.

    This would require changing the current symbol character from : to something else. Maybe $? Thus [] $mycombi def would be syntactic sugar for [] $:mycombi def.

    Note that defining the combinator ifthenelse as if:then:else: would force one to use the if: [] then: [] else: [] sugar and would disallow its direct use as [] [] [] if:then:else: (to stay visually & mentally consistent; it also allows easy grepability). To avoid ambiguity when parsing, the longest match would always be taken (there could also be a no-op combinator ; as a delimiter for cases where the longest match needs to be shortened; the current ; combinator could be dropped or exchanged for something else 😉).
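
    A minimal Python sketch of this desugaring rule - collect `name:` parts, gather the tokens between them as arguments, and emit arguments first, fused keyword name last. The tokenization is deliberately naive and the function name is mine; it only illustrates the rewrite:

    ```python
    def desugar_keywords(tokens):
        """Rewrite keyword-message syntax into postfix form:
        arguments first, then the fused keyword name."""
        out, args, name = [], [], ""
        for tok in tokens:
            if tok.endswith(":"):        # a keyword part, e.g. "if:" or "plus:"
                name += tok
            elif name:                   # an argument belonging to the pending keyword chain
                args.append(tok)
            else:                        # an ordinary token outside any keyword chain
                out.append(tok)
        if name:
            out.extend(args + [name])
        return out

    # if: [A] then: [B] else: [C]   ->  [A] [B] [C] if:then:else:
    print(desugar_keywords(["if:", "[A]", "then:", "[B]", "else:", "[C]"]))
    # 5 :power: 2 plus: 7 minus: 1  ->  5 2 7 1 :power:plus:minus:
    print(desugar_keywords(["5", ":power:", "2", "plus:", "7", "minus:", "1"]))
    ```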

  3. Introducing something like (- x + [(x * 5) squared] < 0 && true && y > 5) as syntactic sugar for x -1 * x 5 * squared + 0 < true && y 5 > && (or better with short-circuit evaluation, but that's an exercise for the reader). I.e. parentheses turn all binary combinators at that nesting level into infix syntax and all unary combinators at that nesting level into prefix syntax, while retaining the behavior of combinators defined with : in their label, as proposed in (2), at all nesting levels; N-ary combinators with N > 2 won't be allowed inside parentheses. To switch this parenthesis behavior back to the default postfix behavior, e.g. [] could be used. Combinators without a defined precedence (using tags - see below) would also be disallowed inside parentheses.

    This would be mainly for arithmetic expressions, but not necessarily just them. This nicely complements (2) in which case e.g. (- x + square: (x * 5) < 0) would be sugar for x -1 * x 5 * square: + 0 <.
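
    The parenthesized-infix rewrite is essentially a shunting-yard pass. A minimal Python sketch, restricted to binary operators and with an illustrative precedence table (not a proposal for Quark's actual table):

    ```python
    # Shunting-yard: infix tokens inside ( ... ) -> postfix.
    PREC = {"||": 1, "&&": 2, "<": 3, ">": 3, "+": 4, "-": 4, "*": 5, "/": 5}

    def infix_to_postfix(tokens):
        out, ops = [], []
        for tok in tokens:
            if tok in PREC:
                # pop operators of higher or equal precedence first
                while ops and ops[-1] in PREC and PREC[ops[-1]] >= PREC[tok]:
                    out.append(ops.pop())
                ops.append(tok)
            elif tok == "(":
                ops.append(tok)
            elif tok == ")":
                while ops[-1] != "(":
                    out.append(ops.pop())
                ops.pop()
            else:                        # operand: value, variable, or quoted block
                out.append(tok)
        return out + ops[::-1]

    # ( x + x * 5 < 0 && y > 5 )  ->  x x 5 * + 0 < y 5 > &&
    print(" ".join(infix_to_postfix("( x + x * 5 < 0 && y > 5 )".split())))
    ```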

  4. Introducing syntax for compile-time "type assertions" as a kind of substitute for gradual typing. I'm deliberately not talking about types, as I'd prefer to "tag" data with certain constraints imposable on data, e.g. through a compile-time assert. Ideally this wouldn't be a fixed set of built-in tags, but instead fully instrumentable Turing-complete computation at compile time, as suggested in (11).

    E.g. [ a b c | asser: (a@unsigned_int@8bit@wrap && b@unsigned_int@8bit@wrap && (a < b) && c@str@endnull@utf8) t: "wrong args" ] would have the meaning: check that a is an integer which can't represent negative numbers and will wrap around if it gets bigger than 255, that b is also an unsigned 8-bit integer with wraparound, that it's possible to infer at compile time that a will always be less than b on every application of the combinator, and that c is a string which is guaranteed to be C-compatible by always having a \0 character at the end and being UTF-8 compliant. The syntax is not polished, but you get the idea - improvements welcome 😉. Of course, there could also be groups of constraints with custom labels - e.g. myspecialstring or whatever. An empty asser:t: could list the minimum set of viable tags at that very specific place to ease development (to avoid forcing developers to remember all possible tags etc.).

    The ultimate goal of this is to have the opportunity to statically analyze the whole source at compile time. A "dynamic" counterpart will probably also be needed - a "cast" to any (e.g. [ a b | a@any b@any ] imported_combinator) - which would fill the gap the compile-time assertion leaves. Namely, the compile-time assertion will error out if it can't guarantee the needed type assertions for the value on the stack, as requested e.g. by some imported combinator definition. The cast would silence the error and hope for the best at runtime 😉.

    Btw. I borrowed this "tag" syntax idea from Rebol, which uses / as the separator of such tags, calls them "refinements", and lets the user use them at runtime. I think / would be ambiguous or at least error-prone here (e.g. in the presence of a / combinator), so I've changed it to @.
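
    To make the tag idea concrete, a minimal Python sketch of the checking side, with made-up tag names and a hypothetical check_tags helper standing in for asser:t: (the real thing would run during static analysis, not at runtime):

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class Tagged:
        """A value plus the set of tags (constraints) asserted about it."""
        value: object
        tags: frozenset = field(default_factory=frozenset)

    def check_tags(arg: Tagged, required: set, where: str) -> None:
        missing = required - set(arg.tags)
        if missing:
            raise TypeError(f"{where}: {arg.value!r} is missing tags {sorted(missing)}")

    a = Tagged(200, frozenset({"unsigned_int", "8bit", "wrap"}))
    c = Tagged("hi\0", frozenset({"str", "endnull", "utf8"}))

    check_tags(a, {"unsigned_int", "8bit", "wrap"}, "arg a")   # passes
    check_tags(c, {"str", "endnull", "utf8"}, "arg c")         # passes
    try:
        check_tags(a, {"str"}, "arg a")                        # fails: wrong tags
    except TypeError as e:
        print(e)
    ```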

  5. Incorporate first-class support for an ordered key-value heterogeneous map (currently there is only first-class support for double-ended lists). Maybe [v v2 | { :k v :k2 v2 "k3" 5 }] (i.e. curly brackets) with :at: etc. combinators (maybe similar to map methods in Smalltalk)? This would also require a pattern-matching enhancement ("match if the keys k1, k2, k3, ... are there", "match if the values v1, v2, v3, ... are there", "match if the keys k1, k2, k3, ... and the values v1, v2, v3, ... are there", "match if none of the keys k1, k2, k3, ... is there", "match if none of the values v1, v2, v3, ... is there", "match if neither the keys k1, k2, k3, ... nor the values v1, v2, v3, ... are there").
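
    A minimal Python sketch of those matching modes (Python dicts preserve insertion order, so they can stand in for the proposed ordered map; the predicate names are mine):

    ```python
    m = {"k": 1, "k2": "two", "k3": 5}          # ordered, heterogeneous

    def has_keys(m, *ks):      return all(k in m for k in ks)
    def has_values(m, *vs):    return all(v in m.values() for v in vs)
    def lacks_keys(m, *ks):    return not any(k in m for k in ks)
    def lacks_values(m, *vs):  return not any(v in m.values() for v in vs)

    print(has_keys(m, "k", "k3"))      # True  - "the keys are there"
    print(has_values(m, "two", 5))     # True  - "the values are there"
    print(lacks_keys(m, "x", "y"))     # True  - "none of the keys is there"
    print(lacks_values(m, 42))         # True  - "none of the values is there"
    ```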

  6. Change :nil to :false and completely & forever avoid more-than-two-valued logic (it only unnecessarily complicates everything and adds no value). For cases like "pattern matching failed" (didn't match anything), see below for some ideas on how to handle errors & exceptional cases.

  7. Rethink the :ok (:not-ok) pattern and compare it to some simple exception-like concept similar to resumable exceptions in Dylan (i.e. an exception is just a signal, each signal is guaranteed to have exactly one handler at any time, the handler can decide where it will be executed - at the caller or callee site - but more importantly it can decide whether to unwind the stack up to a given place, aka termination semantics, or continue execution at the emitting place, aka resumption semantics). See the last slides of https://github.com/WebAssembly/meetings/blob/master/main/2020/presentations/2020-02-rossberg-continuations.pdf for a possible implementation.

    "Signal exceptions" have some nice advantages (mainly they don't visually pollute code with error handling like in Go/Rust, though see e.g. how V minimizes pollution by making it impossible to obtain Option type in a wrapped form), but it'd also be important to make them "visible" (both at the caller and callee side).

    Disclaimer: I admit I don't like try-except-else-finally exceptions. I like more generic concepts - even the cumbersome defer{} with recover() from Go is IMHO better than try-except-else-finally.
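
    A minimal Python sketch of the two semantics, with a hand-rolled handler stack (the names signal, Unwind, and the handler protocol are mine, not Dylan's or Quark's):

    ```python
    handlers = []                      # innermost handler last; assume exactly one is active

    class Unwind(Exception):
        pass

    def signal(condition):
        return handlers[-1](condition)   # resumption: the return value continues the callee

    def parse_number(text):
        try:
            return int(text)
        except ValueError:
            # ask the current handler; it may resume us with a substitute value
            return signal(("bad-number", text))

    def resuming_handler(cond):
        print("resuming after", cond, "- substituting 0")
        return 0                       # resumption semantics

    def terminating_handler(cond):
        raise Unwind(cond)             # termination semantics: unwind to the handler's frame

    handlers.append(resuming_handler)
    print(parse_number("42x"))         # message, then 0
    handlers[-1] = terminating_handler
    try:
        parse_number("42x")
    except Unwind as e:
        print("unwound with", e.args[0])
    ```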

  8. Add first-class support for a "spaghetti stack" - maybe even something like CSP threads. It can also be viewed as an enhanced generic event-loop mechanism (or a base building block for one) and, maybe more importantly, as a malloc() alternative (under certain conditions - e.g. a guarantee of contiguity if requested). There are many more use cases for this concept, which is why I'm proposing it here.

    This actually overlaps with the "signal exceptions" proposal in (7), as it might be implemented as one unified concept agnostic to whether it's an exception signal or a yield event. A possible, thought-through implementation might be inspired by the First-Class Stacks proposal for WebAssembly.
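
    A minimal Python sketch of the unified "suspend this stack segment" view, using generators as the segments and a round-robin scheduler (loosely CSP-flavored; nothing here is Quark):

    ```python
    from collections import deque

    def scheduler(tasks):
        """Round-robin over suspendable stack segments (generators). Each yield is
        either a plain suspension point or, with a value, a signal to the scheduler."""
        ready = deque(tasks)
        while ready:
            task = ready.popleft()
            try:
                event = next(task)
                if event is not None:
                    print("signal from task:", event)   # yield events and signals, unified
                ready.append(task)
            except StopIteration:
                pass                                    # this stack segment is finished

    def producer():
        for i in range(3):
            yield ("produced", i)

    def consumer():
        for _ in range(3):
            yield                                       # plain cooperative suspension

    scheduler([producer(), consumer()])
    ```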

  9. Easy merging of lists. Easy merging of maps.

  10. Think of some nice way to support "default combinator arguments" (imagine named parameters with default values, as used throughout the Python stdlib). I think this could be solved by having (5) and (9) implemented (with some syntactic sugar on top if necessary). Another way could be to have one placeholder character meaning "from here onwards, defaults should be used unless specified otherwise" and another placeholder character meaning "the default argument goes here".
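
    A minimal Python sketch of the (5)+(9) route - defaults kept as a map and merged with the caller's map (the option names and the shout combinator are invented; dict union needs Python 3.9+):

    ```python
    DEFAULTS = {"sep": " ", "end": "\n", "upper": False}

    def shout(words, opts=None):
        opts = DEFAULTS | (opts or {})     # merge: rightmost map wins per key
        text = opts["sep"].join(words)
        if opts["upper"]:
            text = text.upper()
        print(text, end=opts["end"])

    shout(["hello", "world"])                                # all defaults
    shout(["hello", "world"], {"sep": "-", "upper": True})   # override two, keep the rest
    ```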

  11. Easier metaprogramming - probably a question of some cool combinators (like one returning a live view, in the form of a mutable map, of all globals). We should think more about compile-time capabilities (a compile-time assert alone is far from enough). Maybe even allow "arbitrary" computation if prefixed with @: e.g. @[ "hello world\n" print ] would be evaluated at compile time, and the resulting stack "difference" (be it subtractive, additive, or none) would then be saved as a "patch" for the abstract syntax tree (yes, the AST instead of a plain string with serialized code, for safety and performance). At least some form of generic conditional compilation will be needed.

    Or, even crazier - what about making pattern matching itself fully instrumentable? I.e. extensible with new "types" (objects, structs from C, whatever...). This would allow easy use of complex data. The simplest form could be a "saved pattern-matching query".

    But try to avoid this trap: edubart/nelua-lang#17 .
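
    A minimal Python sketch of the @[ ... ] idea from the first paragraph of (11): run a quoted block against an empty stack now, and splice whatever it leaves behind into the surrounding program as literals (the tiny evaluator and the ("@", block) node shape are invented for illustration):

    ```python
    def run_block(block, stack):
        for op in block:
            if callable(op):
                op(stack)
            else:
                stack.append(op)

    def expand_compile_time(program):
        out = []
        for node in program:
            if isinstance(node, tuple) and node[0] == "@":   # ("@", [ ...block... ])
                stack = []
                run_block(node[1], stack)                    # evaluate at "compile time"
                out.extend(stack)                            # additive stack diff -> AST patch
            else:
                out.append(node)
        return out

    double = lambda s: s.append(s.pop() * 2)
    program = [1, ("@", [21, double]), "print"]              # @[ 21 double ] folds to 42
    print(expand_compile_time(program))                      # [1, 42, 'print']
    ```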

  12. Something like combinator "overloading". Maybe allow easy & safe extending of current combinator definitions with another pattern match of "arguments"? Some form of "monkey patching"? Simple redefinition might have some flaws (I can't think of any right now, but I'm also not confident there are none 😉).

  13. Make pattern matching recursive (this addresses the quirk from Doc/intro.md). Maybe also disable the identity check by default and add syntax (e.g. an atom prefix !) to allow identity check for the given atom (the prefix wouldn't be "turned on" recursively though). This should play well with instrumentable pattern matching as in (11).
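
    A minimal Python sketch of recursive structural matching with an opt-in identity/equality marker, here spelled as a ("!", value) pair at a single position (the marker spelling and helper name are mine):

    ```python
    def match(pattern, value, bindings):
        """Recursive structural match. Plain strings bind names; ("!", x) demands
        equality with x at this position only - it is not propagated downwards."""
        if isinstance(pattern, tuple) and pattern[:1] == ("!",):
            return pattern[1] == value                  # explicit identity check
        if isinstance(pattern, list):
            if not isinstance(value, list) or len(pattern) != len(value):
                return False
            return all(match(p, v, bindings) for p, v in zip(pattern, value))
        bindings[pattern] = value                       # a bare name always binds
        return True

    b = {}
    print(match(["a", [("!", 0), "b"]], [1, [0, 2]], b), b)   # True {'a': 1, 'b': 2}
    print(match(["a", [("!", 0), "b"]], [1, [9, 2]], {}))     # False: inner literal 0 required
    ```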

  14. Improve import by requiring that a prefix be specified for imported combinators (an empty string would give the current behavior).

  15. Introduce a simple "object model" - maybe similar to Spry (yes, tags again 😉). No inheritance please, only mixins (e.g. in their simplest form, "embedding"). This could be very easy if "instrumentable pattern matching" as in (11) gets implemented.

  16. Add "live variables" (one-to-many multiplexer). In other words memory-only fan-out mechanism (one write leads to automated update at all places reading this variable). Basically a low-level variant of publish-subscribe with guaranteed delivery (but much less verbose and much more performant).

    For inspiration see Mech or Red Reactors. This is the basic building block for modern (G)UIs (both input and output) - this gets interesting when you're multiplexing something already multiplexed 😉 (e.g. to satisfy interactive requirements for customizable & movable toolbars).

    A good example to see it in practice: ping pong game in web browser.

    This can then also be used as a "pipe" (|) in shell-like scenarios (e.g. a REPL). Or it can serve the same purpose as a "channel" as known e.g. from Go.
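
    A minimal Python sketch of a "live variable": one writer, many readers, every write fans out synchronously to all subscribers (the class and method names are invented):

    ```python
    class Live:
        """A one-to-many live variable: each write notifies every subscriber."""
        def __init__(self, value=None):
            self._value = value
            self._subscribers = []

        def subscribe(self, callback):
            self._subscribers.append(callback)
            callback(self._value)                  # deliver the current value immediately

        def set(self, value):
            self._value = value
            for callback in self._subscribers:     # guaranteed, synchronous fan-out
                callback(value)

    score = Live(0)
    score.subscribe(lambda v: print("scoreboard:", v))
    score.subscribe(lambda v: print("toolbar:", v))    # the same source, multiplexed twice
    score.set(1)
    score.set(2)
    ```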

  17. Read some APIs of Rebol/Red/Spry for inspiration as those are similar languages when it comes to syntactical expressivity constraints.

  18. It'd be nice to have a convention when defining combinators that if their name begins with C., they'll just be emitted as plain text with C function-call syntax and no body (any runtime body, as opposed to a compile-time body, would be an error).

    E.g. [ strfmt vargs | asser: (strfmt@str@endnull && vargs@sequence) t: "wrong args" ] :C.printf def becomes printf( strfmt, vargs0, vargs1, ... );.

    Together with conditional compile-time evaluation and a generic (target-agnostic, i.e. not specific to C) append_to_compilation_target combinator, this becomes very handy: @[ s { :compilation_target "C" } | asser: s@str@utf8 t: "utf-8 string expected" append_to_compilation_target ] with the meaning: if the compilation target is C, append the string s to the current output.

    This assumes @[ ] delimits compile-time code as in (11) and that the compiler silently pushes/injects a map as described in (5) with compiler information - note this silent push/injection is a joke to work around the missing support for pattern matching of maps; we'll need something serious, ideas welcome 😉.

    Of course, what's missing is to "load" and use constants (be it globals or macros) from C sources and also load and use struct/array/typedef/... types. But this is IMHO less important than generating C function calls (i.e. "calling them from Quark").
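
    A minimal Python sketch of the emitter side of (18): any combinator whose name begins with C. is turned into C call text, with list arguments spliced as varargs (the (name, args) call representation is invented; it only illustrates the text-generation rule):

    ```python
    def emit(call):
        """call = (combinator_name, [args]); 'C.'-prefixed names become C call text."""
        name, args = call
        if not name.startswith("C."):
            raise ValueError(f"{name}: only C.-prefixed combinators are emitted as C")
        flat = []
        for a in args:
            if isinstance(a, list):                # splice a vararg sequence
                flat.extend(a)
            else:
                flat.append(a)
        rendered = ", ".join(f'"{a}"' if isinstance(a, str) else str(a) for a in flat)
        return f"{name[2:]}( {rendered} );"

    # [ strfmt vargs | ... ] :C.printf def, applied to a format string and a vararg list:
    print(emit(("C.printf", ["%d %d\\n", [1, 2]])))    # printf( "%d %d\n", 1, 2 );
    ```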


If there is any interest in further development, I'm considering implementing a second compiler (actually a transpiler) with somewhat different goals than this interpreter written in Haskell (the primary target would probably be V, which in turn has both C and web backends).

P.S. Nice posts: https://gitlab.com/henrystanley/kdt_blog_content/-/blob/master/posts/on_complexity_and_abstraction.post , https://gitlab.com/henrystanley/kdt_blog_content/-/blob/master/posts/on_composability.post , thanks!