Subjective feedback on current language design
Opened this issue · 5 comments
- String - Both clojure and haskell treat strings as a sequence / list of characters, and I find that particular feature super powerful. It lets you work with different underlying types through the same abstractions. In fact, lots of functions written just for lists are super useful for strings too. clojure takes this even further and lets you treat anything as a sequence; if a value is singular then it's a sequence of itself.
- Buffer - Similar to the above, although clojure does not have a standard buffer construct; it just relies on java types for that. Haskell has something similar, though, in the form of bytestrings.
- Symbols - It would be useful to have a comparison with symbols in, say, lisp / scheme. Also, I think it would be better to have just a list of values that act as false in logical operations, like: false, nil.
- Builtin / Program form - I think leveraging pre-existing terminology and syntax would benefit the project. Usually, I believe, it's referred to as a "read form", and the result of a read is just a list of symbols, lists, and other primitive data types. In clojure it could also be vectors (which are basically arrays), maps, sets and keywords. It also feels like your encoded form is very similar to a lisp program; I'm not sure why you don't just use the same syntax and benefit from the tooling, people's familiarity, etc.:
  (sum (mul a 2) (div 2 3))
  It's even lighter, or maybe it just feels so because of lisp? BTW, why not just allow arithmetic chars in symbols? That's what lisps do, and they define special forms like sum, mul, div using those:
  (+ (* a 2) (/ 2 3))
- blocks - Blocks are very similar to s-expressions, and they are a thing that I miss a lot in all the other languages. In fact every list in lisp is an s-expression, and therefore evaluation of it has a value. It's super powerful and IMO one of the key reasons why macros in lisp are so powerful yet simple (although not everyone agrees on this).
- I don't quite understand the reasons for making scope part of the function form. I think it's confusing, since translation from source to AST and back won't be obvious. Maybe you're looking for something like the let special form?
- list - Lists and aliases feel like a data structure mix-up. Lists are known for linear-time lookup in the tail position, but the document here, I believe, assumes constant time instead. Also, I'm not quite sure aliases are needed; why not functions? It feels like an alias is just a function that captures a list and the index of the item it's aliased to. I'm getting the impression that lists here try to act as lists, arrays and maps at the same time, which I'm not convinced is such a good idea. clojure, for instance, treats all of them as sequences while still having clear pros & cons per data structure. I think I prefer that approach.
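To make the String and Buffer points above a bit more concrete, here is a minimal Python sketch of writing one function against a sequence abstraction instead of a concrete type. The helper names as_seq and count_where are hypothetical, and wrapping a lone value in a one-item list only mimics the "sequence of itself" behaviour described above; it is not a claim about how any particular language implements it.

```python
from collections.abc import Iterable

def as_seq(value):
    """Return value unchanged if it is iterable, else wrap it in a one-item list."""
    if isinstance(value, Iterable):
        return value
    return [value]

def count_where(pred, value):
    """Count the items satisfying pred, whatever the underlying type is."""
    return sum(1 for item in as_seq(value) if pred(item))

vowels = count_where(lambda ch: ch in "aeiou", "sequence")   # a string
evens = count_where(lambda n: n % 2 == 0, [1, 2, 3, 4])      # a list
zeros = count_where(lambda b: b == 0, bytes([0, 7, 0]))      # a buffer-like value
single = count_where(lambda n: n > 40, 42)                   # a lone value
```

The same count_where works for all four inputs because it only relies on iteration, not on what is being iterated.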
Other
Not sure how function calls are expressed. In lisps, calls are expressed by putting fn in the head of the list. If the symbol in the head is a special form (if, fn) it's just interpreted. Also note that the first form may be a macro symbol, although at compile time the read form is traversed and all macro forms are expanded by the associated macros. In a way, macros act like functions that are run during compilation and are handed the tail of the list whose head was the symbol associated with the executed macro. That way a macro can do any kind of transformation on the handed list and return another list. If the returned list still contains macro forms, it's expanded again and again until the form reaches a state where all the head symbols are either special forms or names of functions to be called. I'm shifting a little into macro discussion here, but I suspect that the desire for homoiconicity was related. A lot of other special things, like .apply in js, can then be trivially expressed as macros.
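The expansion loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: forms are nested Python lists, symbols are plain strings, the macros unless and when-not are made up for the example, and special cases such as quoted forms are ignored.

```python
def unless(args):
    """Hypothetical macro: (unless cond body) -> (if cond nil body)."""
    cond, body = args
    return ["if", cond, None, body]

def when_not(args):
    """Hypothetical macro that expands into another macro form."""
    return ["unless"] + args

# Macro table: head symbol -> function that receives the tail of the list.
MACROS = {"unless": unless, "when-not": when_not}

def expand(form):
    """Expand macro forms until no head symbol names a macro."""
    if not isinstance(form, list) or not form:
        return form
    # Keep expanding while the head is a macro symbol.
    while isinstance(form[0], str) and form[0] in MACROS:
        form = MACROS[form[0]](form[1:])
        if not isinstance(form, list) or not form:
            return form
    # Then expand every nested form.
    return [expand(item) for item in form]

expanded = expand(["when-not", "ready", ["print", ["unless", "quiet", "msg"]]])
```

After expansion only if and print remain in head position, which matches the end state described above.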
I would really advise taking the time to watch these two videos. Rich Hickey goes through and
explains language forms, types and how the reader and evaluator work. I think a lot of it is really relevant:
http://www.youtube.com/watch?v=ketJlzX-254
http://www.youtube.com/watch?v=sp2Zv7KFQQ0
All in all, I do think that starting with a very minimalistic subset of lisp / scheme / clojure syntax would be better, since it's not too different but has been time tested, and lots of
tooling and knowledge have been accumulated over the years.
Part of me also wishes data structures were immutable by default. I really love how
powerful and natural it can be with the right language integration (clojure is an example
that comes to mind). I definitely see reasons for mutation, but making it second
class makes a lot of difference.
Thanks for the feedback. I can't respond inline here so I'll try to address your comments.
First I feel it would be useful to state my goals. That would explain a lot of the design decisions I've made.
My goal is to make a super simple and minimal scripting language with syntax that is familiar to JavaScript developers. I want it to be easy to learn and easy to implement. This is probably because I'll be implementing it for a few targets and I'm always teaching programming.
My two main targets initially will be an interpreter implemented in JavaScript that can run in node.js and the browser. Then I'll write a C VM that is trivial to embed into any program. I'll make them fast where it's not a lot of work, but I won't sacrifice maintainability of code to make it even faster. This means I probably won't write a JIT engine any time soon for any target. Also since the data structure is the code, there are limits on what kind of re-writing I can do. The changes would be visible to users. This does mean, however, that user-space libraries can do optimizations on the code by rewriting functions.
I don't want lisp syntax. As nice and simple as it is, there are negatives to it as well. In my opinion, the lack of syntax is why lisp never really became that popular. It's hard for a developer to see the structure. The learning curve is rather steep, especially for anyone with experience with the popular languages.
I do want the ability to represent code as data structures. This makes all kinds of meta programming possible and is really fun too. The fact that my representation is very lisp-like is a side effect, not a goal. S-expressions are simply a great way to represent an abstract syntax tree.
Now regarding Buffers, Strings, and Lists acting the same, I absolutely agree. In the latest version of types.md I allow the same operations on all three where possible. There are differences as well though. Strings are immutable, buffers are not, and lists are obviously mutable. The idea to make list operations work on all types is interesting. I could make the "#" (length) operator work on all types and return 1 for most of them. Also [] indexing would allow indexing at 0 to get the value. I'm not sure how useful this is. Do you have any examples of where it's super convenient to treat all types as lists?
I do have a list of all falsy values. Currently it's only "nil" and "false". I treat 0 and "" (and everything else) as truthy values.
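As a small illustration of that rule, here is a Python sketch of a truthiness check. The name is_truthy is hypothetical and None stands in for nil. Identity checks are used deliberately, because in Python 0 == False is true and an equality test would wrongly make 0 falsy.

```python
NIL = None  # stand-in for the language's nil

def is_truthy(value):
    """Only nil and false are falsy; everything else, including 0 and "", is truthy."""
    return value is not NIL and value is not False
```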
The symbols are weird. I've contemplated using an @var special form instead, but that doesn't change semantics, just makes it more verbose. I think since they are so common, it's fine using the special syntax of :sym.
The weird part is where concrete syntax is mixed with building data structures of the abstract syntax. What does the following compile to?
let x = 2
let plusTwo = [@fn nil ["x"] [@add :x x]]
plusTwo!(2) --> 4
There is both a literal :x symbol and a x variable. The intent is to evaluate the current value of x and create a function that adds 2 to whatever is passed in. What I've come up with so far is to allow multiple levels of symbols. This 3-line program would compile to the following program:
[@fn nil []
  [@let :x 2]
  [@let :plusTwo [@fn nil ["x"] [@add ::x :x]]]
  [@call :plusTwo 2]
]
Note the difference between "::x" and ":x". It's like double escaping the symbols so they become live variables at the right time.
I've since renamed @fn and @fni to be @def and @fn. Only [@fn ...] expressions can be executed and their second value is the lexical scope they were created in. In my couple of examples above I left it nil to keep things simple. Normally people will write functions using the concrete syntax form.
let square = {|x| x * x}
Which is compiled to:
[@let :square [@def ["x"] [@mul :x :x]]]
And when this line is executed as part of a function, the variable "square" in the local scope will be assigned the value:
[@fn localScope ["x"] [@mul :x :x]]
So the [@def ...] forms are the function prototypes and the [@fn ...] forms are the function instances with live lexical scopes. This allows me to implement closures. Also it makes it possible for users to manually create new closures and treat them as mutable objects. This is both powerful and scary. I'm excited to see how it's used in practice.
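The prototype/instance split can be sketched with a toy evaluator in Python. Everything here is an assumption made for illustration: the Scope class, the sym helper standing in for :name, and the restriction to a single body expression and to the @def, @let, @mul and @call forms. It is not the actual interpreter.

```python
class Scope:
    """A lexical scope: local variables plus a link to the enclosing scope."""
    def __init__(self, parent=None):
        self.vars = {}
        self.parent = parent

    def lookup(self, name):
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.parent
        raise NameError(name)

def sym(name):
    return ("sym", name)  # hypothetical stand-in for :name

def evaluate(form, scope):
    if isinstance(form, tuple) and form[0] == "sym":
        return scope.lookup(form[1])
    if not isinstance(form, list):
        return form  # numbers and other literals evaluate to themselves
    head = form[0]
    if head == "@def":  # prototype becomes an instance bound to the current scope
        _, params, body = form
        return ["@fn", scope, params, body]
    if head == "@let":
        _, target, value = form
        scope.vars[target[1]] = evaluate(value, scope)
        return scope.vars[target[1]]
    if head == "@mul":
        return evaluate(form[1], scope) * evaluate(form[2], scope)
    if head == "@call":
        fn = evaluate(form[1], scope)
        return call(fn, [evaluate(arg, scope) for arg in form[2:]])
    raise ValueError(f"unknown form: {head!r}")

def call(fn, args):
    _, captured, params, body = fn
    local = Scope(parent=captured)  # new scope chained to the captured one
    local.vars.update(zip(params, args))
    return evaluate(body, local)

top = Scope()
top.vars["factor"] = 3
evaluate(["@let", sym("triple"), ["@def", ["x"], ["@mul", sym("x"), sym("factor")]]], top)
result = evaluate(["@call", sym("triple"), 14], top)
```

The value stored in triple is an ["@fn", scope, params, body] list, and the call finds factor through the captured scope, which is what makes it a closure.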
Thanks for the links, I'll look at them tomorrow when I'm not surrounded by sleeping children.
Oh regarding the dual list/map type. I did that to simplify the language, but now I think it complicates things. I've worked out how they work in my semantics and it was a lot more complicated than I expected. Though there are some benefits. I realized I can easily add named arguments to function calls. Just enable alias syntax in the call and treat the arguments list as a list. The function form that declares the argument names can treat those names as alias offsets.
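One way to picture names acting as alias offsets is the following Python sketch. The function bind_arguments and its signature are hypothetical, and None stands in for nil; the real alias semantics may well differ.

```python
def bind_arguments(params, positional, named):
    """Place call arguments into slots, resolving each name to its offset."""
    if len(positional) > len(params):
        raise TypeError("too many positional arguments")
    # Each declared parameter name is an alias for a position in the list.
    offsets = {name: index for index, name in enumerate(params)}
    slots = [None] * len(params)  # None stands in for nil
    slots[:len(positional)] = positional
    for name, value in named.items():
        slots[offsets[name]] = value  # raises KeyError for an unknown name
    return slots

args = bind_arguments(["x", "y", "z"], [1], {"z": 3})
```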
Also, FYI a couple days into this there was a commit where I gave up and just put in a note to learn clojure instead. Then after a day of that I deleted my comment and went back to designing my language. I mostly know what I want now and I'm excited to make it.
> My goal is to make a super simple and minimal scripting language with syntax that is familiar to JavaScript developers. I want it to be easy to learn and easy to implement. This is probably because I'll be implementing it for a few targets and I'm always teaching programming.
Yeah, I had not realized it was for teaching purposes, so I understand the objectives a lot better now.
> My two main targets initially will be an interpreter implemented in JavaScript that can run in node.js and the browser. Then I'll write a C VM that is trivial to embed into any program. I'll make them fast where it's not a lot of work, but I won't sacrifice maintainability of code to make it even faster. This means I probably won't write a JIT engine any time soon for any target. Also since the data structure is the code, there are limits on what kind of re-writing I can do. The changes would be visible to users. This does mean, however, that user-space libraries can do optimizations on the code by rewriting functions.
> I don't want lisp syntax. As nice and simple as it is, there are negatives to it as well.
I do understand that. What I was suggesting was to just use lispy syntax for the AST and then have more traditional
syntax(es) that would just desugar to it. Although, from your reply I get the impression that you want to do
code transformations at runtime rather than at compile time (or, to be more precise, macro expansion time), which is
different from what lisp does. I personally find it scary, probably because I strive towards stateless code, while
this adds yet another layer of mutability :D Not only can state change while your program runs, but even the function
source :)
> In my opinion, the lack of syntax is why lisp never really became that popular. It's hard for a developer to see the structure. The learning curve is rather steep, especially for anyone with experience with the popular languages.
I agree this is part of the reason, but I think a bigger role in its failure was played by the fact that lisp was way over the
head of the hardware when it was introduced. As a consequence, it was no match for other
mainstream languages in terms of performance. Either way, it's just a hypothesis and not really relevant :)
> I do want the ability to represent code as data structures. This makes all kinds of meta programming possible and is really fun too. The fact that my representation is very lisp-like is a side effect, not a goal.
I do understand that. I was just suggesting taking full advantage of that side effect and leveraging something that
has been battle tested, again just for the AST, not actually having a lisp syntax.
> S-expressions are simply a great way to represent an abstract syntax tree.
But as far as I understand, you don't actually have S-expressions. What I mean is that the fact that every list evaluates to a value, e.g. (if foo bar baz),
is super handy. That is not the case in many mainstream languages (if, switch, and while
blocks don't evaluate to values), which therefore force a stateful approach. I don't know how much value
that would bring to this project, though.
> Now regarding Buffers, Strings, and Lists acting the same, I absolutely agree. In the latest version of types.md I allow the same operations on all three where possible. There are differences as well though. Strings are immutable, buffers are not, and lists are obviously mutable. The idea to make list operations work on all types is interesting. I could make the "#" (length) operator work on all types and return 1 for most of them. Also [] indexing would allow indexing at 0 to get the value. I'm not sure how useful this is. Do you have any examples of where it's super convenient to treat all types as lists?
Yeah, I forgot about mutability; it works for clojure because everything is immutable. As for examples, yes I do, although it's hard to explain without actually experiencing it. I've been writing a lot of stuff in terms of reducers,
which is inspired by exactly that feature of clojure. The cool thing is that this allows us to define logic in terms of
abstractions without being bound to specific data types. There are a bunch of reducer-based libs:
https://github.com/Gozala/reducers/wiki/What-can-I-reduce
But I'll point out only a few to make the point. Reducers work with all the built-in data types and can also represent
things like node streams or DOM events. New things can also be made reducible, and callback-reduce does
exactly that. In short, this means that I can concat, map, or expand anything without caring what exactly that thing is,
whether it's a stream, a number, DOM events, or maybe even the eventual result of some callback-based API call. In fact,
you can abstract not only over data types but also over time. What's cool about this is that support for new types can be
added without changing either your functions or those types. That's how stream-reduce makes node streams
reducible. This also overlaps a little with clojure protocols, which is another interesting subject that you can check
out if curious (a lot better than OOP methods IMO).
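The claim that new types can be made reducible without changing either the functions or the types can be illustrated with Python's functools.singledispatch, which plays a role loosely similar to a protocol here. This is only an analogy, not the reducers library's API: reduce_any, total_length and Countdown are hypothetical names, and it covers abstraction over types only, not over time.

```python
from functools import singledispatch

@singledispatch
def reduce_any(source, step, initial):
    """Fallback: a lone value reduces as if it were a one-item collection."""
    return step(initial, source)

@reduce_any.register(list)
@reduce_any.register(tuple)
@reduce_any.register(str)
def _(source, step, initial):
    result = initial
    for item in source:
        result = step(result, item)
    return result

def total_length(source):
    """Written once against reduce_any; knows nothing about concrete types."""
    return reduce_any(source, lambda acc, item: acc + len(str(item)), 0)

# Later, a new type becomes reducible without touching total_length or the class.
class Countdown:
    def __init__(self, start):
        self.start = start

@reduce_any.register(Countdown)
def _(source, step, initial):
    result = initial
    for n in range(source.start, 0, -1):
        result = step(result, n)
    return result
```

total_length then accepts strings, lists, lone numbers and Countdown values alike, even though it was written before Countdown existed.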
> The symbols are weird. I've contemplated instead using a special @var special form, but that doesn't change semantics, just makes it more verbose. I think since they are so common, it's fine using the special syntax of :sym.
> The weird part is where concrete syntax is mixed with building data structures of the abstract syntax. What does the following compile to?
Yeah, as already mentioned, I had not realized you were planning on doing runtime transformations, and I see how
it becomes weird. Have you considered a lisp-like read -> expand -> compile flow? In that case all such forms would
be expanded before compile time, making it a non-issue, because the reader would first desugar your syntax forms back to list forms, then expand, then compile. Although, in order to embed AST forms you'll have to quote them like lisp does:
http://en.wikipedia.org/wiki/Lisp_%28programming_language%29#Self-evaluating_forms_and_quoting
and in order to execute, just unquote :)
In fact, what you do is very similar, with the difference that you quote each symbol and use different chars to do it. You can
also quote each symbol in lisp, and in some cases that's handy, but in most cases quoting the whole list and unquoting parts of it
is a lot simpler.
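Quoting a whole list and unquoting parts of it can be sketched in Python as follows. This is a rough model only: forms are nested lists, symbols are plain strings, and the Unquote marker class and the env dictionary are hypothetical stand-ins for lisp's unquote and the current variable bindings.

```python
class Unquote:
    """Marks a name whose current value should be spliced into a quoted form."""
    def __init__(self, name):
        self.name = name

def quasiquote(form, env):
    """Return form as data, replacing Unquote markers with values from env."""
    if isinstance(form, Unquote):
        return env[form.name]
    if isinstance(form, list):
        return [quasiquote(item, env) for item in form]
    return form  # everything else stays as literal data

env = {"x": 2}
# The whole function form is quoted; only the outer x is unquoted.
plus_two = quasiquote(["fn", ["x"], ["add", "x", Unquote("x")]], env)
```

This mirrors the plusTwo case from earlier in the thread: the parameter x stays a symbol, while the outer x is replaced by its value at the time the list is built.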
> Oh regarding the dual list/map type. I did that to simplify the language, but now I think it complicates things. I've worked out how they work in my semantics and it was a lot more complicated than I expected. Though there are some benefits.
Yeah, one thing that actually annoys me a little about lua is that a table is both an array and a map, and it can be confusing
how it is meant to be used. It very well could be my inexperience speaking, though.
> Also, FYI a couple days into this there was a commit where I gave up and just put in a note to learn clojure instead. Then after a day of that I deleted my comment and went back to designing my language. I mostly know what I want now and I'm excited to make it.
:D So it was so close... Either way, I think clojure is really worth learning; it's a very well balanced mix of lisp & haskell IMO.
Please don't take my comments as DO LISP. It's just that, as you mentioned yourself, there is quite an overlap and lisp
has already figured a bunch of things out, so it may be useful to take advantage of that where it makes sense. As a matter of
fact, this got me curious, and I might even try to add traditional syntax to wisp and see how they can cohabitate
together. I was going to add support for JSON syntax anyway ;)
P.S.: Inline comments are easy: just copy the text and add > in front of the paragraph you're replying to ;)
Alright, I've implemented enough to start writing the interpreter and I now have a much firmer understanding of how my own syntax works. I'm not sure what the proper terms are, but the sugar syntax is simply a compile-time macro built into the parser. So the language has no idea if you wrote the long-hand or the short-hand. https://github.com/creationix/jack2/blob/master/sample.jk
For example, the following are two identical ways to express the same thing:
let number = 42
and
[@let number 42]
Notice that I don't need a symbol in either form. If I just want to create a list that contains the code, I would have to do something like:
let code = [nil :number 42]
code.0 = @let
So while you can manipulate the code at runtime, it's an advanced technique and not super easy in the language. Also the function definition form could be used.
let code = {|| let number = 42 }.3
I'm still working out how this works in practice.
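The point that the language cannot tell short-hand from long-hand can be modelled with a toy reader in Python. It is a deliberately tiny sketch that only understands an integer let statement; the name read and the regular expression are assumptions made for the example, not how the real parser works.

```python
import re

# Matches the sugar form: let <name> = <integer>
LET_PATTERN = re.compile(r"^let\s+([A-Za-z_]\w*)\s*=\s*(-?\d+)$")

def read(source):
    """Read one statement; the sugar and the long-hand yield the same list."""
    source = source.strip()
    match = LET_PATTERN.match(source)
    if match:
        # Sugar is rewritten to the list form while reading.
        return ["@let", match.group(1), int(match.group(2))]
    if source.startswith("[") and source.endswith("]"):
        head, name, value = source[1:-1].split()
        return [head, name, int(value)]
    raise SyntaxError(source)

short_hand = read("let number = 42")
long_hand = read("[@let number 42]")
```

Both calls produce the same data structure, so anything running after the reader sees no difference between the two spellings.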
Now regarding the list type, I think I will change the language to have map and list as two separate types. In practice the combined type complicates things rather than simplifying them.