Consider having syntactic lvalue functions/operators in the language
masak opened this issue · 2 comments
More and more, it feels like Alma is turning into a language for talking about extending the language, and that certainly includes lvalues (#214), which are not first-class to the muggle user but are quite important for the macro author.
But lvalues have an interesting omission: when we talk about access paths like `x.y().z[w]`, note that all of those parts are typically lvaluable, except the `y()` part. Why is it that function calls are the only things in access paths that can't be the final part that's assignable?
From that question springs the obvious idea of making functions lvaluable.
    @lvalue
    func foo() { ... }
    # meanwhile, later
    foo() = 42;
First, and most importantly, this feature is syntactic, a term we're also seeing pop up more and more. (#158, #210, #517) It kinda means the opposite of "first-class" — if something is syntactic, you can't carry it around, at least not in such a way that it escapes and the compiler can't track it anymore.
For lvalue functions, we can be even more restrictive than that: you have one shot to assign, namely just after you called the function. If the thing immediately surrounding the function is an assignment, then you're allowed to assign. Otherwise, the function feigns incomprehension, and you're out of luck.
How will this work in practice? Here's an idea: a function is generated automatically, parallel to your (rvalue) function. It has a gensym'd name which you're not allowed to ask about, but we can think of it as being called `lvalue_foo` or some such. Since the compiler can know statically both that the function is `@lvalue`-annotated and that the call is surrounded by an assignment, it can go and replace the call as one of its AST phases.
I think that doing it this way makes the abstraction "zero cost", in the sense that we're never dealing with `Location`s that escape, and therefore we can always collapse them to simple assignments on the opcode level.
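To make the rewrite step a bit more concrete, here is a minimal sketch in Python (not Alma) over a toy AST. The node classes `Call`, `Assign` and `Literal`, the `LVALUE_FUNCS` registry, and the convention of passing the assigned value as the last argument of the generated function are all assumptions made for illustration; none of this is Alma's actual compiler API.

```python
from dataclasses import dataclass


@dataclass
class Literal:
    value: object


@dataclass
class Call:
    name: str
    args: list


@dataclass
class Assign:
    target: object
    value: object


# Hypothetical registry filled in when the compiler sees an @lvalue annotation:
# rvalue function name -> name of the generated (gensym'd) parallel function.
LVALUE_FUNCS = {"foo": "lvalue_foo"}


def rewrite_assignment(node):
    """Rewrite `f(args) = v` into `lvalue_f(args, v)` at compile time.

    Only an assignment whose immediate target is a call gets rewritten;
    every other node is returned unchanged.
    """
    if isinstance(node, Assign) and isinstance(node.target, Call):
        call = node.target
        setter = LVALUE_FUNCS.get(call.name)
        if setter is None:
            # The function "feigns incomprehension": it was never annotated.
            raise TypeError(f"{call.name}() is not an lvalue function")
        # Assumed convention: the assigned value becomes the last argument.
        return Call(setter, [*call.args, node.value])
    return node


# foo() = 42;
rewritten = rewrite_assignment(Assign(Call("foo", []), Literal(42)))
```

Since the result is an ordinary `Call` node, nothing resembling a `Location` survives the rewrite, which is the "zero cost" property in miniature.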
Actually, let's try to be a bit more precise in mapping out this idea.
- There are lvalue producers. These include variables, array indexings, and object slot accesses. Basically anything that a #214 `Location` would allow you to set. Thanks to this issue, the set of lvalue producers is open and also includes functions/operators you define that have an `@lvalue` annotation. If you annotate your function/operator with `@lvalue`, the compiler makes sure you're also returning an lvalue producer at every routine exit.
- There are also lvalue consumers. The obvious example is `infix:<=>`, although it's a bit special/axiomatic, and probably needs special lower-level treatment. More about other lvalue consumers below. When we say "If the thing immediately surrounding the function is an assignment" above, "assignment" should actually be widened to "lvalue consumer".
We now turn to which parts of Alma, besides assignment, would already benefit from being considered lvalue consumers. This list grows more controversial as it progresses.
- `prefix:<++>` (and `prefix:<-->`). This operator updates an lvalue, and could just return the resulting rvalue, but where's the fun in that? (To do: check which languages actually return an lvalue here. I know Perl 6 does it...)
- Array slices. (#291)
- `infix:<&&>`, `infix:<||>`, `infix:<//>`, and (very carefully) `infix:<^^>`. Yes, Perl 6 does this one too. I'm not claiming that any of this leads to readable code. The argument is more It's There, and Therefore We Should.
- `infix:<?? !!>` or whatever the heck the operator is called.
- `my s = "replace me"; s.substr(7, 3) = "d";`: we'd need to return a user-created `Location` that's very much like a Perl 6 `Proxy`, in that it knows how to replace the underlying `s` string. Again, the compiler will just inline such an object.
- `(string ~~ /regex) = "replacement";`: I dunno, I find this one charming. No need for the `s///` syntax. Note, interestingly, that the parentheses are not necessary, but contribute to a "visual pill" that improves readability. I don't believe I've encountered unnecessary parentheses that I favor before.
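The `substr` item is probably the easiest one to sketch. Below is a rough Python model (not Alma) of what such a user-created `Location` might look like: a fetch/store pair closed over the variable being updated. The `Location` class, the `substr_location` helper and the `env` dictionary standing in for a variable scope are hypothetical names invented for this example.

```python
class Location:
    """Hypothetical fetch/store pair, loosely in the spirit of a Perl 6 Proxy."""

    def __init__(self, fetch, store):
        self.fetch = fetch
        self.store = store


def substr_location(env, name, start, length):
    """Build a Location for the slice [start, start + length) of variable `name`.

    Strings are immutable here, so the Location has to remember which
    variable holds the string (env + name), not just the string value.
    """

    def fetch():
        return env[name][start:start + length]

    def store(repl):
        s = env[name]
        env[name] = s[:start] + repl + s[start + length:]

    return Location(fetch, store)


# my s = "replace me"; s.substr(7, 3) = "d";
env = {"s": "replace me"}
loc = substr_location(env, "s", 7, 3)
before = loc.fetch()   # the three characters about to be replaced
loc.store("d")
after = env["s"]
```

In the scheme described above the `Location` object would never exist at runtime; the compiler would inline the body of `store` at the assignment site. The object is spelled out here only to show what gets inlined.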
As we disappear in a mist of total handwaving, it's interesting to note how the above seem to care about the "lvalueness" of some of their parameters, and how this information needs to be propagated to the compiler when a routine is used as an lvalue, so that `5 && x` is fine as an rvalue expression, but `(5 && x) = y` isn't, because `5` is not lvaluable.
Normally, type information works its way from AST leaves up to the root, but lvalue calculations are clearly contextual, and work the other way, from the outside and into expressions. In the above case, it's the `=` that finds the `5` wanting.
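The outside-in direction can be sketched as a small recursive check, again in Python rather than Alma. The node classes, the `PASS_THROUGH` set, and in particular the rule that a short-circuiting operator requires both of its operands to be lvaluable are assumptions for the sake of the example; the text above only establishes that `(5 && x) = y` must be rejected.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str


@dataclass
class Lit:
    value: object


@dataclass
class BinOp:
    op: str
    left: object
    right: object


# Assumption: these operators may return either operand, so in lvalue
# context the demand for "lvalueness" is passed down to both sides.
PASS_THROUGH = {"&&", "||", "//"}


def require_lvalue(node):
    """Push the lvalue requirement from the consumer down into the expression."""
    if isinstance(node, Var):
        return
    if isinstance(node, BinOp) and node.op in PASS_THROUGH:
        require_lvalue(node.left)
        require_lvalue(node.right)
        return
    raise TypeError(f"not lvaluable: {node!r}")


def check_assign(target, value):
    """The `=` consumer: only the target is checked, never the value."""
    require_lvalue(target)


# (x && z) = y is accepted under these assumptions.
check_assign(BinOp("&&", Var("x"), Var("z")), Var("y"))

# (5 && x) = y is rejected, because the literal 5 is not lvaluable.
try:
    check_assign(BinOp("&&", Lit(5), Var("x")), Var("y"))
    rejected = False
except TypeError:
    rejected = True
```

Note that nothing ever calls `require_lvalue` on a bare `5 && x`; the check only fires when a consumer like `=` asks for it, which is why the same expression stays legal as an rvalue.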
Related: Ref returns and ref locals
An idea I had a while ago was to have the syntax be this:
    func substr(str, start, length?) = repl {
        // ...
    }
That is, the additional `= repl` part indicates that this is an lvalue function. Clearly, this is a "definition follows use" kind of feature design. The inspiration for that comes from how Common Lisp allows `setf` definitions.