rabbitmq/erlando

The syntax of cuts

Closed this issue · 17 comments

ferd commented

Given how 'cut' works and the special cases for the syntactic nesting as mentioned in https://github.com/rabbitmq/erlando/blob/bug24025/README.md, it feels to me that the cut syntax should rather be wrapped in a special function call of the form cut(<Exp>), as is the case with many other parse transforms.

To explain, let's reuse your example:

list_to_binary([1, 2, math:pow(2, _)]).

This, as the text says, would yield

list_to_binary([1, 2, fun (X) -> math:pow(2, X) end]).

and not

fun (X) -> list_to_binary([1, 2, math:pow(2, X)]) end.

If instead the syntax was cut(Expression), you could easily choose between

list_to_binary([1, 2, cut(math:pow(2,_))])

and

cut(list_to_binary([1, 2, math:pow(2, _)])). 

You could also increase the flexibility of patterns such as

 F = [1, _, _, [_], 5, 6, [_], _]
 [1, 2, 3, G, 5, 6, H, 8] = F(2, 3, 8),

Now possibly being

F = cut([1, _, _, [_], 5, 6, [_], _])
[1, 2, 3, [4], 5, 6, [7], 8] = F(2,3,4,7,8)

Which seems rather tedious to do with the current syntax.
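
For comparison, the behaviour proposed for the deep cut above can be written by hand as an ordinary fun. This is only a sketch of the intended semantics in plain Erlang, not output of the erlando parse transform; the module name deep_cut_sketch and the parameter names are made up for illustration.

```erlang
-module(deep_cut_sketch).
-export([demo/0]).

demo() ->
    %% Hand-written equivalent of the proposed
    %%   F = cut([1, _, _, [_], 5, 6, [_], _])
    %% with one parameter per hole, taken in left-to-right order.
    F = fun(A, B, C, D, E) -> [1, A, B, [C], 5, 6, [D], E] end,
    [1, 2, 3, [4], 5, 6, [7], 8] = F(2, 3, 4, 7, 8),
    ok.
```

The point of the proposal is that the cut(...) form would save writing out the five parameter names while still reaching the nested holes.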

Other advantages would include a better fit with currently existing parse transforms. The best example of this is for QLC, where instead of basing the parse transform on its context in order to do things, you simply wrap the list comprehension in a qlc:q(LC) call, or ets' fun2ms where you do ets:fun2ms(Fun). In this last case, it also allows the code to have special cases for shell functions (whose internal structure is more like abstract code) to be able to turn them into match specifications transparently.

Given the advice of some Erlangers on IRC, wrapping things in a cut(Exp) call also makes it easier to parse visually, denotes intent better, is more flexible and serves as a warning sign that magic is happening at this place (and it's not an error!). Without having too much experience with parse transforms myself, I will make the audacious call that it might make things easier to parse in the first place for the parse transform, but I might well be wrong on that one, especially since the code has already been written.

Note that this view also aligns better with the Scheme implementation it is inspired by, given that the Scheme implementation uses a special (cut <exp>) syntax to wrap its use.

msackman commented

Cut is not meant to replace fun abstractions in anything but the simplest of cases. IMO, the more powerful things that you're requesting really need full funs to deal with them. Note that the Scheme implementation also takes the same stance: the holes created by <> have to appear in a cut(...) immediately wrapping them (i.e. they can't go deep). This is basically a rabbit hole: if you allow them to go deep then why wouldn't you permit something like:

foo() ->
    1 + _.

bar() ->
    3 = (cut(foo()))(2).

or some such craziness? funs have their place, and cut should IMO be a very lightweight and simple shorthand.

The question of whether or not to have a cut(...) wrapper is less clear cut (HONK!). Certainly, yes, Scheme's cut does, but that's at least partly so that you can distinguish cut from cute. Legibility is very important and I'm not yet sure in my own mind whether or not the cut(...) wrapper would aid comprehension: you'd probably want your syntax highlighter to colour it differently. I think that being able to write

F = case _ of
        ...

is extremely elegant and clear, whereas

F = cut(case _ of
            ...
    )

is less so. I think time will tell on this one: just because an expression is prefixed with cut( does not make it any more obvious where the _ are. From a technical pov of writing the code, it makes next to no difference in complexity.

si14 commented

There is another problem that can be solved with explicit cuts. Look at this:
Loop_F = fun(X) -> handle_http(#http_state{req=X}) end
At first glance it can be rewritten as
Loop_F = handle_http(#http_state{req=_})
but in fact this one is
Loop_F = handle_http(fun(X) -> #http_state{req=X} end)
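
The difference between the two readings can be checked with hand-written funs in plain Erlang. This is a sketch under assumptions: the record definition and the handle_http/1 clauses below are hypothetical stand-ins (the real handler is not shown in this thread), and no parse transform is involved.

```erlang
-module(nested_cut_sketch).
-export([demo/0]).

-record(http_state, {req}).

%% Hypothetical handler that reports what kind of argument it was given.
handle_http(#http_state{req = Req}) -> {state, Req};
handle_http(F) when is_function(F, 1) -> {got_fun, F(ping)}.

demo() ->
    %% Intended meaning: the fun wraps the whole call, so nothing runs
    %% until Intended is applied.
    Intended = fun(X) -> handle_http(#http_state{req = X}) end,
    {state, ping} = Intended(ping),
    %% Reading described above: the fun wraps only the record, so
    %% handle_http runs immediately and receives a fun as its argument.
    Actual = handle_http(fun(X) -> #http_state{req = X} end),
    {got_fun, #http_state{req = ping}} = Actual,
    ok.
```

In the second form Loop_F would hold the result of handle_http rather than a fun, which is the surprise being pointed out.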

ferd commented

Another issue. What is the actual result of an expression like

{A,B,C} = {X, _, Z} = Exp

Does the middle expression {X, _, Z} get a cut as the right-hand side expression of {A,B,C}, or is it a regular pattern match given that it's on the left-hand side of Exp?

Why can't anyone see this is a stupid idea?
https://github.com/rabbitmq/erlando/blob/master/test/src/test_cut.erl
This code requires all these comments just to make heads or tails of it...
All of the benefits gained from brevity are instantly lost to obfuscation and incompatibilities with the existing meaning of _

That sounds like a problem on the part of the reader, more than the parse transform.

Once you understand what they do, the only problem I see is visual ambiguity with _'s conventional usage. Having to figure out whether it's on the left or right side of an equals sign is a bit annoying.

@ferd

{A,B,C} = {X, _, Z} = Exp

The _ is in a pattern position - you are matching against it. Thus there is no cut there.
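
The pattern reading can be seen in plain Erlang with no parse transform applied. A minimal sketch; the module name and the sample tuple are made up, and it does not exercise erlando itself.

```erlang
-module(match_chain_sketch).
-export([demo/0]).

demo() ->
    Exp = {1, 2, 3},
    %% The match operator is right-associative, so this reads as
    %%   {A, B, C} = ({X, _, Z} = Exp)
    %% The middle term is a pattern and its _ is the usual wildcard.
    {A, B, C} = {X, _, Z} = Exp,
    {A, B, C, X, Z}.
```

Here demo() returns {1, 2, 3, 1, 3}: the inner match binds X and Z and yields Exp, which the outer pattern then matches.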

@amtal

I agree - if the Erlang parser were far less picky about syntax, another token, one which is unused, would have been a better choice, to avoid the overloading. Writing a few parse transforms is one thing. Rewriting the Erlang parser is another ;)

ferd commented

I thought I could reply to your first arguments, msackman. Sorry for the delay on that.

foo() ->
    1 + _.

bar() ->
    3 = (cut(foo()))(2).

It wouldn't make sense to allow the cut in this case because:

  1. the way function calls work in Erlang doesn't allow that;
  2. there is no closure to be built with such a form, no matter what the existing semantics are;
  3. I can't see how this would parse with mandatory, wrapped cuts, given you'd have no access to the syntax inside;
  4. it would not work with funs anyway.

If you look at my suggestions, there is no ambiguity, only a shorthand for funs, basically.

funs have their place, and cut should IMO be a very lightweight and simple shorthand.

This is a fairly good point. While my approach is still shorter and more lightweight than funs, yours is even more so. I still think that being able to clearly identify a cut from afar (and even look them up with a quick search) would be beneficial, also given the ability to nest them.

I'm obviously not in charge of the lib and likely won't take the time to fork and fix, so I'll wish you a nice day and carry on :)

Erlang variables can be [A-Z_][a-zA-Z0-9_@]* so it's possible to easily change the cut syntax to __, or _@, or something like that.

@amtal

Indeed. However, _ has the advantage that it can't be bound to, unlike those others. By overloading _ as cut does, I can guarantee that I don't break anyone's existing code. The same couldn't be said of using something like _@ etc. The ideal would be to be able to use some token that is not a variable, e.g. &, or even better, <>
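
To illustrate the point about binding: in plain Erlang, names such as __ and _@ are ordinary variables that existing code may already bind and read, whereas a bare _ is only ever a wildcard in patterns. A minimal sketch; the module name is hypothetical.

```erlang
-module(underscore_names_sketch).
-export([demo/0]).

demo() ->
    %% Both of these are legal variable names, so code like this can
    %% already exist; a transform that gave them a new meaning would
    %% change its behaviour.
    __ = 1,
    _@ = 2,
    {__, _@}.
```

A bare _ used as a value, by contrast, is rejected by the standard compiler, so no existing program can depend on it.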

si14 commented

@msackman
Maybe it would be a good idea to make both cut() and cut()-less versions of cuts available? You are right in your arguments, but there are at least 2 reasons to make cut() available:
- making not-so-easy cases clearer;
- making nested cuts possible (like Loop_F = fun(X) -> handle_http(#http_state{req=X}) end).

@si14 Sure - I'm not claiming that the arguments I'm putting forward out-weigh the gains of the phantom cut marker, but it's probably something I'd have to do on my own time... not sure. I'll see when I can get around to it. Obviously, the more I actually get to use this stuff, the more I can get a feeling of how urgent this addition would be.

As I was saying on IRC, and you probably weren't there, don't overuse the < and > characters. They're already used for <<, >>, =<, >=, and it's already annoying that =<< causes errors. & is a much better choice, which is what I proposed on IRC last night. Of course I don't think either can be used, because they would prevent compilation of the file as they're not defined to begin with.

@essen, yeah good point about < and > already being problematic. Again, the issue is the parser... someone really needs to rewrite that thing.

While on the subject of tagged cuts, does anyone have thoughts on the "cute" variant?

It evaluates arguments when the fun is created, rather than when the fun is called. A concise way to lift invariant work out of loops.
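
The difference in evaluation time can be sketched with hand-written funs in plain Erlang. Everything here is an assumption made for illustration: expensive/0 is a hypothetical stand-in for invariant work, it counts its own calls in the process dictionary, and neither fun is produced by erlando.

```erlang
-module(cut_vs_cute_sketch).
-export([demo/0]).

%% Hypothetical costly, invariant computation. It bumps a counter in
%% the process dictionary so the number of evaluations can be observed.
expensive() ->
    N = case get(calls) of undefined -> 0; C -> C end,
    put(calls, N + 1),
    [1, 2, 3].

demo() ->
    %% cut-like: the non-hole argument is evaluated on every call.
    put(calls, 0),
    Cut = fun(X) -> lists:member(X, expensive()) end,
    true = Cut(1),
    false = Cut(9),
    CutCalls = get(calls),
    %% cute-like: the non-hole argument is evaluated once, when the
    %% fun is created, and the fun closes over the result.
    put(calls, 0),
    Invariant = expensive(),
    Cute = fun(X) -> lists:member(X, Invariant) end,
    true = Cute(1),
    false = Cute(9),
    CuteCalls = get(calls),
    {CutCalls, CuteCalls}.
```

demo() returns {2, 1}: two evaluations for the cut-like fun called twice, one for the cute-like fun regardless of how often it is called.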

For anybody else who stumbles on this issue, here's my attempt at implementing a cut/cute wrapper in order to resolve the above: https://github.com/jkrukoff/partial

Team RabbitMQ considers this project abandoned and will archive this repository. Anyone can feel free to push the code to a new repo and continue maintenance. Thanks!