tc39/proposal-hack-pipes

`?.` required to be one and two tokens

fuchsia opened this issue · 12 comments

In value |> ?.foo, ?. is tokenised as a single token (OptionalChainingPunctuator) so can't be resolved into ? and . by the grammar.
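The single-token behaviour can be observed in any current engine without pipeline support. The sketch below is only an illustration: the helper name `parses` is hypothetical, and it uses `new Function` purely to ask the engine whether a snippet parses.

```javascript
// Returns true when `source` parses as a function body in this engine.
function parses(source) {
  try {
    new Function("o", source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// `?.` written contiguously is one token: an optional chain.
const contiguous = parses("return o?.foo;");

// Split by whitespace it becomes `?` followed by `.`, which is not an
// optional chain, and `.foo` cannot start the branch of a conditional.
const split = parses("return o? .foo;");

// The lookahead restriction on `?.` (no decimal digit may follow) keeps
// `o?.5:1` a conditional expression: `o ? .5 : 1`.
const conditional = new Function("o", "return o?.5:1;")(true);

console.log(contiguous, split, conditional); // true false 0.5
```

Because the lexer commits to `?.` before the grammar sees anything, a topic `?` followed by a member access would arrive at the parser as one punctuator, not two.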

Using a "unary operator" for this case is a possible workaround, but it's not straightforward because ?.foo matches the extant OptionalChain production. And you've still got to allow for ? . value or (?).value. (If it wasn't for this case, you could just add ? to PrimaryExpression and then disallow it outside a pipeline.)

Also value |> ?.foo?.bar seems pretty confusing. So, as much as I love ?, it's probably the wrong character. (?% isn't a legal expression or a token...)

Other than that, count me as TeamHack. This proposal achieves left to right function evaluation, which is the main goal. And I think making value |> foo(?) look like a function is a gain - the first time you see it, you'll have a good guess at what's going on.

See also tc39/proposal-pipeline-operator#91.

You raise a good point about OptionalChainingPunctuator. It should still be theoretically possible for the parser to distinguish ?.method from value?.method with a special grammar production using OptionalChainingPunctuator, but this would be quite bothersome. (Or we could force the developer to parenthesize, like in (?)?.blah, which is also not good.)

I don’t even like ? as the topic token much—it’s quite visually confusing not only with optional chaining but also with the ? : conditional operator and the ?? coalesce operator. We changed the token from # to ? only a few weeks ago, in an effort to make Hack pipes more palatable to some TC39 members. (I’m personally partial to % myself for its similarity to printf format strings, as well as Clojure’s % placeholder in #(…). @ or # is fine too.)

We’ve been delaying serious bikeshedding of the particular token to the future, since TC39 might decide against Hack pipes in general anyway (see tc39/proposal-pipeline-operator#91 (comment)). However, the OptionalChainingPunctuator issue might actually get in the way of implementing the Babel plugin.

Currently the plan is to have the Babel plugin support choosing which token to use as a plugin option, to allow experimentation with different tokens. But if using ? ends up being impractical to implement in Babel, then that might mean that we should disqualify ? as a possible token, anyway. We’ll see.

I plan to keep this issue open until we reach that Babel implementation point.

Just a suggestion: can't we introduce a keyword like it instead of ? or #? The name is vague, so using it as a variable name is already poor practice, and I doubt many libraries have a variable called it. JavaScript already has this, so the two pair well, and it makes pipelines much more readable. Adapting the sample from Placeholder Bikeshedding:

// Basic Usage
x |> f(it)     //-->   f(x)
x |> f(y)(it)  //-->   f(y)(x)
x |> f        //-->   Syntax Error

// 2+ Arity Usage
x |> f(it,10)   //-->  f(x,10)

// Async Solution (Note this would not require special casing)
x |> await f(it)          //-->  await f(x)
x |> await f(it) |> g(it)  //-->  g(await f(x))

// Other Expressions
f(x) |> it.data           //-->  f(x).data
f(x) |> it[it.length-1]    //-->  let temp=f(x), temp[temp.length-1]
f(x) |> { result: it }    //-->  { result: f(x) }

// Complex example
async anArray => anArray
 |> pickEveryN(it, 2)
 |> it.filter(...)
 |> makeQuery(it)
 |> await readDB(it, config)
 |> extractRemoteUrl(it)
 |> await fetch(it)
 |> parse(it)
 |> console.log(it);

// Optional Chaining
x |> it?.property

We could also use that, though it's too verbose. I don't mind if we choose ? or # at all, but if they are going to be overloaded or cause confusion, then why not replace them with a keyword?

Edit: I discovered that I'm not the first to suggest this.

Does it have to be a symbol? Why not use a predefined name, e.g. it, like Kotlin has with lambda expressions?

Originally posted by @robbie01 in tc39/proposal-pipeline-operator#91 (comment)

What about the keyword val as an alternative of it? demo

Originally posted by @aloisdg in tc39/proposal-pipeline-operator#91 (comment)

The only keywords available here are ones that already are reserved.

But if that keyword only applies in the RHS of the |> operator, wouldn't using it as an identifier gradually become discouraged, so that it could then be reserved easily?

Originally posted by @noppa in tc39/proposal-pipeline-operator#91 (comment)

@mAAdhaTTah But isn't this pretty much exactly like the case with async/await? I mean, you can have old (or new) code like this

function foo() {
  const await = 5
  return await
}

and it still works because await is only a keyword inside async functions. Similarly, it/val would only be keywords in the RHS of the pipeline, which is always new code, so it's not a breaking change.
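The contextual-keyword behaviour of await can be checked directly. This is a small sketch that goes through `new Function` so the snippets are parsed as script function bodies (in module code await is reserved everywhere); the variable names are only illustrative.

```javascript
// `await` is an ordinary identifier in a non-async function.
const legacy = new Function("const await = 5; return await;");

// The same declaration inside an async function is rejected at parse time.
let asyncRejected = false;
try {
  new Function("return async function () { const await = 5; };");
} catch (err) {
  asyncRejected = err instanceof SyntaxError;
}

console.log(legacy(), asyncRejected); // 5 true
```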

I can see how new keywords would be unsuitable to use with the standalone partial application proposal since that would make code like

const fooIt = foo(it, 5) // Call foo with "it" or partial application of foo?

ambiguous, but AFAIK this issue is only about the Hack proposal, where the placeholder usage is limited to the RHS of a pipeline and thus to new code only.

const fooIt = foo(it, 5) // Still calls foo with "it" as always
const bar = fooIt |> baz(it) // Same as baz(fooIt)

It would be a refactoring hazard when moving code inside a pipeline, as it would silently change meaning. Using an existing identifier is a nonstarter.
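The hazard can be modelled in today's JavaScript. The sketch assumes, purely for illustration, that `input |> body` with an `it` keyword behaves like `((it) => body)(input)`; the `pipe` and `foo` helpers are hypothetical stand-ins, not part of any proposal.

```javascript
// Hypothetical model of `input |> body`: the body sees the input as `it`.
const pipe = (input, body) => body(input);

function demo() {
  const it = "outer binding";
  const foo = (a, b) => `${a}:${b}`;

  // Outside any pipeline, `it` is the surrounding variable.
  const before = foo(it, 5);

  // The same text `foo(it, 5)` moved into a pipeline body:
  // `it` now silently refers to the topic instead.
  const after = pipe("topic value", (it) => foo(it, 5));

  return { before, after };
}

const result = demo();
console.log(result.before); // outer binding:5
console.log(result.after);  // topic value:5
```

Nothing in the moved expression changes textually, which is why the shift in meaning would be easy to miss in review.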

? has a great pedigree because of its role as the SQL placeholder. But I don't think it's worth torturing the grammar to get it in; convoluted productions will tax future enhancements.

But % isn't problem free. For example, the harmless equality expression x |> %=== 4 ends up being tokenised as [ "x", "|>", "%=", "==", "4" ]

This problem is inherent in using punctuators in the "nullary" position. JavaScript just doesn't do that. The closest we get is an empty array or an empty object. And single characters are wont to combine with other characters to form tokens.

A two-character punctuator might avoid this, if carefully chosen. My earlier suggestion of ?% is broken by x |> z??%:4. I think %? would be safe, but maybe I've missed something. Another approach might be to go for %0 - handled as a prefix unary punctuator followed by a NumericLiteral, with non-zero numbers being an early error. (And, yes, that would mean % 0 would be legal. Not the end of the world. If it really mattered to people we could add a "no whitespace here" note.)
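Both tokenisations above can be reproduced with a toy longest-match scanner. This is a rough sketch, not the spec's lexical grammar: the punctuator list is a small subset chosen for these examples, `|>` is the proposed operator, and details such as the digit lookahead after `?.` are left out.

```javascript
// A subset of JavaScript punctuators, plus the proposed `|>`.
const PUNCTUATORS = ["===", "==", "=", "%=", "%", "??", "?.", "?", ":", "|>"];

// Longest-match ("maximal munch") scan: at each position take the longest
// punctuator that fits, otherwise an identifier or a run of digits.
function tokenize(source, punctuators = PUNCTUATORS) {
  const byLength = [...punctuators].sort((a, b) => b.length - a.length);
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    if (/\s/.test(source[i])) { i += 1; continue; }
    const word = /^[A-Za-z_$][\w$]*|^\d+/.exec(source.slice(i));
    if (word) { tokens.push(word[0]); i += word[0].length; continue; }
    const punct = byLength.find((p) => source.startsWith(p, i));
    if (!punct) throw new SyntaxError(`Unexpected character at ${i}: ${source[i]}`);
    tokens.push(punct);
    i += punct.length;
  }
  return tokens;
}

// `%` as the topic: `%=` is consumed before `===` can be seen.
const percentCase = tokenize("x |> %=== 4");
console.log(percentCase); // ["x", "|>", "%=", "==", "4"]

// `?%` as the topic: even with `?%` registered as a token, `??` wins first.
const questionCase = tokenize("x |> z??%:4", [...PUNCTUATORS, "?%"]);
console.log(questionCase); // ["x", "|>", "z", "??", "%", ":", "4"]
```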

%= already being a punctuator is a good point. The lexer always longest-matches…

However, @ does not have this problem; there are no other punctuators that start with @. (# also does not have this problem, but I frown at this.#ಠ_ಠ(#), for the same reason I frown at ? ?? ??.ಠ_ಠ(), although the latter is obviously worse than the former.)

So @ (e.g., x |> @===4, this.#property(@), and @ ?? @?.ಠ_ಠ()) might be the least problematic choice. After all, it’s not like decorators would often appear nested inside expressions…right? I had forgotten that @(decoratorExpression) function f () { } was valid because DecoratorMemberExpression could be ( Expression )… This would require indefinite lookahead in order to distinguish it from @(arg) within a pipe body…

(For what it’s worth, adding a number like ?0 or %0 or #0 or @0 is pretty much what would happen anyway with the pipe-functions extension, with the 0 indicating the zeroth argument of the function. For instance, arr.sort(+> %0 - %1).)
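For readers unfamiliar with that extension, the comparator in the example can be approximated in current JavaScript. This sketch assumes that `+> %0 - %1` would denote a function whose body refers to its arguments by position; that reading is an assumption made here for illustration only.

```javascript
// Assumed reading of `+> %0 - %1`: %0 is the first argument, %1 the second.
const byDifference = (...args) => args[0] - args[1];

const sorted = [10, 2, 33, 4].sort(byDifference);
console.log(sorted); // [2, 4, 10, 33]
```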

Now that babel/babel#13191 (for topic token #) is near merging, adding an option for topic token % (my currently preferred choice) is probably next, so I’ve been thinking more seriously about the x |> %==y problem.

The simplest solution might be to add punctuator tokens %== and %=== (so that the lexer will longest-match them whenever they appear), then add productions handling them to EqualityExpression.

More specifically, these productions would also be added:

  • OtherPunctuator :: one of blah-blah-blah % %== %===
  • EqualityExpression : %== RelationalExpression
  • EqualityExpression : %=== RelationalExpression

From the developer’s perspective, x |> %==y would Just Work.
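The effect of adding the two tokens can be sketched with a toy regex-based lexer. The `makeTokenizer` helper and the punctuator lists are assumptions for illustration and cover only what these examples need; the parser would still have to accept the new tokens through the productions listed above.

```javascript
// Build a longest-match tokenizer from a punctuator list (a toy subset).
// Whitespace and unknown characters are simply skipped by `match`.
function makeTokenizer(punctuators) {
  const escape = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const alternatives = [...punctuators]
    .sort((a, b) => b.length - a.length) // longer alternatives are tried first
    .map(escape)
    .join("|");
  const pattern = new RegExp(`[A-Za-z_$][\\w$]*|\\d+|${alternatives}`, "g");
  return (source) => source.match(pattern) ?? [];
}

const today = makeTokenizer(["===", "==", "=", "%=", "%", "|>"]);
const proposed = makeTokenizer(["%===", "%==", "===", "==", "=", "%=", "%", "|>"]);

console.log(today("x |> %===y"));    // ["x", "|>", "%=", "==", "y"]
console.log(proposed("x |> %===y")); // ["x", "|>", "%===", "y"]
console.log(proposed("x %= 2"));     // ["x", "%=", "2"]
```

The last line shows that ordinary remainder assignment is unaffected in this model, since `%=` is still matched whenever the longer tokens do not fit.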


Alternatively, a syntactic annotation could be used, rather than adding new punctuator tokens. Something like:

  • EqualityExpression : %= [contiguous] = RelationalExpression
  • EqualityExpression : %= [contiguous] == RelationalExpression

The confusion issues with using ? are making me more and more loath to try and commit to it, even if we do solve whatever parsing issues exist.

I'd be fine with %; it does look like this parsing issue is a one-off and won't be a continuing spec hazard, as it's unlikely we'll grow % to do more things.

(I still lightly prefer # overall, but purely for aesthetic reasons. It does completely avoid parsing issues at the moment, but that probably won't persist as we start using # for private keys and similar things.)

@tabatkins: As far as I can tell, # is probably grammatically unambiguous with private properties and even with immutable records and tuples (#{ … } and #[ … ]), unless there’s some ASI possibility I’m missing.

#’s biggest problem is simply visual distinguishability: blah |> this.#foo(#[#, 1]) versus blah |> this.#foo(#[%, 1]). In contrast, modulo % will probably be much less common than private keys, records, and tuples.

I suppose as far as aesthetics go, I’m personally used to Clojure’s #(+ % 1) expressions, which use % as their topic token. And there’s the printf connection too.

But my preference for any particular token is similarly weak. It’s a shame that @ probably got disqualified by @(expression) decorators, which are pretty near finalization, but it’s not a huge deal either.

@js-choi Wouldn't this be ambiguous with Tuples?

const result = [1] |> #[0]
result === ??? // 1 or #[0]

@mAAdhaTTah: Oh, good catch; you are right. From what I recall tuples are undecided between #[…] and [|…|] (tc39/proposal-record-tuple#10), but if they go with the former then, yes, it is ambiguous. We need to add a check to Babel for that. % might be the clear frontrunner with this.
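The two readings of [1] |> #[0] can be spelled out in current JavaScript. This is only a model: the pipeline is written as an immediately applied arrow function, and a frozen array stands in for a tuple, which real tuples would not be.

```javascript
const input = [1];

// Reading 1: `#` is the topic, so `#[0]` indexes into the piped value.
const asTopicAccess = ((topic) => topic[0])(input);

// Reading 2: `#[0]` is a tuple literal that never mentions the topic.
// Object.freeze([0]) is a stand-in for the tuple, not real tuple semantics.
const asTupleLiteral = ((topic) => Object.freeze([0]))(input);

console.log(asTopicAccess);  // 1
console.log(asTupleLiteral); // [0]
```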