eclipse-archived/ceylon

add F#-style "pipe" operator |>


Several times in the past, users (including me!) have requested an F#-style pipe operator, which accepts any type T as the left operand, and a unary function type F(T) as the right operand.

val |> fun

would be an abbreviation for:

let (_ = val) fun(_)

Thus, you could write print("hello") as "hello" |> print.

If I'm not mistaken, |> is naturally left-associative.
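
The desugaring above can be modelled outside Ceylon. The following is only a minimal sketch in Python, not the proposed implementation; the helper name `pipe` is an assumption made for illustration. It shows that a left-associative chain `x |> f |> g` amounts to `g(f(x))`.

```python
from functools import reduce


def pipe(value, *funcs):
    """Feed value through funcs left to right: pipe(x, f, g) == g(f(x))."""
    # Left-associativity: (x |> f) is evaluated first, then (... |> g).
    return reduce(lambda acc, func: func(acc), funcs, value)


# "hello" |> len |> str  corresponds to  str(len("hello"))
result = pipe("hello", len, str)
```

With no functions at all, `pipe(x)` simply returns `x`, which matches reading `|>` as plain function application.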

I'm now favorably-disposed toward this proposal, and the syntax seems viable. It looks like it can be implemented by desugaring and need not impact the backends in the initial implementation.

An open question is: is foo |> bar considered a legal statement? Can I write this:

void hello() {
    "hello" |> print;
}

I would say that we should accept this code.

Discussion on Gitter led to me proposing two additional variations of this operator, by analogy to the existing ?. and *. operators.

  • maybe ?|> fun would propagate nulls from left to right, being equivalent to if (exists _=maybe) then fun(_) else null
  • tuple *|> fun would spread a tuple over the parameters of an arbitrary-arity function, being equivalent to let (_=tuple) fun(*_)

(Note that these operators could also be easily defined in terms of |> and a higher-order function. For example, tuple *|> fun is equivalent to tuple |> unflatten(fun).)
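
As a rough illustration of the two variants, here is a sketch in Python under the assumption that `None` plays the role of Ceylon's `null`. The names `pipe_optional` and `pipe_spread` are hypothetical; `unflatten` mimics the Ceylon function of the same name mentioned in the note above.

```python
def pipe_optional(value, func):
    """Sketch of `maybe ?|> fun`: skip the call when the value is None."""
    return None if value is None else func(value)


def pipe_spread(args, func):
    """Sketch of `tuple *|> fun`: spread a tuple over func's parameters."""
    return func(*args)


def unflatten(func):
    """Turn a multi-parameter function into one that takes a single tuple."""
    return lambda args: func(*args)


missing = pipe_optional(None, str.upper)   # the null propagates, no call is made
present = pipe_optional("abc", str.upper)  # the call happens normally
spread = pipe_spread((2, 5), pow)          # same as pow(2, 5)
via_unflatten = unflatten(pow)((2, 5))     # tuple |> unflatten(fun)
```

The last two lines compute the same value, which is the equivalence the note describes: the spreading variant can be expressed with the plain pipe plus a higher-order function.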

Both of these are useful—I've wanted something like ?|> many times when writing real Ceylon code—but the objection was raised by @someth2say that they're too ascii-arty for this language. That's a reasonable objection, and we should give it some weight.

Feedback?

As usual, I'm skeptical about introducing new operators that don't exist or don't have the same syntax in other languages (or do they?), because although their meaning has a justification, they're still cryptic for newcomers.

I'm also skeptical about their value. I don't think I've ever wanted them. As a comparison I've wanted the .. operator in #4049 a lot (an order of magnitude) more often (at least in Java), which is similar except it sticks to a single return value.

I'm not opposed to them, just very skeptical, especially when foo() |> bar is much clearer as bar(foo()) IMO. Similarly for foo() |*> bar versus bar(*foo()), I don't see the point. Unless the tuple can be null and it behaves like |*?>, and then my next argument applies to it.

I do believe foo() |?> bar has value over if (exists f = foo()) then bar(f) else null, but I am afraid single-argument functions are the exception and not the norm, so if it's the second argument that can be null and should avoid the function call, it's useless.

I'm not opposed to those operators, but I'm not convinced at all yet. Just waiting to be convinced ;)

I don't have an opinion about |> yet. I've always thought about a compose operator, which I've wanted at times.

But I do think ?|> would be GREAT. I'm constantly wanting to map over optionals to avoid the if (...) then ... else null pattern.

but I am afraid single-argument functions are the exception

If I'm not mistaken, the multi-argument use would look like:

Integer? x = ...;
Integer? y = x ?|> ((x)=>times(x, 5));

@FroMage My feeling is that we would introduce |> as a first step. I don't think I would add ?|> and *|> initially, since I kinda agree with the thinking that they're potentially cryptic.

I agree that bar(foo()) is clearer than foo() |> bar, however, I don't think that reasoning holds when it's

bar(foo(baz(fee(fi(fo(fum))))))

vs

fum |> fo |> fi |> fee |> baz |> foo |> bar

I think foo() |> bar might be too simplistic to consider this operator as useful. What Gavin didn't mention is that it would be chainable. Consider the following example:

request |> parseParameters |> validateParameters |> doStuff |> writeResponse

vs

Request request = ...;

value params = parseParameters(request);
value paramsAgain = validateParameters(params);
value output = doStuff(paramsAgain);
writeResponse(output);

fum |> fo |> fi |> fee |> baz |> foo |> bar

OK fine, true that's better, but again assumes single arguments is common.

Integer? y = x ?|> ((x)=>times(x, 5));

OK but if you want to chain this more, it's really starting to smell, especially if you have a chain of heterogeneous number method arguments.

OK fine, true that's better, but again assumes single arguments is common.

Well that's the point of *|> if it wasn't clear. You could write stuff like:

[foo, bar] *|> fun |> otherFun

But sure, this is optimized for the case of one argument, that's for sure.

assumes single arguments is common

Until you start working with RxJava, Streams API, Spark, or any other functional API :)

I can't sleep thinking about how powerful this can be, and how much I dislike the |> syntax.
I've been thinking that the purpose of this is to be able to define the "calling" of a "sequence" of functions with some initial parameter(s).
This almost naturally drives me to use two constructs already known in the language: sequences and method calls.
I propose the following syntax:
{ fo ; fi ; fee ; baz ; foo ; bar }(fum)
where fo...bar are the functions, and fum is the initial value.
This syntax clearly expresses the order in which the functions are applied, is compatible with ? and * and, IMHO, is more Ceylonic than ascii-art operators.
Also, I can see many other advantages to this syntax:

  • The body can be understood as a function definition itself, so it can be directly used for functions:
    function chain => { foo ; bar ; baz }
    Yes, you can also do the same with |>, but it does not look as good to me.
  • Not only a single initial value can be used, but anything that looks like a function parameter list:
    { times ; Integer.string } (x , y)
    In fact { f1 ; ... ; fn } is a function that accepts the same parameters as f1 and returns the same type as fn.
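
The claim that the chain is itself a function, taking the parameters of the first step and returning the result of the last, can be sketched as follows. This is a Python illustration, not the proposed Ceylon syntax; `chain` is a hypothetical helper, and `operator.mul` and `str` stand in for `times` and `Integer.string`.

```python
import operator
from functools import reduce


def chain(first, *rest):
    """Build one function from a sequence of steps.

    The result accepts whatever `first` accepts and returns whatever the
    last step returns; every later step takes exactly one argument.
    """
    def chained(*args):
        return reduce(lambda acc, func: func(acc), rest, first(*args))
    return chained


# Roughly { times ; Integer.string } (x, y)
times_then_string = chain(operator.mul, str)
product_text = times_then_string(3, 4)
```

Only the first step may take several arguments here; that restriction is part of this sketch, not something the comment above specifies.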

Thoughts?

Now that I think a bit more about this, I find that the ? prefix does not fit smoothly with multiple parameters. { ?foo; bar } (x, y)

? usually means "if the input exists, then use it for the following". But when multiple parameters can be used, the input is a collection of parameters, not just a single one. And this collection will always exist (that is, it can never be null).
In edge cases, the parameter collection may be empty, contain a single null parameter, or contain many null parameters. None of those cases really matches the current ? semantics.

Anyway, as we are proposing the syntax, we can also adapt the semantics to our needs.
Some options are:

  1. Always refuse ? on the first function. I don't really like it, but it is feasible.
  2. Accept ? on the first function iff it accepts only a single nullable parameter. ? will then check only that parameter.
  3. Accept ? on the first function, and redefine it to check non-nullness for all parameters.
    I slightly prefer 3), but will accept 2).

I'd like to remind people that a compose-like operator, as @jvasileff mentioned, would give 99% of the expressive power with minimal language changes - there are several operators you can't use on functions waiting for reuse.

I also achieved a compose function that composes functions of a single type, and its reverse case, quite trivially

"Chain homogeneous functions in order last to first called (or outer to inner calls)."
shared X(X) chain<X>(X(X)+ functions)
        => functions.reduce(compose<X, X, [X]>);

If some type wizards could extend it to accept X(Y), this issue would boil down to an SDK call. But I doubt that's feasible, or only as compose2, compose3, compose4... which I find bad enough to warrant an operator. I think * or + would be fine:

function chain => foo + bar + baz;
(foo + bar + baz)(arg);

If I'm not mistaken, spreading would happen naturally.

It would be more effective to have monadic for - than to have the pipe operator (which is only for single argument functions). If at all, it is necessary to have a right associative version of concatenation for the creation of immutable data structures. But I know that you will not accept this proposal, because it make a whole bunch of addition to ceylon necessary or at least desirable: Implicits, no constraint of e.g. Summable interface (chain different types), right associativity, ascii art (there is no justification against ascii art, e.g.why is there a fixed set of operators for any class??), operator precedence definition (I can really understand than you dont like that - it makes a language a horror) ... have fun - but you will not have, and for some reason it is good so.

@simonthum the source of the discomfort with using + or * to represent function composition is that functions don't form a semigroup. OTOH, as you've observed, functions of matching input and output type do form a semigroup (a monoid, even), so the notation would be reasonable in that case. But note that the pipe |> doesn't demand that input and output types be the same, which makes it a lot more generally-useful for representing a sequence of data transformations.

Hi, @welopino

the pipe operator (which is only for single argument functions)

Well, that's not quite right: the proposal I've presented above is not just for single-parameter functions. Though, of course, it's most natural in that case.

But I know that you will not accept this proposal, because it make a whole bunch of addition to ceylon necessary or at least desirable:

No, that's not right at all. Ceylon already has higher-order generics, so the type Category (and even Functor/Monad) can be easily represented in our type system. But it doesn't seem to me that that's necessary in order to solve the problem that we're looking at here. It looks like overkill, frankly.

Implicits

I don't see what "implicits" have to do with this at all. The problem of abstracting over the notion of "composition" at a high level is a job for higher-order generics, not for implicit type conversions.

no constraint of e.g. Summable interface (chain different types)

What you're trying to describe here is the higher-order generic type Category, it seems to me. We could certainly add such a type, but I don't see how that really solves the immediate problem.

ascii art (there is no justification against ascii art, e.g.why is there a fixed set of operators for any class??), operator precedence definition (I can really understand than you dont like that - it makes a language a horror) ... have fun - but you will not have, and for some reason it is good so.

Well you went all ranty at the end here, and I can't tell what this has to do with the issue we're discussing.

There has been some discussion between @someth2say, @luolong, and myself on the gitter channel. Both @simonthum and @someth2say have been pushing in the direction of a syntax for function composition, instead of the F#-style "piping" at the value level.

It seems to me, however, that |> can do both. It looks like |> is a perfectly well-defined associative binary operator meaning:

  • application, if the LHS is of type X and the RHS is of type Y(X), and
  • composition, if the LHS is of type Y(X) and the RHS is of type Z(Y).

I can't seem to be able to construct any case where this would be ambiguous and/or non-associative right now. Perhaps I'm being dense, and there's something obvious that I'm missing.

there's something obvious that I'm missing

Well, OK, there's this one:

  • LHS: Object(Object)
  • RHS: Object(Object)

That case is indeed ambiguous :-/
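
To make the ambiguity concrete, here is a small Python sketch in which both operands accept any object, mirroring the Object(Object) case. The functions `shout` and `describe` and the helpers `apply` and `compose` are invented for illustration. Both readings of `shout |> describe` are well-formed, and they produce different things.

```python
def shout(obj):
    """Accepts any object, like a Ceylon Object(Object)."""
    return str(obj).upper()


def describe(obj):
    """Also accepts any object, including a function."""
    return f"<{type(obj).__name__}>"


def apply(lhs, rhs):
    """Reading 1: application, i.e. rhs(lhs)."""
    return rhs(lhs)


def compose(lhs, rhs):
    """Reading 2: composition, i.e. x -> rhs(lhs(x))."""
    return lambda x: rhs(lhs(x))


as_application = apply(shout, describe)    # describe receives the function itself
as_composition = compose(shout, describe)  # a new function, still waiting for input
```

Under the first reading the result is a value; under the second it is a function. A typechecker would need some rule to pick one whenever the right operand can accept the left operand as an argument.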

Well in fact, anything that

  • LHS: Callable(A,B)
  • RHS: Callable(X,Y) given Y satisfies Callable(A,B)
    In other words, if RHS accepts as Y a parameter, and LHS satisfies Y, we have an ambiguity.

I can see several approaches for solving the ambiguity:

  • Use the ( X |> Y |> Z)(param) syntax (yes, I know I am being boring). This implies |> will never mean application, just always composition.
  • Use the fact that 1) this can only happen in the first position of the |> chain, and 2) the LHS must satisfy Callable, to detect the ambiguity.
    If this situation is found, then evaluate |> as one of them (i.e. application).
    If this happens, then just force composition (this may be tricky for the typechecker)
  • Disallow using a Callable in the first position of the |> chain.
    This will disallow things like function comp(Object param) => param |> X |> Y |> Z, but we can live with it.

In other words, if RHS accepts as Y a parameter, and LHS satisfies Y, we have an ambiguity.

Well, yes, of course. And can you think of any type other than Object or Anything that would satisfy that? I can't.

... |> print does seem useful though.

A couple other interesting examples:

class X() satisfies X(X) {}
Float f(X g(X x)) => 1.0;
Float | Float(X) whatIsIt = X() |> f;

(similar for class X() satisfies X() {})

and this, which is an extension of the Object(Object) example:

Anything(String)(String) a => nothing;
String(Anything(String)) b => nothing;
String | String(String) whatIsIt = a |> b;

Interesting.
First example:
Assuming you can satisfy Callable (currently you can't), you are in the ambiguity previously described.
You can't decide if X (the Callable) or X (the parameter) will be the parameter for f.

Second example:
Can be rewritten as:

alias SToA => Anything(String);
SToA a(String str) => nothing;
String b(SToA stoa) => nothing;

So in this example, desugaring the |> operator, you have (allow me to abuse <=>)

a |> b <=> b(a) <=> String(SToA(String)) <=> String(String)

So whatIsIt is a String(String).
Quoting myself:
( f1 |> ... |> fn)

is a function that accepts the same parameters as f1 and returns the same type as fn.

Agreed, I've been a bit of a cheater here, creating the SToA alias to avoid currying a, but this way is clearer, IMHO.

xkr47 commented

just gotta say I don't especially like |> .. it would be nice to use unicode characters....

Unicode characters are notoriously difficult to type...

@xkr47 I like unicode too, but it seems time for exclusive non-ASCII tokens in a programming language has not yet arrived. However, non-ASCII tokens could be alternate forms of ASCII ones: for example, Haskell has a compiler extension (and a source code directive) for that.

I actually proposed support for ∨ ∧ ∀ ∃ ∈ ≤ ≠ etc. a while ago ;) I wasn’t entirely joking, but sadly nothing came of it.

Oho, it seems I have seen that once!

I wonder what other “sourcewise” compiler directives could be made—if there aren’t, this approach for unicode styling would likely be forgotten a second time. 🤔

Rly? Are we goin brainfuck now?
I should say I like how it looks like, but please, not for Ceylon!

Btw, I made a headfish alternative, in case you find it useful: https://github.com/someth2say/ceylonChain

xkr47 commented

@someth2say WDYM, isn't that ascii-only?

xkr47 commented

.. but nice implementation!

xkr47 commented

I'm getting a bit annoyed by the three-letter operators.. ?|>

I think they are only borderline ok at being rememberable.

I mean I sometimes have problems remembering regexp operators like (?<= (?! (?<! etc, particularly in which order those characters were supposed to be again.

Have you considered using some keyword instead like... given for example?

I kind of agree, but just partially.

I do hate ascii-art operators like this one. But, as commented on Gitter, the |> operator family has the advantage of representing both a pipe (as in a shell) and a direction or flow. Those are really significant.

On the other hand, three-character operators are genuinely hard to remember. Even in this very thread you can find both ?|> and |?>! That's actually a big con, IMHO.

You proposed using a keyword like given. Apart from the fact that this one is already taken (for describing type parameters), how do you propose to use it?

Finally, in my test implementation, I proposed the andThen keyword for chaining chainingCallables. To my surprise, Java 8 already uses that same name for a very similar construct: chaining functions.
https://docs.oracle.com/javase/8/docs/api/java/util/function/Function.html
So if this construct already exists, maybe building something alike will facilitate adoption.

xkr47 commented

well something like

// instead of if (exists v = func1(val)) then func3(*func2(v)) else null
val given func1 ?given func2 *given func3
val in func1 ?in func2 *in func3

For |> I would similarly use the ? and * as prefixes since that's how they are in ?. and *. too, and also it keeps the base operator unchanged.

Well, I've proven not to be good at proposing syntaxes, but that one feels hard to read to me.
Also, as discussed with Gavin, this approach can lead to ambiguities when the initial value is also a Callable.
But maybe something in the middle can be used:

val appliedTo func1 ?andThen func2 *andThen func3 ;

That is, using appliedTo to identify initial argument, and andThen family operators for chaining.

But to be honest, now that I see it written down, it does not look ceylonic at all...

Just a general comment; whether you go for composition or pipe-like application, please don't address them in one syntax. Ambiguous or not, I would like to "see what is happening". Having one syntax for composition and pipelining is countering that ability. I regard that ability as a comparative strength of ceylon.

I could go on arguing but people smarter than me already did that, so I'll just drop this bit from ceylon's home page ;)

PREDICTABLE

Ceylon controls complexity with clarity. The language eschews
magical implicit features with ambiguous corner cases. The
compiler follows simple, intuitive rules and produces
meaningful errors.

<advertising:on>
I just pushed to herd a tiny module for implementing this functionality: https://herd.ceylon-lang.org/modules/herd.chain
Instead of using ascii-art operators, I'm using methods in chaining step objects.
Syntax may be not as clear as the ascii-art one, but pretty ceylonic IMHO:

chain(parseParameters).\ithen(validateParameters).\ithen(doStuff).\ithen(writeResponse).with(request);
    // request |> parseParameters |> validateParameters |> doStuff |> writeResponse

chain(iCanReturnNull).thenOptionally(doNotAcceptNull).with(initial);
    // initial |> iCanReturnNull |?> doNotAcceptNull

chain(whatever).thenSpreadable(iReturnATuple).spreadTo(iAcceptManyParams).with(initial)
   // initial |> whatever |> iReturnATuple |*> iAcceptManyParams

More details and usage in README.md
Comments are welcome.
<advertising:off>

\o/

Hi there :)

This operator is one of the most used ones in F# and Elixir for a reason: It's super useful for composing functions as shown, and it makes the code very readable.

I can recommend Fira Code for all who are skeptical.

Thanks a lot for this fantastic language ^-^

@xkr47 ah, thanks, that's indeed an important development! If JS adds this operator, then that's a rather strong argument in favor of adding it to Ceylon, since it will be much more widely known and understood.

I'm also very interested to see the variations they're considering with partial application, something we didn't consider in the discussion above.

@xkr47 cool.

I'm also very interested to see the variations they're considering with partial application, something we didn't consider in the discussion above.

Interesting... I need to find some time for trying to add that idea to my library!

xkr47 commented

I'm not sure in which state the proposal is; I guess it can still change..

To me, there is very little difference between the usual method call without |> and this chain method: https://github.com/someth2say/ceylonChain

I am a visual type, and the fsharp code in your readme shows clearly, how the code flows and is very concise. 🙂

Is there a chance, to get it like this?

@ShalokShalom thanks for checking the library 😄
For sure, the F# approach is much clearer and more concise (even though I am against ascii-art in code).

Including the fish-head operator (|>) in Ceylon implies changes to the grammar and in other places that, honestly, are beyond my capabilities 😧 If this proposal ever gets accepted, for sure @gavinking or other team members will be able to do it.

ceylonChain is just a module exploiting the capabilities of Ceylon's type system to emulate |>.
The good point is that ceylonChain is not tied to ascii-art tokens for expressing |> and its basic variants (|?> for optional and |*> for spreading), and it can also introduce other variants that are hard to express with ascii-art: iterable and teeing chaining (and maybe others in the near future, like partial application chaining).

To be honest, I can live with a keyword; it's more the ordering that I adore.

Is it possible to simply replace |> with a keyword like 'pipe' and do everything else the way F# does?

In case of using a keyword, I’d suggest on: f on x is natural, and f1 on f2 on x is not so strange.

Oh, I forgot |> is not $. Then to?

I vote for to

So, folks, this issue seems to have come back to life, so let me clear up why it was something I never ended up implementing back in 2016. Basically,

  1. We wound up with enough questions about the scope and semantics of this that I didn't think it felt sufficiently "ripe":
    • were we aiming to include ?|>, *|>?
    • was function composition going to be pulled into it?
    • is partial application part of it?
  2. @FroMage just wasn't very enthusiastic about the whole thing.

Now, it seems to me that this is a feature that's increasingly popular in other languages (apart from being something I've always liked) and it even looks like it's going to be part of JavaScript. So I think it's something we could safely move forward with. And I think we should stick with the "consensus" syntax of the other languages we've seen, i.e. |>.

Looking over what's being defined for JavaScript, I would say that a first cut of this could have a very similar scope, that is:

  • the basic pipe operator |>, and
  • partial application,

excluding function composition and the ascii-arty ?|>, *|> for now.

Now, what's interesting about the above two included features is that, if I'm not mistaken, they can be defined as completely orthogonal features. We would introduce a syntax for partial application, something like this:

Integer.format(?,16)

which would desugar to:

(_) => Integer.format(_,16)

These partially applied functions would be usable anywhere, not only in "pipelines".

Similarly, the pipeline operator would be defined as above, namely:

byte |> Integer.format(?,16) |> print

means:

let (_1 = let (_2 = byte) Integer.format(?,16)(_2)) print(_1)

All this should be easy enough to implement via desugaring within the typechecker.
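
The orthogonality of the two features can be sketched in Python. Everything named here is an assumption made for illustration: `HOLE` stands in for the wildcard, `with_hole` for the partial-application desugaring (restricted in this sketch to exactly one wildcard), and the built-in `format(n, "x")` plays the role of `Integer.format(n, 16)`.

```python
from functools import reduce

HOLE = object()  # hypothetical stand-in for the wildcard in Integer.format(?,16)


def with_hole(func, *args):
    """Return a one-argument function that fills the single HOLE in args."""
    holes = [i for i, arg in enumerate(args) if arg is HOLE]
    if len(holes) != 1:
        raise ValueError("expected exactly one HOLE")
    index = holes[0]

    def applied(value):
        return func(*args[:index], value, *args[index + 1:])
    return applied


def pipe(value, *funcs):
    """Plain pipeline: pipe(x, f, g) == g(f(x))."""
    return reduce(lambda acc, func: func(acc), funcs, value)


# Usable on its own, outside any pipeline ...
to_hex = with_hole(format, HOLE, "x")
# ... and as one step of a pipeline, roughly: 255 |> format(?, "x") |> upper
hex_text = pipe(255, to_hex, str.upper)
```

Because `with_hole` returns an ordinary function, the pipeline helper needs no special knowledge of the wildcard.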

BIG QUESTION: I'm not very happy with ? as the "wildcard character" for partial application, since in Ceylon, ? always has something to do with optional types. Nor do I like *, since that always has something to do with multiplicity. So what alternatives do we have?

  • byte |> Integer.format(~,16) |> print
  • byte |> Integer.format(.,16) |> print
  • byte |> Integer.format(^,16) |> print
  • byte |> Integer.format(#,16) |> print
  • byte |> Integer.format($,16) |> print
  • byte |> Integer.format(&,16) |> print
  • byte |> Integer.format(@,16) |> print
  • byte |> Integer.format(%,16) |> print
  • byte |> Integer.format((),16) |> print
  • byte |> Integer.format(<>,16) |> print
  • byte |> Integer.format(!,16) |> print
  • byte |> Integer.format(:,16) |> print
  • byte |> Integer.format(...,16) |> print
  • byte |> Integer.format(..,16) |> print

Anything else?

Note: of the above listed options, I think I like the twiddle, the dot, and the caret best. And maybe the dollar sign.

Actually no, I think I like ., ^, or () most.

I could live with %,$, or #.

At first, I liked the (), as it somehow represents a "parameter list", but now that I see it, it feels weird having so many parens...
I already saw the # token in the JS proposal, and should admit it was pretty intuitive for me, so # has my vote.

I can live with ^, but not with . (too visually close to ,).

Another thing to keep in mind is multiple partial application parameters. My first idea is appending a positional index to the token: (#1,#2) -> Integer.parse(#1,16) + #2, where #1 is equivalent to just #.

Comments?

@someth2say I'm definitely not proposing to allow arbitrary expressions to be treated as functions, via some "magic" identifiers (your #1, #2, etc). What I'm proposing is at most, partial function application with exactly one wildcard.

I suppose one thing that would make |> more convenient would be type inference for anonymous functions occurring as RHSs, for example:

void fun()
        => "hello world"
        |> String.size
        |> ((Integer _) => Integer.format(_, 16)) 
        |> print;  

Could be written as:

void fun()
        => "hello world"
        |> String.size
        |> ((_) => Integer.format(_, 16)) 
        |> print;

I guess that would alleviate some, but not all, of the pressure for partial application.

Ooooooh, wait, remember #7190? That lets us use => to trigger an "implied" parameter. So what about:

void fun()
        => "hello world"
        |> String.size
        |> (=> Integer.format(_, 16))
        |> print;

where _ is an implied anonymous function parameter of type Integer.

No need for any special partial application syntax!

(I realized the connection, when I noticed that the whole #/./()/^ discussion was really a debate over the right name for our old friend it!)

Just throwing it out there, but scanning the above it looks like I could collapse |> (=>....) into a single operator:

void fun()
        => "hello world"
        |> String.size
        |=> Integer.format(_, 16)
        |> print;

Interesting.

Mmm.. I don't think I like the idea of mixing the pipe (flowing results to parameters) with the fat arrow (function definition).
They are concepts too separate to share a single token.

But in case this syntax stays, I have a question:
Previously you, @gavinking, presented the pipe alternatives as *|> and ?|>, while I saw them as |*> and |?> (which are actually closer to |=>). Is there any bibliography or discussion in favour of either of those syntaxes?

Also dropping another idea here, it would be great using pipes not only to apply functions to values, but also to compose functions into a single one.

That is, simplifying this:

void fun(String s) 
        => s
        |> String.size
        |> (=> Integer.format(_, 16))
        |> print;

to something like this:

void fun =>
        |> String.size
        |> (=> Integer.format(_, 16))
        |> print;
...
fun("hello");

or even in assignments:

value fun =
        |> String.size
        |> (=> Integer.format(_, 16))
        |> print;

Still not convinced about the syntax, but I hope you get the idea.

I appreciate the |> and think that collapsing |> (=>....) into a single |=> improves the readability significantly and is also nice to look at.

xkr47 commented

Would it not be possible to just use |> instead of the |=> syntax with the same semantics IFF an otherwise undefined _ is used in the expression?

xkr47 commented

(I do like the _ character)

xkr47 commented

@someth2say Another way would perhaps be

void fun(String s) 
        |> String.size
        |> (=> Integer.format(_, 16))
        |> print;

but I do recognize it might be ambiguous and maybe not so clear either?

@xkr47 not completely against it, but I do like the idea that everything between the first |> and the ending ; is kind of a "function definition", the same way method references or lambdas are.
Using this parallelism, the following are equivalent:

void fun(String) => print;
void fun(String) => |> print;

But the following should both be rejected, as they are missing the required =>:

void fun(String) print;
void fun(String) |> print;

xkr47 commented

@someth2say Your first code block seems broken on both lines, some typo?

@xkr47

Would it not be possible to just use |> instead of the |=> syntax with the same semantics IFF an otherwise undefined _ is used in the expression?

The problem with that, which we discussed in #7190 was that implicit functions implied by use of it don't play well with nesting. It was the main reason we didn't go down the path of it in #7190.

@someth2say

Also dropping another idea here, it would be great using pipes not only to apply functions to values, but also to compose functions into a single one.

Well as I noted above there are a limited number of cases where that would be ambiguous.

woops! fixed ;) @xkr47

xkr47 commented

@someth2say Doesn't void fun(String) => print; return a reference to the print function? 🤔

Feedback: I like _ as a placeholder, but I still hope _ would someday be treated as a “drop it out” argument name for anonymous functions or pattern parts (as in Haskell). Also I think |=> combination is a bit ASCII-arty.

_ is also used as a wildcard in F# and OCaml. This would make it more comfortable for their users to switch to Ceylon. Elm and Elchemy also use it that way, and I guess others do too.

I realized I f***ed up with my previous example. In order to be compilable, I should have written

void fun(String o) => print(o); 
Anything(String) fun = (String o) => print(o); 
Anything(String) fun = print;

And their proposed "pipe" form:

void fun(String o) => o |> print 
Anything(String) fun = o |> print(o); 
Anything(String) fun = |> print

Removing the => or the = in any of them should be an error. That's why I advise against @xkr47's proposal here.

@xkr47

Doesn't void fun(String) => print; return a reference to the print function?

Assuming the right syntax 'void fun(String o) => print(o);', there is no return here, but a definition of the function fun.
But actually the other syntax, Anything(String) fun = print;, does.

@gavinking

Well as I noted above there are a limited number of cases where that would be ambiguous.

If I recall correctly, ambiguity arises when LHS for |> is a function and RHS is a function accepting LHS-like functions as parameters, isn't it?

Say we have:

B( A ) F1 = ...
C( B|B(A) ) F2 = ...

Then, the ambiguity is choosing if F1 |> F2 means either:
A) A new function concatenating application of both functions. Say: (A a)=>let (b=F1(a)) F2(b);
Resulting type is C(A)
B) Applying the value F1 to F2. Say F2(F1)
Resulting type is C

I can guess two ways to disambiguate:
1 - By grammar, always prioritizing one meaning against another. I think B) is a bit clearer, but can live with A)
2 - Using a different token for "pipe invocation" (meaning applying the initial value to a pipe-shaped function). i.e. >>
So F1 |> F2 is a definition (case A) and F1 >> F2 is a pipe invocation (case B).

Thoughts?

@someth2say

Using a different token for "pipe invocation" (meaning applying the initial value to a pipe-shaped function).

Right, I think it's clear that the best solution would be to choose a different operator. To me |> doesn't scream "function composition" anyway.

And in that case, "infix operator for function composition" is a completely separate issue to this one.

@someth2say

Anything(String) fun = o |> print(o);
Anything(String) fun = |> print;

I don't understand the point of either of these examples. They are both equivalent to:

Anything(String) fun = print;
Anything(String) fun = print;

which is shorter and less confusing.

So coming back to my musings from the other night, after letting this percolate a bit:

  1. I wasn't really serious about |=>. I don't think syntax like that really fits into this language.
  2. I'm definitely against any sort of magic it (or #, ., ^, ~, (), or whatever you want to call it) that converts an arbitrary expression into a function, for reasons already raised in the discussion on #7190.
  3. I don't even think I like extending the eventual proposed solution to #7190 to this context, but perhaps I might change my mind on that.
  4. I still think it might be OK to have some sort of it that allows us to partially apply a function of multiple arguments and obtain a function with one argument.
  5. I definitely think it would be totally reasonable and consistent to have anonymous function parameter inference work for pipeline application of functions.

Therefore, our silly toy example would look like this:

void fun()
        => "hello world"
        |> String.size
        |> ((len) => Integer.format(len, 16)) 
        |> print;

which is ultimately not so bad, more parens than I would like, but totally readable and understandable.

Or, with partial function application, like this:

void fun()
        => "hello world"
        |> String.size
        |> Integer.format(it, 16)
        |> print;

where it would be a new keyword. (But probably, for back compatibility, a "soft" rather than "hard" keyword, that is, a magic identifier).
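
For comparison, this is roughly what the `it` placeholder would stand for, modelled in Python rather than Ceylon. `format_int` is a hypothetical stand-in for `Integer.format`, written out here only so the sketch is self-contained; the point is that fixing one argument of a two-argument function leaves a one-argument function.

```python
from functools import partial


def format_int(value: int, radix: int) -> str:
    """Hypothetical stand-in for Integer.format(value, radix), radix 2..36."""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    n = abs(value)
    out = ""
    while n:
        n, remainder = divmod(n, radix)
        out = digits[remainder] + out
    return sign + out


# Integer.format(it, 16) would denote a one-argument function with the
# radix already fixed; functools.partial expresses the same idea.
to_hex = partial(format_int, radix=16)

# The explicit anonymous-function spelling of the same thing.
to_hex_lambda = lambda it: format_int(it, 16)
```

Both `to_hex` and `to_hex_lambda` can then be dropped into a pipeline wherever a unary function is expected.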

However, I think it's reasonable to break out the question of partial function application into a separate issue.

I visited a couple of functional programming meetups recently, and it turns out that Fira Code is quite popular at the moment. So I suggest that once you introduce a new operator, if it isn't defined there yet, someone either integrates it directly or opens an issue about it.

Either way, this is an operator which is quite common in functional languages, so I suggest that you point out the differences specifically in the documentation, like you do with classes and so on. 🙂

However, I think it's reasonable to break out the question of partial function application into a separate issue.

See #7351.

@gavinking

And in that case, "infix operator for function composition" is a completely separate issue to this one.

Agree. I'll open

I don't understand the point of either of these examples.

They are not exactly equivalent (although they are semantically).

Anything(String) fun = o |> print(o);   equivalent to    Anything(String) fun = (String o) => print(o); 
Anything(String) fun = |> print;        equivalent to    Anything(String) fun = print;

Agree the latter is less confusing, and hence my proposal.

However, I think it's reasonable to break out the question of partial function application into a separate issue.

Yes, please

Agree. I'll open

@someth2say nonono, there is already an issue: #3229.

@gavinking woops! Thanks, I'll use that.

@gavinking:

I still think it might be OK to have some sort of it that allows us to partially apply a function of multiple arguments and obtain a function with one argument.

I think it is a wonderful idea! But we should perhaps keep it as a separate orthogonal syntactic concern. One that could be re-used in context of pipe operators (among others)

It would be nice to be able to write function aliases like this:

value fromHex = Integer.parse(_,16);

But we should perhaps keep it as a separate orthogonal syntactic concern. One that could be re-used in context of pipe operators (among others)

Right, of course, that's why I opened #7351 as a separate orthogonal issue.

Hrm, so I suppose it would be possible to get rid of the parens around anonymous functions, and write

void fun()
        => "hello world"
        |> String.size
        |> (len) => Integer.format(len, 16)
        |> print;

If we parsed anonymous function bodies with a slightly higher precedence (higher than assignments and |>). Not sure if that’s a good idea, however. It means that => would no longer have the same precedence everywhere in the language. In particular, it means that anonymous function bodies would have a different grammar from regular functions. You couldn't write: do((x) => y = x). (Though of course you could still write do((x) { y = x; }) which is almost the same number of characters and arguably clearer.)

If we parsed anonymous function bodies with a slightly higher precedence (higher than assignments and |>). Not sure if that’s a good idea, however.

Well, actually, I've managed to do a bit better than that. I've hacked it so that |> has a lower precedence than almost everything, including assignments and anonymous functions. That means that the only "weirdness" is that you can use |> in the body of a regular function or value declaration or in a specification statement, but not in an anonymous function or an assignment expression (unless you wrap it in parens).

That actually feels pretty reasonable to me....

This is now working well enough that you folks can try it out, if you like. For example, the following code:

shared void run() {
    "hello world how are you today, I'm doing great !!!! xxxxxyyy"
            |> String.size
            |> (Integer i) => Integer.format(i, 16) 
            |> String.uppercased 
            |> String.trimmed 
            |> print;
}

prints 3C with both ceylon run and ceylon run-js.
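
As a cross-check of that output, here is the same chain modelled in Python rather than Ceylon. `pipe` is a hypothetical helper standing in for a left-to-right chain of |>; the string is 60 characters long, and 60 is 3C in hexadecimal.

```python
from functools import reduce


def pipe(value, *functions):
    """Feed value through each function left to right, like a chain of |>."""
    return reduce(lambda acc, fn: fn(acc), functions, value)


text = "hello world how are you today, I'm doing great !!!! xxxxxyyy"

result = pipe(
    text,
    len,                        # String.size
    lambda i: format(i, "x"),   # Integer.format(i, 16)
    str.upper,                  # String.uppercased
    str.strip,                  # String.trimmed
)
print(result)
```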

So I'm still not completely finished with this. The following issues remain:

  1. While it's certainly convenient to be able to write
    ... |> (Integer i) => Integer.format(i, 16) |> ... 
    the truth is that the resulting grammar isn't completely "clean", not in a way that you'll really notice as a user of the language, but definitely in a way that will bother me when I have to write it down in the spec. I'm inclined to think it's probably better to have a clean grammar, at the cost of having to leave in the parentheses in
    ... |> ((Integer i) => Integer.format(i, 16)) |> ...
    So I'm thinking of rolling back some of the work I did yesterday.
  2. I want type inference for anonymous function parameters in pipelines. What that boils down to is adding parameter type inference for immediately-invoked anonymous functions, stuff like
    ((i) => Integer.format(i, 16))(100)
    We've never needed that before because there was never a good reason to define an anonymous function and immediately call it. Now, sure, #7353 gets us part of the way there, but it definitely doesn't work in all cases. Fortunately this looks pretty easy to implement (though perhaps if I'm going to do it, I should bite off #7058 at the same time).
xkr47 commented

the truth is that the resulting grammar isn't completely "clean"

Yeah despite the cleaner look, it only feels natural when you split the |> on different lines. If you one-linify the whole statement then it feels strange that the following |> would not be part of the preceding (Integer i) => Integer.format(i, 16) function.

xkr47 commented

And of course the normal-precedence version would be auto-indented like this:

shared void run() {
    "hello world how are you today, I'm doing great !!!! xxxxxyyy"
            |> String.size
            |> (Integer i) => Integer.format(i, 16) 
                |> String.uppercased 
                |> String.trimmed 
                |> print;
}

.. revealing the precedence thought error
.. while funnily enough still producing the exact same end result 😁

while funnily enough still producing the exact same end result

ASSOCIATIVITY FTW!!!!11!1
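
A small Python model of why both groupings give the same answer (not Ceylon; `pipe` is a hypothetical helper standing in for a chain of |>). In the nested version the anonymous function's body swallows the rest of the chain, which is what normal => precedence would do.

```python
def pipe(value, *functions):
    """Apply each function in turn, left to right."""
    for fn in functions:
        value = fn(value)
    return value


text = "hello world"

# Low-precedence |>: every stage is a separate step of the outer chain.
flat = pipe(text, len, lambda i: format(i, "x"), str.upper, str.strip)

# Normal => precedence: the remaining stages end up inside the lambda body.
nested = pipe(
    text,
    len,
    lambda i: pipe(format(i, "x"), str.upper, str.strip),
)
```

The two only differ in where the intermediate values live, so the choice is about readability and grammar rather than about results.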

So I'm thinking of rolling back some of the work I did yesterday.

Done. And I threw in an impl of #3229, the >|> operator.

I want type inference for anonymous function parameters in pipelines.

Still TODO.

I've now also added <|< and <|, but what direction should they associate in?

  • for <|< it doesn't matter, since function composition is truly associative, but
  • for <| it does:
    • if it's right-associative, it's just the same as |>, but lets you write your chain in the opposite direction, value-last (not really very useful, IMO)

    • if it's left-associative, it has a completely different usecase: sending multiple values to a consuming function, for example:

       write <| "hello" <| "world";
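
The two associativity options can be modelled in Python like this (a sketch, not Ceylon). `back_pipe_right`, `back_pipe_left` and the self-returning `write` are all assumptions made for illustration; in particular, for the left-associative reading I'm assuming the consumer returns another function that accepts the next value.

```python
def back_pipe_right(*operands):
    """Right-associative: f <| g <| x  ==  f <| (g <| x)  ==  f(g(x))."""
    *functions, value = operands
    for fn in reversed(functions):
        value = fn(value)
    return value


def back_pipe_left(consumer, *values):
    """Left-associative: (f <| a) <| b  ==  f(a)(b)."""
    for v in values:
        consumer = consumer(v)
    return consumer


written = []


def write(item):
    # Hypothetical consumer: records the item and returns itself so that
    # it can accept the next value in the chain.
    written.append(item)
    return write


# write <| "hello" <| "world", read left-associatively
back_pipe_left(write, "hello", "world")

# upper <| strip <| "  hi ", read right-associatively
shouted = back_pipe_right(str.upper, str.strip, "  hi ")
```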
      

Parameter type inference is done for |>. But I should also add it for >|>.

I've merged this work to master. Still need to mention |> in the spec.

Added to spec. Closing this issue, but please reopen if you run into any serious problem with this work.

Ceylon used to have something like that, in the ancient past, where a b meant a.b, and a b c meant a.b(c), called “operator-style expressions”. We killed it ages ago (8c713a1), because it was a syntactical nightmare.

xkr47 commented

yeah.. interesting thread.. this would have aliased a bit differently but I guess same/similar problems would arise.