c42f/Underscores.jl

Interaction with indexing brackets

mcabbott opened this issue · 7 comments

This is very handy:

g = @_ gradient(fun(_,y,z), x)   # gradient(x -> fun(x,y,z), x)

but this was a bit of a surprise:

g = @_ gradient(fun(_,y,z), x)[1]   # (x_ -> gradient(fun(x_, y, z), x))[1]

Is it obvious that this should be treated as the outermost call? It's visually last not first, it's very unlikely that you want to call getindex with a function, and the expression :((gradient(fun(_, y, z), x))[1]) is not equal to :(getindex(gradient(fun(_, y, z), x), 1)).
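The difference between the two quoted expressions can be checked in plain Julia, without Underscores.jl loaded. A minimal sketch (the names ex_ref and ex_call are only for illustration):

```julia
# Quoting does not evaluate anything, so `gradient` and `fun` need not be defined.
ex_ref  = :(gradient(fun(_, y, z), x)[1])
ex_call = :(getindex(gradient(fun(_, y, z), x), 1))

ex_ref.head          # :ref
ex_call.head         # :call
ex_ref == ex_call    # false
```

Indexing brackets therefore reach the macro as a :ref node, not as a :call to getindex.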

One slightly awkward work-around is:

g = (@_ gradient(fun(_,y,z), x) |> __[1])
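To make the difference concrete, here are the expansions discussed above written out by hand as ordinary closures, so nothing depends on Underscores.jl. Note that toy_gradient and fun are hypothetical stand-ins invented for this sketch (a central finite difference returning a 1-tuple), not the real gradient from any AD package:

```julia
# Hypothetical stand-ins, for illustration only.
fun(x, y, z) = x * y + z
toy_gradient(f, x; h = 1e-6) = ((f(x + h) - f(x - h)) / (2h),)

x, y, z = 2.0, 3.0, 1.0

# What was intended: build the closure, call toy_gradient, then index the result.
intended = toy_gradient(x_ -> fun(x_, y, z), x)[1]

# The pipe work-around, written by hand: index in a second step.
via_pipe = toy_gradient(x_ -> fun(x_, y, z), x) |> t -> t[1]

# The surprising expansion indexes the closure itself, which has no getindex method.
surprise = try
    (x_ -> toy_gradient(fun(x_, y, z), x))[1]
catch err
    err
end
```

Here intended and via_pipe agree, while surprise holds the MethodError raised by indexing a function.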

c42f commented

Good point, thanks for opening an issue. I think this is also similar to the visual inconsistency in #6: _ is most usable when it expands inside the outermost-but-one "thing which looks like a call". But as you've pointed out getindex syntax doesn't really look like a call and in any case is unlikely to accept arbitrary closures.

So yes, I think this should potentially be fixed as part of a general cleanup where we disqualify every syntax without round brackets from acting as a boundary for _.

OK, I have fiddled a bit and this does not seem difficult to make work. Perhaps the next question is what should happen here:

@_ data |> @view gradient(fun(_,y,z), __)[1]

The obvious desire here is data |> d -> @view gradient(x -> fun(x,y,z), d)[1]. Perhaps the same rule should apply to :macrocall as to :ref -- single underscores recurse inwards, but double underscores do not. But you can also use macros with round brackets, and perhaps there are cases where you would expect something else?
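The nesting that such a rule would have to recurse through can be inspected by quoting the right-hand side of the pipe. A small sketch in plain Julia (quoting does not expand @view, and the variable names are only illustrative):

```julia
ex = :(@view gradient(fun(_, y, z), __)[1])

ex.head                 # :macrocall
inner = ex.args[end]    # the expression handed to @view
inner.head              # :ref
inner.args[1].head      # :call, i.e. gradient(fun(_, y, z), __)
```

So the single underscore sits below a :macrocall, a :ref, and a :call before reaching fun.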

c42f commented

Yes, macros are a really hard case. Ideally we'd have a CST underlying the AST so we could just detect whether round brackets were used! (I believe this will eventually be fixed by a large project to rewrite the Julia frontend, but that project is currently on hold.)

A similar problem affects operators: what should the following two mean?

@_ f(_, b) ⊗ a
@_ ⊗(f(_,b), a)

I think a syntactic rule which is specialized to round brackets makes the most sense, but at the moment we'll have to use crude heuristics to guess whether something was likely to be called with round brackets. We could say the following usually occur without brackets and are not counted as calls:

  • Operators
  • Macros?
  • Other syntax?
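The reason a heuristic is needed at all is that the parser produces the same AST for the infix and prefix spellings. The sketch below checks this and shows one possible heuristic. The helper looks_like_bracket_call is hypothetical, written only for this illustration; it is not the rule Underscores.jl implements:

```julia
ex_infix  = :(f(_, b) ⊗ a)
ex_prefix = :(⊗(f(_, b), a))
ex_infix == ex_prefix    # true: the brackets leave no trace in the Expr

# Hypothetical heuristic: count a :call as "round-bracket" only when the
# callee is not an operator symbol. Base.isoperator is not exported from Base.
looks_like_bracket_call(ex) =
    ex isa Expr && ex.head == :call &&
    !(ex.args[1] isa Symbol && Base.isoperator(ex.args[1]))

looks_like_bracket_call(ex_infix)       # false, ⊗ is an operator
looks_like_bracket_call(:(f(_, b)))     # true
looks_like_bracket_call(:(A[1]))        # false, head is :ref
```

A guess like this necessarily misclassifies ⊗(f(_,b), a) when the author really did write the brackets, which is the trade-off under discussion.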

Macros are extra hard because of the necessity to add brackets in some cases to circumvent space-sensitive parsing. In particular, a |> @a b |> c parses as a |> @a(b |> c), rather than a |> @a(b) |> c. So people would need to add brackets here, unrelated to the fact that they'd like to use _.
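The parsing behaviour described here can be confirmed by quoting both spellings. A short sketch in plain Julia (quoting does not expand the macro, so @a need not exist; the variable names are only illustrative):

```julia
unbracketed = :(a |> @a b |> c)
mac = unbracketed.args[3]    # right-hand side of the only top-level |>
mac.head                     # :macrocall
mac.args[end]                # :(b |> c), the macro swallowed the second pipe

bracketed = :(a |> @a(b) |> c)
bracketed.args[3]            # :c, the outer call is now (a |> @a(b)) |> c
bracketed.args[2].args[3].args[end]   # :b, the macro received only b
```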

Another bit of macro awkwardness is that outer(@_ inner(f(_, 2), g(_,3)), 4; kw=5) needs brackets in order not to digest subsequent arguments like 4 (and kw if you don't use a ;).

I can't think of any macros which would plausibly want a function as one of their arguments, but for __ I guess things like expr |> ex -> @capture(ex, A_[i__]) |> z could happen. That would still fit with my suggestion above. I agree that a rule with exceptions for macros is messier than a rule which goes by round brackets, but at least both @ and [ ] aren't normal symbols.

For infix operators, again I can't think of any which want a function (_), but would be tempted to make your examples errors -- recursing inwards to f(identity, b) would be confusing. But it seems fine that @_ ones(3) |> __ * 10 |> println works.

c42f commented

Another bit of macro awkwardness is that outer(@_ inner(f(_, 2), g(_,3)), 4; kw=5) needs brackets in order not to digest subsequent arguments like 4 (and kw if you don't use a ;).

Yes, totally agreed. This is one of the reasons that having @_ outside the receiving function usually works well. A long time ago @MikeInnes suggested a ' marker on the inner call as syntax for the boundary. I'm a little ambivalent about abusing the adjoint syntax for this, though as a suffix operator it has exactly the right parsing behavior and outside linear algebra it's wasted syntax. Furthermore, we could pattern match only on the specific pattern Expr(:call, Expr(Symbol("'"), ...), ...) to get:

@_ outer(inner'(f(_, 2), g(_,3)), 4; kw=5)
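A rough sketch of what matching that pattern could look like follows. The helper is_marked_call is hypothetical and written only for this illustration; it also assumes the parser represents postfix ' as Expr(Symbol("'"), ...), as in the pattern quoted above, which may not hold on every Julia version:

```julia
# Hypothetical matcher: a call whose callee is itself a postfix-' expression.
is_marked_call(ex) =
    ex isa Expr && ex.head == :call &&
    ex.args[1] isa Expr && ex.args[1].head == Symbol("'")

ex = :(outer(inner'(f(_, 2), g(_, 3)), 4; kw = 5))

is_marked_call(ex)                # false, outer carries no marker
count(is_marked_call, ex.args)    # 1, only the inner'(...) argument matches
is_marked_call(:(A' * B))         # false, ordinary adjoint in linear algebra
```

Because only the call-of-an-adjoint shape matches, expressions such as A' * B are left alone.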

For infix operators, again I can't think of any which want a function (_)

Agreed, I think this is a fairly safe rule.

but would be tempted to make your examples errors -- recursing inwards to f(identity, b) would be confusing.

I'd prefer not to make it an error because there are valid use cases, for example @_ g(_, _^2, x), that is, functions taking multiple function arguments, one of which might be the identity. The examples were perhaps too trivial; I don't mean to suggest that people would use a single _. It's just the simplest example of other expressions like

@_ f(g(h(_)), b) ⊗ a
@_ ⊗(f(g(h(_)),b), a)

That's interesting, I remember this ' suggestion but not combined with @_. Would this be a boundary for __ too?

@_ outer(inner'(f(_, 2),  __), 4; kw=5) # outer(z -> inner(x->f(x,2), z), 4; kw=5)

(This outer/inner example was something I was actually doing, not just idly inventing edge cases! But for now I gave my innermost f, g curried methods.)

About infix operators, I meant making anything with ⊗(...) outermost an error. But perhaps this is too strong, perhaps (as I commented on #6 before I saw this) making _ recurse past them is a better idea? Regard the infix form as canonical:

julia> :( @_ ⊗(f(g(h(_)),b), a) )
:(#= REPL[108]:1 =# @_ f(g(h(_)), b) ⊗ a)

Then the rule for _ is more like "outermost ordinary, prefix, round-bracket function call"?

julia> :( @inbounds(A[1:2,:]) )
:(#= REPL[114]:1 =# @inbounds A[1:2, :])

c42f commented

Then the rule for _ is more like "outermost ordinary, prefix, round-bracket function call"?

Yes I think so.

I think this issue is technically closed by #10 so I opened #12 to discuss the general rule.