whitequark/parser

Void value expressions

whitequark opened this issue · 24 comments

Certain expressions cannot be passed as arguments. Investigate: JoshCheek/seeing_is_believing#10 (comment)

An example of code which is rejected by MRI parser:

1.times do
  puts(if true
    break
  end)
end

This code is accepted, though, despite having completely identical semantics:

1.times do
  puts(if true
    break
    1
  end)
end

I find this bit of static analysis in parse.y useless and obnoxious. However, not implementing it means that Parser will accept some invalid Ruby code (as what's valid and what is not is defined by parse.y behavior.)

Your opinion? @mbj @yorickpeterse @judofyr @bbatsov

I think it's fine to accept invalid code in these edge cases at the moment.
Most people will use parser to parse valid code anyway.

On Monday, June 17, 2013, Peter Zotov wrote:

An example of code which is rejected by MRI parser:

1.times do
puts(if true
break
end)
end

This code is accepted, though, despite having completely identical
semantics:

1.times do
puts(if true
break
1
end)
end

I find this bit of static analysis in parse.y useless and obnoxious.
However, not implementing it means that Parser will accept some invalid
Ruby code (as what's valid and what is not is defined by parse.y behavior.)

Your opinion? @mbj https://github.com/mbj @YorickPetersehttps://github.com/YorickPeterse
@judofyr https://github.com/judofyr @bbatsovhttps://github.com/bbatsov


Reply to this email directly or view it on GitHubhttps://github.com//issues/72#issuecomment-19523741
.

// Magnus Holm

I agree with @whitequark that such analysis is pointless. I'm ok if Parser does not behave as MRI's parser in such scenarios.

👍 @bbatsov @judofyr I'd also count this as a low-priority thing to be fixed later, possibly much later.

mbj commented

@whitequark I'd accept this edge case. IMHOs rubies definition of "valid" is just to unsharp here.

I personally also agree that the statistical analysis is not required on parser level, I'd even argue that that would be wrong.

The problem is that we can't claim parsing Ruby unless the behavior is identical to parse.y's... this is also a huge problem for Rubinius, for example. They have to reimplement every single idiosyncratic hack which MRI has.

mbj commented

IMHO: It does not drive ruby forward to have only a single source of idiosyncratic hacks that get copied to all other implementations. @whitequark, so just claim "parser parses ruby that makes sense". There is no need to carry bugs around.

The problem with "parser parses ruby that makes sense" is that it's basically "parses ruby which @whitequark likes". I.e. not ruby but something else, perhaps a subset thereof.

Unfortunately, driving ruby forward must be done with MRI being the first, at least in the core language, or so it seems.

Oh and I claim to support earlier versions of ruby, so I'm not free even if they drop it in 3.0 or whatever version.

That being said, I'm completely unexcited by the perspective of implementing and testing this.

mbj commented

@whitequark I understand your concerns. Lets hope this gets dropped in 3.0. Or better 2.1. Isnt a subset of ruby enough? The subset that is actually in use. I don't think someone using such void value semantics is in production.

No alternative ruby implementation I know supports all MRI behavior, and this is a good thing. I'd opt to raise on such edge cases and fix them if someone can come up with a real world scenario where this behavior is needed?

@mbj All other Ruby implementations use MRI's parser, though.

I see your point. The problem is that "raising on edge cases" will pretty much resolve this issue, as this is what MRI's parser indeed does ;) And it's some strange and unspecified behavior.

mbj commented

To be egoistic: I'm fine if parser parses the subset of ruby I'm use.

To understand the point: MRI raises where it should not?

Could we add dedicate linter that does the job?

@mbj Yes, like that, if by "should not" you mean "should not by common sense".

Yes; but writing such a linter is equivalent to just resolving this issue. Which I have no idea how to do, because this behavior is intricate and unspecified, and I just don't know a better way to write testcases than to throw various code at MRI and look how it reacts. (The part in parse.y which handles this is... erm... not very clear.)

mbj commented

@whitequark TBH I more and more feel: Ignore it till a real world bug report strikes in. Why port unspecified behavior from MRI? - This statement sounds lame from a be correct first POV. But the problem is to define correct here.

You can always warn in the README about this scenario.

@mbj Well, most of parse.y's behavior is unspecified (I think that the grammar in the ISO standard is not enough to replicate parse.y... may be very wrong though).

Yes, I think that warning in README is the way to go. Thanks!

If we want something like this to be linted I could fairly easily slap this in ruby-lint. Then again I think this use case is so rare that it might not be worth the extra effort.

@yorickpeterse It's not something you can meaningfully execute, so why lint against it?

It's something that won't trigger an error until you actually run it. When you run ruby with -c -W2 it will warn you about this, but I'm not sure how many people would have this hooked up to their editor.

Also worth mentioning is that the errors Ruby in this case spits out aren't very user friendly:

/tmp/test.rb:4: warning: (...) interpreted as grouped expression
/tmp/test.rb:4: void value expression

Yep, the messages MRI produces are often not particularly helpful and more importantly - MRI's linting is not always consistent. That's some of the reasons that made us do linting ourselves in RuboCop.

@yorickpeterse Um, no? Try this:

0.times do
  puts(if true
    break
  end)
end

@whitequark I'm not sure how that would be any different to analyse. In pseudo-ruby-lint code this would look something along the lines of the following:

class LolRuby
  def initialize
    @buffer_node_types = false
    @node_types = []
  end

  def on_send(node)
    @buffer_node_types = true
  end

  def after_send(node)
    @buffer_node_types = false
  end

  def after_if(node)
    if @buffer_node_types and @node_types == [:break]
      # add an error here...
    end

    @node_types.clear
  end

  # Currently ruby-lint doesn't have a catch-all callback method so I'd have to
  # implement that (which is trivial to do).
  def on_node(node)
    @node_types << node.type if @buffer_nodes
  end
end

Either way, I don't think this would belong in parser and as stated by others I'm fine with deviating from the Ruby "standard" here.

Here are a few more examples of void value expressions https://gist.github.com/JoshCheek/5625007