Insufficient parenthesizing
akimd opened this issue ยท 13 comments
This issue is very very close to #131, and it is very serious problem in my production (it turns out, contrary to what I said in #131, that we do depend on unparser on such code).
The test case is very similar, but different: return true ? 1 : 2
is pretty-printed as
return if true
1
else
2
end
which is again wrong, since in that case return if true
is a postfix condition.
$ unparser -e 'return true ? 1 : 2' --verbose
warning: parser/current is loading parser/ruby27, which recognizes
warning: 2.7.2-compliant syntax, but you are running 2.7.1.
warning: please see https://github.com/whitequark/parser#compatibility-with-ruby-mri.
(string):3:1: error: unexpected token kELSE
(string):3: else
(string):3: ^~~~
(string)
Original-Source:
return true ? 1 : 2
Generated-Source:
return if true
1
else
2
end
Original-Node:
(return
(if
(true)
(int 1)
(int 2)))
Generated-Node:
#<Parser::SyntaxError: unexpected token kELSE>
/opt/local/lib/ruby2.7/gems/2.7.0/gems/parser-2.7.2.0/lib/parser/diagnostic/engine.rb:72:in `process'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/parser-2.7.2.0/lib/parser/base.rb:285:in `on_error'
(eval):3:in `_racc_do_parse_c'
(eval):3:in `do_parse'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/parser-2.7.2.0/lib/parser/base.rb:189:in `parse'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser.rb:98:in `block in parse_either'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/mprelude-0.1.0/lib/mprelude.rb:85:in `wrap_error'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser.rb:97:in `parse_either'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/mprelude-0.1.0/lib/mprelude.rb:168:in `bind'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/validation.rb:63:in `from_string'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/cli.rb:40:in `validation'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/cli.rb:143:in `public_send'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/cli.rb:143:in `process_target'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/cli.rb:133:in `block in exit_status'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/cli.rb:132:in `each'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/cli.rb:132:in `exit_status'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/lib/unparser/cli.rb:64:in `run'
/opt/local/lib/ruby2.7/gems/2.7.0/gems/unparser-0.5.3/bin/unparser:10:in `<top (required)>'
/opt/local/bin/unparser:23:in `load'
/opt/local/bin/unparser:23:in `<main>'
Error: (string)
Cheers!
@akimd Valid issue, given you ran it through the unparser self check that fails. Also interesting neither rubyspec nor parsers test suite hit this construct.
This means I need an even wider ruby corpus to test against. Can you recommend a public one? Or can you contact me in private (if you happen to have a large code base you may be able to share for this propose).
Its likely that unparser actually has to learn to emit these ifs ifs as ternary, because only they can produce that exact AST, given its an argument to return
. Adding parenthesis may produce a begin
node which would invalidate the "exact" AST guarantee.
Unparsers internal changes for 2.7 support (they where more a re-write) make it more easy to direct the code emitting to specific forms.
I cannot guarantee I can get to this issue very soon, while mutant is primarily funding this OSS project, it does not feed me well enough at this point, I've to priorize my consulting work.
There are some possible workarounds I'm happy to discuss (also contact me in private) on this issue, it depends on the actual use case.
Hi!
Thanks for the fast answer.
This means I need an even wider ruby corpus to test against. Can you recommend a public one?
I'm afraid I am not well versed enough for this. In my case we run source to source transformation on the AST and pretty-print it at the end.
Or can you contact me in private (if you happen to have a large code base you may be able to share for this propose).
I'm sorry I have nothing to offer here :( And I do understand your concern. What I can do is submit such test cases when we encounter them. Might not protect from lurking issues, but should protect from regressions.
Its likely that unparser actually has to learn to emit these ifs ifs as ternary, because only they can produce that exact AST, given its an argument to
return
. Adding parenthesis may produce abegin
node which would invalidate the "exact" AST guarantee.
Bummer... That's a dirty catch 22. Never though about that constraint.
I cannot guarantee I can get to this issue very soon, while mutant is primarily funding this OSS project, it does not feed me well enough at this point, I've to priorize my consulting work.
Ok. We have reverted to older versions of parser/unparser that did not exhibit this issue.
There are some possible workarounds I'm happy to discuss (also contact me in private) on this issue, it depends on the actual use case.
I can see that forcing the parens would help (return (1 ? 2 : 3)
). I'll see our team if this is applicable to us.
Cheers!
We run source to source transformation on the AST and pretty-print it at the end.
2 things here:
- Its not really pretty printing, its unparsing. Unparsers output is "far from pretty" but could be made so (I've no reason to spend time on that metric ATM). Better not call it pretty printing if it can create so ugly source ;)
- Its important to realize that unparser "only" has the guarantee: It produces source that if parsed via parser produces the SAME ast if parsed again.
The latter has an important consequence: If you transform AST, make sure that you know what parser would produce.
Example:
ruby-parse -e 'foo while foo = bar'
(while
(lvasgn :foo
(send nil :bar))
(send nil :foo))
the foo
body of the while is a send.
ruby-parse -e 'foo = 1; foo while foo = bar'
(begin
(lvasgn :foo
(int 1))
(while
(lvasgn :foo
(send nil :bar))
(lvar :foo)))
The foo
of the while body is an lvar
read!
This is because the local variable scope is different. If you are doing AST transforms, make sure the AST you generate expresses the semantics you desire and captures the effects like the lvar scope changes.
Also its important to keep in mind there is a wide range of AST you may generate, that parser
would never return from parsing. Unparser will, in the current versions: NOT prevent you from unparsing the into (potentially) valid source, that fails the guarantee.
In the next version of unparser, I plan a new API that would unparse than parse again, to make sure it filters out these cases where artificially constructed AST is passed that "cannot" be generated by parser.
For some private tooling that does source transform (smelling like the ones you indirectly describe) I use such APIs already to prevent this case.
- Its not really pretty printing, its unparsing. Unparsers output is "far from pretty" but could be made so (I've no reason to spend time on that metric ATM). Better not call it pretty printing if it can create so ugly source ;)
I see very well what you mean here. Indeed, unparser
does not have any support of boxing, or managing with etc., so it is rather a crude pretty-printer. Yet, it indents. That's were I usually cut the line. The intended reader is clearly a human, if it were solely for the sake of the machine, it could be much more terse.
- Its important to realize that unparser "only" has the guarantee: It produces source that if parsed via parser produces the SAME ast if parsed again.
Yup, I'm also well aware of this. And I do understand that shaking the AST does not guarantee in anyway that it is still valid.
But we pay attention to that. Besides, our transformations are mostly related to sanity-checks, there's nothing really deep, AFAIK.
the
foo
body of the while is a send.
[...]
Thefoo
of the while body is anlvar
read!
Yup, already found about that, thanks :)
Cheers!
So yes, (return (if ...))
can only be returned from parser on a return with a ternary expression. Unparser has to learn about this and emit the if as ternary.
bundle exec unparser -e 'return(true ? 1 : 2)' --verbose
mutant-dev@mbj ~/devel/unparser (fix/return-parens) $ bundle exec unparser -e 'return(true ? 1 : 2)' --verbose
(string)
Original-Source:
return(true ? 1 : 2)
Generated-Source:
return (if true
1
else
2
end)
Original-Node:
(return
(begin
(if
(true)
(int 1)
(int 2))))
Generated-Node:
(return
(begin
(if
(true)
(int 1)
(int 2))))
Success: (string)
This is a different AST, hence unparer cannot emit parens to unparse return true ? 1 : 2
.
Also affected:
next
break
Also note this one:
bundle exec unparser -e 'return 1, true ? 1 : 2' --verbose
(string)
Original-Source:
return 1, true ? 1 : 2
Generated-Source:
return 1, if true
1
else
2
end
Original-Node:
(return
(int 1)
(if
(true)
(int 1)
(int 2)))
Generated-Node:
(return
(int 1)
(if
(true)
(int 1)
(int 2)))
Success: (string)
So "only" if its a single argument it needs to be ternary. Love you Ruby.
You have been fast, thanks a lot @mbj!
We will try it today, and report our mileage.
Many thanks for your time.
You have been fast, thanks a lot @mbj!
Only possible due the huge amount of time invested on refactorings before. Making a better test suite, CI setup, internal code structure etc.
We will try it today, and report our mileage.
Ping. I want to get mentally pas this one ;)
I have just been confirmed that with the bug fix, our problem was solved. Thanks!
Thanks, I love closing open mental loops. Cheers.