mbj/unparser

No emitter for node: :__LINE__ and __FILE__

Closed this issue · 7 comments

naw commented

This is related to #3, but now that whitequark/parser#89 is out, actual nodes are (optionally) created for __LINE__ and __FILE__, and unparser doesn't know how to handle them.

mbj commented

Right now unparser expects to use the "default" AST of parser. But I agree we should handle the "meta nodes" also.

naw commented

Using the default AST from parser, __FILE__ is parsed as a string literal that equals either the original source filepath (as passed to parse), or '(source)' if no path is provided.

In neither case does Unparser unparse that string literal back into __FILE__. Is that the expected result?

I see that the original code for #3 was removed here 54d0415 but it wasn't obvious to me if that was replaced with something else that achieved the same results, or if the expected functionality changed?

Thank you!

mbj commented

Using the default AST from parser,__FILE__ is parsed as a string literal that equals either the original source filepath (as passed to parse), or '(source)' if no path is provided.

correct. This is the default behavior of parser.

In neither case does Unparser unparse that string literal back into __FILE__. Is that the expected result?

For the default AST produced by parser yes. Unparsers goal is to produce source that when parsed by parser is evaluating into exactly the same ast when parsed (excluding source maps).

Technically speaking unparser wants to infinitely retain the property of parse(source) = parse(unparse(parse(source))) at arbitrarily deep nesting of parse/unparsing, while using parsers defaults.

I see that the original code for #3 was removed here 54d0415 but it wasn't obvious to me if that was replaced with something else that achieved the same results, or if the expected functionality changed?

It was removed because it was unused in my projects, also it had no specs. My opensource policy states I remove everything that is not used by my own projects to reduce maintenance burden. Unless someone volunteers to maintain this feature at an acceptable quality level.

The expected functionality of unparser changed. Also because parser at the time the removal code was made added support to emit AST for __FILE__ and __LINE__ explicitly via an option. So it was not "default" anymore.

I maybe have to change using parser in non default mode (emitting unexpanded nodes) for my mutant project, which will than create a need for unparser to support it also. But my OSS time is to limited to do this short term.

mbj commented

Also the reason for removal was: Source location aware. Unparser should never use these.

naw commented

Cool, thanks for explaining all of that. No complaints here.

Very new to this project, so my naive understanding was that unparse(parse(source)) ~ source. (i.e. roundtrips yield equivalent (but not necessarily identical) code).

However, as you've pointed out, parse(source) == parse(unparse(parse(source))) is the actual technical goal, which is not quite as stringent as unparse(parse(source)) ~ source. One is roundtrips starting with AST, the other is roundtrips starting with source.

__LINE__ and __FILE__ are obvious blockers for roundtrips with source, which presumably can be solved by running parser with the non-default emit.. option and by Unparser knowing how to emit those meta nodes.

Of course, there may be many other blocks too that aren't as obvious, so I understand now it might be infeasible (or at least unreasonable) to expect Unparser to do that. Again, no complaints from me, just trying to understand. Thanks.

mbj commented

Very new to this project, so my naive understanding was that unparse(parse(source)) ~ source. (i.e. roundtrips yield equivalent (but not necessarily identical) code).

When you say unparse(parse(source)) ~ source note that the ~ operator is effectively telling the "semantics" of the source should be equivalent and as the AST encodes these semantics to remove the ~ operator and use the == operator I added a layer to compare the AST in my example.

__LINE__ and __FILE__ are obvious blockers for roundtrips with source

Not under the ~ operator as I explained above.

Of course, there may be many other blocks too that aren't as obvious, so I understand now it might be infeasible (or at least unreasonable) to expect Unparser to do that. Again, no complaints from me, just trying to understand. Thanks.

I'm open to add support for the non default AST, as I need to do it some day for my other projects. I'd accept a contribution adding emitters for :__LINE__ and :__FILE__. Would not be that complex.

naw commented

I guess __LINE__ and __FILE__ are a little subjective regarding what is "equivalent".

Suppose source is puts __LINE__ if true. Originally, that outputs "1".

However, when parsed and unparsed, the puts gets moved to line 2:

if true
  puts "1"
end

That's fine, but should the unparsed code output "1", or should it output "2"? Which is more equivalent to the intent of the original code?

There isn't an automated way to know the "intent" of the original code. Naively, I would have expected "2" to be more equivalent to the original code, but I realize now that's not necessarily true --- at the very least it's subjective.

Effectively, the configurable parser option allows the user to specify how to interpret the "intent", so I can definitely understand why Unparser behaves as it does for the default output. Thanks for bearing with me as I grapple with this.

I'm open to add support for the non default AST, as I need to do it some day for my other projects. I'd accept a contribution adding emitters for :LINE and :FILE. Would not be that complex.

Great! I'll see what I can do.