Unable to parse grouped type as a type

Question

Unable to parse grouped type as a type

Opened this issue 6 years ago · 10 comments

It seems the parser’s type rule can no longer parse direct grouped types. I’m getting a test failure in ceylon.ast, which can be reproduced with this shorter code:

import ceylon.ast.redhat {
    parseType
}

shared void run() {
    print(parseType("<String>"));
}

parseType is just a fairly simple wrapper function:

shared Type? parseType(String code, Anything(JNode, Node) update = noop) {
    if (exists jType = createParser(code).type()) {
        return typeToCeylon(jType, update);
    } else {
        return null;
    }
}

And typeToCeylon can never return null, so it must be that the parser returns null for this code, perhaps because something goes wrong with the lookahead here:

type returns [StaticType type]
    @init { TypeConstructor ct=null; }
    : (typeParameters (TYPE_CONSTRAINT|COMPUTE)) =>
      typeParameters
      { ct = new TypeConstructor(null);
        ct.setTypeParameterList($typeParameters.typeParameterList);
        $type = ct; }
      (
        anonymousTypeConstraints
        { ct.setTypeConstraintList($anonymousTypeConstraints.typeConstraintList); }
      )?
      COMPUTE
      entryType
      { ct.setType($entryType.type); }
    | entryType
      { $type=$entryType.type; }
    ;

Using the entryType or unionType rules, on the other hand, correctly parses the type.

git bisect points to 5c80f1d as the culprit.

Answer 1 · 2018-09-16T20:06:46.000Z

git bisect points to 5c80f1d as the culprit.

Specifically, just this part of the Ceylon.g diff is enough to produce the problem. (Which is a part not directly related to the introduction of the new pipeline operators, as far as I can tell.)

Answer 2 · 2018-09-16T21:36:41.000Z

Wild guess: is the parser a) not being asked to match a final eof, and b) in this particular scenario just being too lazy to bother to parse your entire document?

Answer 3 · 2018-09-17T06:51:24.000Z

Specifically, just this part of the Ceylon.g diff is enough to produce the problem.

Weird. That bit of the grammar doesn't deal with type expressions at all.

Answer 4 · 2018-09-17T06:53:36.000Z

Wild guess: is the parser a) not being asked to match a final eof, and b) in this particular scenario just being too lazy to bother to parse your entire document?

But if @lucaswerkmeister calls the type() rule directly, I don't see why it would want to look for an EOF.

Answer 5 · 2018-09-17T15:17:29.000Z

I don't see why it would want to look for an EOF.

Right. My wild guess is that it isn't looking for one, but that a rule TypeWithEof might work.

Answer 6 · 2018-09-17T20:57:04.000Z

Well, I already tweak the code for some other grammatical constructs.

shared Condition? parseCondition(String code, Anything(JNode, Node) update = noop) {
    if (exists jCondition = createParser(code + ",").condition()) {
        // the parser needs that comma for some conditions                                                                                                                                                                                                                                                                                                 
        return conditionToCeylon(jCondition, update);

shared RequiredParameter? parseRequiredParameter(String code, Anything(JNode, Node) update = noop) {
    if (exists jParameterDeclarationOrRef = createParser(code + ",").parameterDeclarationOrRef()) {
        /*                                                                                                                                                                                                                                                                                                                                                 
         The parser does some lookahead and seems to need that comma to parse a parameterRef                                                                                                                                                                                                                                                               
         */

I guess I’ll try that here as well.

Answer 7 · 2018-09-17T21:00:45.000Z

Yep, adding a comma is enough to make the test pass. And since I already do that in several other places, I guess there’s no real justification for considering this one a bug, except that it happens to be a regression for some bizarre, presumably ANTLR-internal reason (as @gavinking notes, the grammar changes appear unrelated).

Feel free to close the issue unless you want to investigate further…

Answer 8 · 2018-09-18T09:15:44.000Z

@lucaswerkmeister are you sure you're forcing ANTLR to fully-read the token stream?

Answer 9 · 2018-09-18T19:43:51.000Z

I’m not familiar enough with ANTLR to be sure of anything :) but here’s my createParser function:

"Creates a [[Ceylon parser|CeylonParser]] that operates on the given [[code]]."
shared CeylonParser createParser(String code)
        => CeylonParser(CommonTokenStream(CeylonLexer(ANTLRStringStream(code + " ")))); // the lexer seems to need the trailing space

Do I need to do anything special to close the token stream at the end or something?

Answer 10 · 2018-09-21T06:57:03.000Z

Do I need to do anything special to close the token stream at the end or something?

Hrmph, apparently not. It's strange, I seem to remember something about having to force materialization of the whole token stream, but I don't see anything special in my code.