eclipse-archived/ceylon

Unable to parse grouped type as a type

Opened this issue · 10 comments

It seems the parser’s type rule can no longer parse direct grouped types. I’m getting a test failure in ceylon.ast, which can be reproduced with this shorter code:

import ceylon.ast.redhat {
    parseType
}

shared void run() {
    print(parseType("<String>"));
}

parseType is just a fairly simple wrapper function:

shared Type? parseType(String code, Anything(JNode, Node) update = noop) {
    if (exists jType = createParser(code).type()) {
        return typeToCeylon(jType, update);
    } else {
        return null;
    }
}

And typeToCeylon can never return null, so it must be that the parser returns null for this code, perhaps because something goes wrong with the lookahead here:

type returns [StaticType type]
    @init { TypeConstructor ct=null; }
    : (typeParameters (TYPE_CONSTRAINT|COMPUTE)) =>
      typeParameters
      { ct = new TypeConstructor(null);
        ct.setTypeParameterList($typeParameters.typeParameterList);
        $type = ct; }
      (
        anonymousTypeConstraints
        { ct.setTypeConstraintList($anonymousTypeConstraints.typeConstraintList); }
      )?
      COMPUTE
      entryType
      { ct.setType($entryType.type); }
    | entryType
      { $type=$entryType.type; }
    ;

Using the entryType or unionType rules, on the other hand, correctly parses the type.

git bisect points to 5c80f1d as the culprit.

git bisect points to 5c80f1d as the culprit.

Specifically, just this part of the Ceylon.g diff is enough to produce the problem. (Which is a part not directly related to the introduction of the new pipeline operators, as far as I can tell.)

Wild guess: is the parser a) not being asked to match a final eof, and b) in this particular scenario just being too lazy to bother to parse your entire document?

Specifically, just this part of the Ceylon.g diff is enough to produce the problem.

Weird. That bit of the grammar doesn't deal with type expressions at all.

Wild guess: is the parser a) not being asked to match a final eof, and b) in this particular scenario just being too lazy to bother to parse your entire document?

But if @lucaswerkmeister calls the type() rule directly, I don't see why it would want to look for an EOF.

I don't see why it would want to look for an EOF.

Right. My wild guess is that it isn't looking for one, but that a rule TypeWithEof might work.

Well, I already tweak the code for some other grammatical constructs.

shared Condition? parseCondition(String code, Anything(JNode, Node) update = noop) {
    if (exists jCondition = createParser(code + ",").condition()) {
        // the parser needs that comma for some conditions                                                                                                                                                                                                                                                                                                 
        return conditionToCeylon(jCondition, update);
shared RequiredParameter? parseRequiredParameter(String code, Anything(JNode, Node) update = noop) {
    if (exists jParameterDeclarationOrRef = createParser(code + ",").parameterDeclarationOrRef()) {
        /*                                                                                                                                                                                                                                                                                                                                                 
         The parser does some lookahead and seems to need that comma to parse a parameterRef                                                                                                                                                                                                                                                               
         */

I guess I’ll try that here as well.

Yep, adding a comma is enough to make the test pass. And since I already do that in several other places, I guess there’s no real justification for considering this one a bug, except that it happens to be a regression for some bizarre, presumably ANTLR-internal reason (as @gavinking notes, the grammar changes appear unrelated).

Feel free to close the issue unless you want to investigate further…

@lucaswerkmeister are you sure you're forcing ANTLR to fully-read the token stream?

I’m not familiar enough with ANTLR to be sure of anything :) but here’s my createParser function:

"Creates a [[Ceylon parser|CeylonParser]] that operates on the given [[code]]."
shared CeylonParser createParser(String code)
        => CeylonParser(CommonTokenStream(CeylonLexer(ANTLRStringStream(code + " ")))); // the lexer seems to need the trailing space                                                                                                                                                                                                                      

Do I need to do anything special to close the token stream at the end or something?

Do I need to do anything special to close the token stream at the end or something?

Hrmph, apparently not. It's strange, I seem to remember something about having to force materialization of the whole token stream, but I don't see anything special in my code.