Struct capture wrongly applying previous captures in a failed branch
petee-d opened this issue · 2 comments
Hey, while working on parser code generation and implementing lookahead error recovery, I noticed a bug. Consider this example:
type BugStructCapturesInBadBranch struct {
Bad bool `parser:"(@'!'"`
A BugFirstAlternative `parser:" @@) |"`
B int `parser:"('!' '#' @Int)"`
}
type BugFirstAlternative struct {
Value string `parser:"'#' @Ident"`
}
func TestBug_GroupCapturesInBadBranch(t *testing.T) {
var out BugStructCapturesInBadBranch
require.NoError(t, MustBuild(&BugStructCapturesInBadBranch{}, UseLookahead(2)).ParseString("", "!#4", &out))
assert.Equal(t, BugStructCapturesInBadBranch{B: 4}, out)
}
I tried to make it as minimalistic as reasonable, it's quite an obscure bug that's unlikely to bother someone but I thought I'd report it anyway. strct.Parse
will call ctx.Apply
even if s.expr.Parse
returned an error. The purpose of that is apparently providing a partial AST in case the entire parsing fails, but is has an unwanted side-effect. Any captures added to parseContext.apply
added by the branch so far will be applied, even though the error may later be caught by a disjunction or a ?/*/+ group and recovered. I think this can only happen if lookahead is at least 2, as it requires one token for that unwanted capture and a second token for the strct
to return an error instead of nil out
.
In the example above, the input is constructed to match the second disjunction alternative, but the first tokens will initially lead it into the first alternative and into the BugFirstAlternative
struct. When attempting to match Ident
for the Value
field, the sequence will fail and return an error, but ctx.apply
will already contain a capture for BugStructCapturesInBadBranch.Bad
, which will be applied in strct.Parse
, even though the disjunction recovers it and matches the second alternative.
I don't think it's super important this is fixed, but my code generation parser's behavior will differ from this, because I'm trying to take a different approach to recovering failed branches - restoring to a backup of the parsed struct when branch fails instead of delaying applying captures.
@alecthomas ping about this. The generated parser will differ from the reflective parser's behavior here and I would like to make sure that the reported behavior is indeed incorrect - definitely seems so.
Definitely a bug. I'm pretty surprised this hasn't come up before TBH.