Parse error should close layout block
andreasabel opened this issue · 0 comments
BNFC's layout handling does not implement the following clause, taken from the Haskell 98 report:
A close brace is also inserted whenever the syntactic category containing the layout list ends; that is, if an illegal lexeme is encountered at a point where a close brace would be legal, a close brace is inserted.
Consider this small (artificial) expression grammar with a sum construct that can use layout:
ETimes. Exp ::= Exp "*" Exp1;
ESum. Exp1 ::= "sum" "{" [Exp] "}";
EInt. Exp1 ::= Integer;
_. Exp ::= Exp1;
_. Exp1 ::= "begin" Exp "end";
separator Exp ";";
layout "sum";
As BNFC has a workaround for parentheses "(...)", we use "begin ... end" here instead.
This grammar handles, e.g., sum { 1; 2; sum { 3;4;5 } * 6 } * 7. It fails on:
begin sum
  begin 1 end
  2 end * 3
The correct reconstruction of the block would be:
begin sum
{ begin 1 end
; 2 } end * 3
However, the token stream generated by the layout pass does not respect the bracketing begin ... end:
1:01 "begin"
1:07 "sum"
1:11 "{"
2:03 "begin"
2:09 "1"
2:11 "end"
2:15 ";"
3:03 "2"
3:05 "end"
3:09 "*"
3:11 "3"
3:13 "}"
This is because the closing brace } is inserted mechanically according to the off-side rule, whereas it should be inserted dynamically by the parser to repair the parse error caused by the end token in line 3. In other words, a closing brace should be inserted not only on dedentation but also on parse errors.
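To make the intended behaviour concrete, here is a minimal sketch of layout resolution driven by the parser, written in Python as a hand-rolled recursive-descent parser for the grammar above. It is not how BNFC is implemented, and every name in it (Parser, peek, take, recover, parse) is hypothetical. It also simplifies the Haskell rules: a nested block is opened at whatever column its first token has, without checking that it is indented further than the enclosing block.

```python
import re

LAYOUT_WORDS = {"sum"}  # mirrors `layout "sum";` in the grammar above


def tokenize(src):
    """Split the input into (line, column, text) triples with 1-based positions."""
    return [(ln, m.start() + 1, m.group())
            for ln, line in enumerate(src.splitlines(), 1)
            for m in re.finditer(r"\d+|[A-Za-z]+|[{};*]", line)]


class Parser:
    def __init__(self, src, recover=True):
        self.toks = tokenize(src)
        self.i = 0            # index of the next real token
        self.stack = []       # open blocks: column if implicit, 0 if explicit
        self.opening = False  # the previous token was a layout keyword
        self.checked = -1     # token index already tested against the off-side rule
        self.recover = recover
        self.out = []         # resolved token stream, virtual tokens included

    def peek(self):
        """Return (text, is_virtual) for the next resolved token; text is None at EOF."""
        t = self.toks[self.i] if self.i < len(self.toks) else None
        top = self.stack[-1] if self.stack else 0
        if self.opening:      # a block starts here: explicit "{" or a virtual one
            return "{", not (t is not None and t[2] == "{")
        if t is None:
            return ("}", True) if top > 0 else (None, True)
        new_line = self.i > 0 and self.toks[self.i - 1][0] != t[0]
        if top > 0 and new_line and self.checked != self.i:
            if t[1] == top:
                return ";", True   # same column: next list item
            if t[1] < top:
                return "}", True   # dedentation: off-side rule closes the block
        return t[2], False

    def take(self, expected=None):
        text, virtual = self.peek()
        top = self.stack[-1] if self.stack else 0
        illegal = ((expected is not None and text != expected)
                   or (text == "}" and not virtual and top > 0))
        if illegal:
            # Parse-error rule: an illegal token where "}" is legal closes
            # the innermost implicit block instead of failing.
            if expected == "}" and self.recover and top > 0:
                text, virtual = "}", True
            else:
                raise SyntaxError(f"expected {expected!r}, got {text!r}")
        if text == "{" and self.opening:
            nxt = self.toks[self.i] if self.i < len(self.toks) else None
            self.stack.append((nxt[1] if nxt else 1) if virtual else 0)
            self.opening = False
            self.checked = self.i  # the first token of a block gets no ";"
        elif text == "}":
            self.stack.pop()
        elif text == ";" and virtual:
            self.checked = self.i
        if not virtual:
            self.i += 1
            self.opening = text in LAYOUT_WORDS
        self.out.append(text)
        return text

    def starts_exp(self):
        text, virtual = self.peek()
        return not virtual and (text.isdigit() or text in ("sum", "begin"))

    def exp(self):   # Exp ::= Exp "*" Exp1 | Exp1
        node = self.exp1()
        while self.peek() == ("*", False):
            self.take("*")
            node = ("ETimes", node, self.exp1())
        return node

    def exp1(self):  # Exp1 ::= "sum" "{" [Exp] "}" | Integer | "begin" Exp "end"
        text, virtual = self.peek()
        if not virtual and text.isdigit():
            return ("EInt", int(self.take()))
        if not virtual and text == "sum":
            self.take("sum")
            self.take("{")
            items = []
            while self.starts_exp():
                items.append(self.exp())
                if self.peek()[0] != ";":
                    break
                self.take(";")
            self.take("}")  # the only place where the parse-error rule can fire
            return ("ESum", items)
        if not virtual and text == "begin":
            self.take("begin")
            node = self.exp()
            self.take("end")
            return node
        raise SyntaxError(f"unexpected {text!r}")


def parse(src, recover=True):
    """Parse src; return the syntax tree and the resolved token stream."""
    p = Parser(src, recover)
    tree = p.exp()
    if p.peek()[0] is not None:
        raise SyntaxError(f"unexpected {p.peek()[0]!r}")
    return tree, " ".join(p.out)
```

With recover=True, the failing input (with lines 2 and 3 indented to column 3) should resolve to the reconstruction given above, because the second end is illegal inside the list and the block is closed right before it. With recover=False, only dedentation and end of input close a block, and the same input is rejected with a SyntaxError.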
A layout stop "end" instruction does not help here, as it closes the layout block too early, before the first end rather than the second end:
1:01 "begin"
1:07 "sum"
1:11 "{"
2:03 "begin"
2:09 "1"
2:11 "}"
2:11 "end"
3:03 "2"
3:05 "end"
3:09 "*"
3:11 "3"