Wrong source locations on `unexpected end of input` with custom tokens
sol opened this issue · 2 comments
Given the example code at https://markkarpov.com/tutorial/megaparsec.html#working-with-custom-input-streams, if I modify exampleStream to
exampleStream :: MyStream
exampleStream = MyStream
  "5 + 6"
  [ at 1 1 (Int 5)
  , at 1 3 Plus -- (1)
  ]
  where
    at l c = WithPos (at' l c) (at' l (c + 1)) 2
    at' l c = SourcePos "" (mkPos l) (mkPos c)
then I get a wrong source location in the error message:
ghci> parseTest (pSum <* eof) exampleStream
1:1:
  |
1 | 5 +
  | ^
unexpected end of input
expecting integer
I haven't investigated any further, so I'm not sure whether the issue lies with the instance definitions from the tutorial or with megaparsec itself. From what I tried, it seems to work fine with character-based parsers.
Thanks for spotting this! The problem is in the definition of the reachOffset method of the TraversableStream MyStream instance. When, after splitting the stream, there are no tokens following the location of the error, it should not default to the previous position (here, 1:1); perhaps it should instead assume the position of the end of the span of the last consumed token, e.g.:
@@ -20,7 +20,9 @@ instance TraversableStream MyStream where
       sameLine = sourceLine newSourcePos == sourceLine pstateSourcePos
       newSourcePos =
         case post of
-          [] -> pstateSourcePos
+          [] -> case unMyStream pstateInput of
+            [] -> pstateSourcePos
+            xs -> endPos (last xs)
           (x:_) -> startPos x
       (pre, post) = splitAt (o - pstateOffset) (unMyStream pstateInput)
       (preStr, postStr) = splitAt tokensConsumed (myStreamInput pstateInput)
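The position-selection rule in that patch can be tried in isolation. What follows is only a minimal sketch, not the tutorial's code: Pos, Tok, errorPos and example are hypothetical stand-ins for megaparsec's SourcePos, the tutorial's WithPos, the newSourcePos binding and the token list, and the sketch depends on nothing beyond base.

```haskell
-- Simplified stand-ins for megaparsec's SourcePos and the tutorial's WithPos.
-- These types are assumptions made for illustration, not the real definitions.
data Pos = Pos { posLine :: Int, posCol :: Int }
  deriving (Eq, Show)

data Tok a = Tok
  { tokStart :: Pos -- position of the token's first character
  , tokEnd   :: Pos -- position just past the token's last character
  , tokVal   :: a
  } deriving (Eq, Show)

-- | Choose the position to report for an error located n tokens into the
-- remaining input (n plays the role of o - pstateOffset in the patch).
errorPos
  :: Pos     -- fallback position (pstateSourcePos in the patch)
  -> Int     -- number of tokens preceding the error
  -> [Tok a] -- tokens remaining in the parser state
  -> Pos
errorPos fallback n toks =
  case drop n toks of
    (x : _) -> tokStart x        -- a token follows: point at its start
    [] -> case toks of
      [] -> fallback             -- no tokens at all: keep the old position
      _  -> tokEnd (last toks)   -- input ran out: use the end of the last token

-- The two tokens from the issue: 5 at 1:1 and + at 1:3, each one column wide.
example :: [Tok String]
example =
  [ Tok (Pos 1 1) (Pos 1 2) "5"
  , Tok (Pos 1 3) (Pos 1 4) "+"
  ]
```

Loaded in GHCi, errorPos (Pos 1 1) 2 example gives line 1, column 4, the end of the + token, whereas the unpatched rule would have returned the fallback 1:1.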
Then, assuming you also change the "original user input" in exampleStream to match the tokens:
exampleStream :: MyStream
exampleStream = MyStream
  "5 +"
  [ at 1 1 (Int 5)
  , at 1 3 Plus -- (1)
  ]
  where
    at l c = WithPos (at' l c) (at' l (c + 1)) 2
    at' l c = SourcePos "" (mkPos l) (mkPos c)
...it seems to work:
ghci> parseTest (pSum <* eof) exampleStream
1:4:
  |
1 | 5 +
  |    ^
unexpected end of input
expecting integer
I'm going to push a fix for that tutorial.
This is now fixed.