mrkkrp/megaparsec

Tabs are not handled correctly when errors are rendered

aartamonau opened this issue · 4 comments

The issue is similar to #239. While the example in that issue works fine, the one below is still rendered incorrectly: the caret does not line up with the offending character.

$ stack ghci --package megaparsec-9.3.0 --resolver lts-20.21
ghci> import Text.Megaparsec
ghci> import Text.Megaparsec.Char
ghci> import Data.Void
ghci> parseTest (string' "test" *> char '\t' *> char 'a' :: Parsec Void String Char) "test\tb"
1:9:
  |
1 | test        b
  |         ^
unexpected 'b'
expecting 'a'

I think the issue might be in this line:

(SourcePos n l (mkPos $ c' + w - ((c' - 1) `rem` w)))

It advances the position to the next tab stop, that is, the next column of the form k * w + 1, where w is the tab width. This code predates expandTab, which just replaces every tab with a full tab width of spaces here:

( Just $ case expandTab pstateTabWidth
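
To make the mismatch concrete, here is a minimal sketch of the two column calculations in isolation. The names nextTabStop and fullWidthColumn are made up for illustration and are not part of megaparsec; the first formula is copied from the SourcePos line quoted above, and the second is what replacing a tab with a full tab width of spaces implies.

```haskell
module TabColumns where

-- Column reached after a tab when the position snaps to the next tab stop.
-- Same arithmetic as the SourcePos line quoted above:
-- w is the tab width, c' is the (1-based) column of the tab character.
nextTabStop :: Int -> Int -> Int
nextTabStop w c' = c' + w - ((c' - 1) `rem` w)

-- Column reached after a tab if the tab is always rendered as w spaces.
fullWidthColumn :: Int -> Int -> Int
fullWidthColumn w c' = c' + w

-- For "test\tb" with tab width 8, the tab sits in column 5:
--   nextTabStop 8 5     == 9   (the reported position, 1:9)
--   fullWidthColumn 8 5 == 13  (where 'b' ends up in the rendered line)
```

The two results differ whenever the tab does not start exactly on a tab stop, which is why the caret and the character drift apart in the output above.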

So if I apply this diff:

                       id
                 | ch == tabTok ->
                     St
-                      (SourcePos n l (mkPos $ c' + w - ((c' - 1) `rem` w)))
+                      (SourcePos n l (mkPos $ c' + w))
                       (g . (fromTok ch :))
                 | otherwise ->
                     St

then the example in my original post works:

ghci> parseTest (string' "test" *> char '\t' *> char 'a' :: Parsec Void String Char) "test\tb"
1:13:
  |
1 | test        b
  |             ^
unexpected 'b'
expecting 'a'

mrkkrp commented

This is certainly a bug, although I think the position should indeed be 1:9, so the current calculation of SourcePos is correct. The problem is that expandTab should be more intelligent and insert only as many spaces as are necessary to reach the next tab stop.
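
A rough sketch of what such a tab-stop-aware expansion could look like is below. This is not the code that went into megaparsec; expandTabs is a hypothetical standalone function, and it assumes a positive tab width and a single line of input (no newline handling).

```haskell
module ExpandTabs where

-- Replace each tab with just enough spaces to reach the next tab stop.
-- w is the tab width (assumed > 0); columns are 1-based.
expandTabs :: Int -> String -> String
expandTabs w = go 1
  where
    go :: Int -> String -> String
    go _ [] = []
    go col ('\t' : rest) =
      -- Number of spaces needed to move from col to the next tab stop.
      let n = w - ((col - 1) `rem` w)
       in replicate n ' ' ++ go (col + n) rest
    go col (ch : rest) = ch : go (col + 1) rest

-- With tab width 8, the tab in "test\tb" starts in column 5 and expands
-- to 4 spaces, so 'b' lands in column 9, matching the reported 1:9.
```

With an expansion along these lines, the rendered source line and the SourcePos column agree, so the caret would point at 'b' without changing the position calculation.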

Thank you for such a quick turnaround @mrkkrp!

mrkkrp commented

@aartamonau Thanks for spotting it! I've just published version 9.3.1 with the fix.