mrkkrp/megaparsec

How to make megaparsec aware of Source Positions when using a custom token stream

solomon-b opened this issue · 2 comments

I have a lexer I would like to use with Megaparsec:

data Token =
    StringLit Text
    -- ^ String Literal
  | Identifier Text
    -- ^ Identifier
  | NumLit Scientific
    -- ^ Number literal
  | BoolLit Bool
  | Bling
  | Colon
  | Dot
  | Comma
  ...

data TokenExt = TokenExt { teType :: Token, tePos :: SourcePos }
  deriving (Show, Eq, Ord)

lexer :: Text -> [TokenExt]

The lexer constructs a SourcePos for each Token while throwing away all the whitespace.

I've written a Stream instance and can parse successfully, but Megaparsec isn't aware of the source positions in my TokenExt type, so all errors report line 1, column 1.

Do I need a TraversableStream? The description sounds like what I need, but reachOffset doesn't appear to have access to the token stream.

Have you read the tutorial? In particular, see this chapter: https://markkarpov.com/tutorial/megaparsec.html#working-with-custom-input-streams. reachOffset does have access to the token stream: it is in the pstateInput field of PosState s. There is an example in the tutorial which I hope will be helpful.
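
To illustrate the point about pstateInput, here is a minimal sketch rather than the tutorial's actual code. It assumes the stream is a plain list of TokenExt and uses a cut-down Token type. The names advanceTo and demoState are hypothetical helpers made up for this example. The idea is to drop the tokens before the requested offset and take the new position from the next token's tePos.

```haskell
import Text.Megaparsec
  ( PosState (..), SourcePos (..), defaultTabWidth
  , initialPos, mkPos, sourcePosPretty )

-- Simplified stand-ins for the types in the question (assumption:
-- the real Token has more constructors and uses Text/Scientific).
data Token = Identifier String | Colon | Comma
  deriving (Show, Eq, Ord)

data TokenExt = TokenExt { teType :: Token, tePos :: SourcePos }
  deriving (Show, Eq, Ord)

-- | Hypothetical helper: advance a 'PosState' to token offset @o@.
-- The remaining input lives in 'pstateInput', so the position of the
-- token at offset @o@ can be read directly from its 'tePos'.
advanceTo :: Int -> PosState [TokenExt] -> PosState [TokenExt]
advanceTo o st =
  st
    { pstateInput = rest
    , pstateOffset = max (pstateOffset st) o
    , pstateSourcePos = newPos
    }
  where
    -- Tokens between the current offset and the target offset.
    (skipped, rest) = splitAt (o - pstateOffset st) (pstateInput st)
    newPos = case rest of
      t : _ -> tePos t                     -- position of the next token
      [] -> case skipped of
        [] -> pstateSourcePos st           -- nothing consumed: keep old position
        _  -> tePos (last skipped)         -- end of input: fall back to last token

-- A tiny hand-built token stream, standing in for the lexer's output.
demoState :: PosState [TokenExt]
demoState = PosState
  { pstateInput =
      [ TokenExt (Identifier "foo") (at 1)
      , TokenExt Colon (at 5)
      , TokenExt (Identifier "bar") (at 7)
      ]
  , pstateOffset = 0
  , pstateSourcePos = initialPos "demo"
  , pstateTabWidth = defaultTabWidth
  , pstateLinePrefix = ""
  }
  where
    at col = SourcePos "demo" (mkPos 1) (mkPos col)

main :: IO ()
main = putStrLn (sourcePosPretty (pstateSourcePos (advanceTo 2 demoState)))
```

In a TraversableStream instance, reachOffsetNoLine (of type Int -> PosState s -> PosState s in megaparsec 9.x, as far as I know) could delegate to a function like this. reachOffset additionally returns a Maybe String with the offending line, which is why the tutorial's stream type also carries the original input text.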

Oh wow I completely missed that article. Thank you and sorry about that!