mrkkrp/megaparsec

Custom provenance/metadata

Opened this issue · 4 comments

I was wondering whether there is a way to attach custom provenance information or additional metadata to tokens and get at it easily. The spans I use for provenance are source offsets, and I'm parsing a token stream produced by a separate lexing pass. I include these spans in the token stream, but I don't have a nice way to get them out without altering every combinator. Before, when I was parsing text directly, I had a basic wrapper using the built-in getOffset:

withSpan :: Parser a -> Parser (Spanned a)
withSpan p = do
  startPos <- getOffset
  result <- p
  Spanned result . SrcLoc startPos <$> getOffset

But now I can't do this, because getOffset counts tokens in the stream rather than positions in the original source. I tried getting the parser state, but the issue is that, even though I can read the span of the next token from the state, once I reach end of input the stream is empty, so there is no token left to take the span from.
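One possible workaround, sketched here under assumptions rather than taken from megaparsec itself: keep the full lexed token list around and translate token offsets back into source offsets after the fact. The names Located, locStart, locEnd, and sourceSpan below are hypothetical, and the sketch assumes the parser starts at offset 0 and that the offset advances by one per consumed token. Because the end of a span is taken from the last token consumed rather than the next token in the stream, it still works at end of input.

```haskell
-- Hypothetical lexer token: a value plus the source offsets it covers.
data Located t = Located
  { locStart :: Int  -- source offset of the token's first character
  , locEnd   :: Int  -- source offset just past its last character
  , locValue :: t
  } deriving (Show, Eq)

-- Same shape as the SrcLoc used by withSpan: start and end source offsets.
data SrcLoc = SrcLoc Int Int
  deriving (Show, Eq)

-- | Translate a half-open range of token offsets [from, to) into a span
-- of source offsets. 'toks' is the complete token list the parser runs on.
sourceSpan :: [Located t] -> Int -> Int -> SrcLoc
sourceSpan toks from to = SrcLoc start end
  where
    -- The span starts where the first token of the range starts; if the
    -- range begins at end of input, fall back to the end of the last token.
    start = case drop from toks of
      (t : _) -> locStart t
      []      -> endOfInput
    -- The span ends where the last consumed token ends. If nothing was
    -- consumed, the span is empty and sits at 'start'.
    end
      | to > from = case drop (to - 1) toks of
          (t : _) -> locEnd t
          []      -> endOfInput
      | otherwise = start
    endOfInput
      | null toks = 0
      | otherwise = locEnd (last toks)
```

With this in place, withSpan could keep calling getOffset before and after p and pass both results to sourceSpan, provided the token list is reachable from the wrapper (for example through a closure or a reader layer). List indexing is linear, so an array would be a better fit for large inputs.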

One solution could be to define a custom input stream and then define reachOffset and reachOffsetNoLine for it, so that after the last token has been parsed the source position is set to the end of that token. Once that is done, getSourcePos could be used in the same way as getOffset, but of course it will return the line and column, not offsets into the original input stream.

Would it be possible to parameterize the Parsec and ParsecT types over a metadata type that the user could specify, but which by default would include the ordinary SourcePos?

To be honest I am reluctant to index the ParsecT type with even more type variables.

Yeah, it is quite an important type to change. This might be something I could just do in a fork.