gallais/agdarsec

Improving error messages

langston-barrett opened this issue · 5 comments

Right now, every parser failure returns the same value: 𝕄.∅. A library consumer can't keep track of e.g. which parser is running, where in the source the error was, or what caused the error.

I'm not sure what the best way to fix this is. parsec has a type ParseError, and its version of runParser returns the equivalent of ParseError ⊎ Success. agdarsec could do something very similar. More generally, Parser could have another two parameters, say

  • {ParseError : Set e}, and
  • mkError : (m : ℕ) → Toks m → ParseError

where mkError m s would be used wherever 𝕄.∅ is now (m being the position in the stream).
This would allow users to ignore errors for performance reasons. (Of course, this specializes to the current behavior when ParseError ≡ 𝕄 and mkError ≡ λ _ _ → 𝕄.∅.)
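
To make the proposal concrete, here is a minimal Haskell sketch of the same idea. It is not agdarsec code: `MkError`, `Parser`, `exact`, `andThen`, `noInfo` and `withInfo` are hypothetical names, the token stream is simplified to a `String`, and the position is a plain `Int` offset. The parser is abstract in its error type, and the caller decides what a failure carries.

```haskell
-- Hypothetical analogue of `mkError : (m : ℕ) → Toks m → ParseError`:
-- build an error from the current position and the remaining input.
type MkError e = Int -> String -> e

-- A parser abstract in its error type `e`. It is handed `mkError`, a start
-- position and the input, and returns a value, the new position and the rest.
type Parser e a = MkError e -> Int -> String -> Either e (a, Int, String)

-- Accept exactly the character `c`; call `mkError` where a fixed
-- failure value would otherwise be returned.
exact :: Char -> Parser e Char
exact c mkError pos input = case input of
  (c' : cs) | c == c' -> Right (c', pos + 1, cs)
  _                   -> Left (mkError pos input)

-- Run `p`, then `q` on what is left, threading the position through.
andThen :: Parser e a -> Parser e b -> Parser e (a, b)
andThen p q mkError pos input = do
  (a, pos', rest)   <- p mkError pos input
  (b, pos'', rest') <- q mkError pos' rest
  pure ((a, b), pos'', rest')

-- Uninformative errors: every failure is the same value.
noInfo :: MkError ()
noInfo _ _ = ()

-- Informative errors: remember where parsing stopped and what was left.
data ParseError = ParseError { errPos :: Int, errRest :: String }
  deriving (Eq, Show)

withInfo :: MkError ParseError
withInfo = ParseError

-- The two-character parser for "ab", usable with either error type.
ab :: Parser e (Char, Char)
ab = exact 'a' `andThen` exact 'b'
```

With these definitions, `ab withInfo 0 "ax"` evaluates to `Left (ParseError 1 "x")`, while `ab noInfo 0 "ax"` evaluates to `Left ()`. The parser itself is the same in both cases.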

In any case, this would require fundamentally changing the API, though I think it's necessary for any serious use.

In any case, this would require fundamentally changing the API

I'm not sure it is the case. The API is parametrised over M which can be
any alternative monad. It should be possible to pick an M which keeps
track of the position of the current token and reports its location upon failure.

You could put a State transformer on top of Either Loc for instance, get
the corresponding view function to update the location and grab it when
calling 𝕄.∅.
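
As a rough illustration of this suggestion, here is a Haskell sketch of a hand-written state transformer over `Either Loc`. The names `Loc`, `M`, `exact`, `ab` and `run` are hypothetical and are not part of agdarsec. The transformer is written out by hand because `Either e` has no `Alternative` instance, so the stock `StateT` instance cannot be used. Keeping the furthest location when both branches of `<|>` fail is an assumption of this sketch, not something the thread specifies.

```haskell
import Control.Applicative (Alternative (..))

-- Hypothetical source location: a running token offset.
newtype Loc = Loc Int deriving (Eq, Ord, Show)

-- State over `Either Loc`: the state is the current location,
-- and a failure carries the location at which it happened.
newtype M a = M { runM :: Loc -> Either Loc (a, Loc) }

instance Functor M where
  fmap f (M g) = M $ \l -> case g l of
    Left e        -> Left e
    Right (a, l') -> Right (f a, l')

instance Applicative M where
  pure a = M $ \l -> Right (a, l)
  M f <*> M g = M $ \l -> case f l of
    Left e        -> Left e
    Right (h, l') -> case g l' of
      Left e         -> Left e
      Right (a, l'') -> Right (h a, l'')

instance Monad M where
  M g >>= k = M $ \l -> case g l of
    Left e        -> Left e
    Right (a, l') -> runM (k a) l'

instance Alternative M where
  -- The analogue of 𝕄.∅: fail, reporting the current location.
  empty = M Left
  -- Try the left branch, then the right one from the same location.
  -- If both fail, keep the furthest location (a choice made for this sketch).
  M f <|> M g = M $ \l -> case f l of
    Right r -> Right r
    Left e1 -> case g l of
      Right r -> Right r
      Left e2 -> Left (max e1 e2)

-- Accept the character `c` and advance the location by one token;
-- on a mismatch or end of input, fail through `empty`.
exact :: Char -> String -> M String
exact c (c' : cs) | c == c' = M $ \(Loc n) -> Right (cs, Loc (n + 1))
exact _ _ = empty

-- Parse "ab" and return the remaining input.
ab :: String -> M String
ab s = exact 'a' s >>= exact 'b'

-- Run a parser from location 0, discarding the final location.
run :: (String -> M a) -> String -> Either Loc a
run p s = fst <$> runM (p s) (Loc 0)
```

Here `run ab "xb"` evaluates to `Left (Loc 0)` and `run ab "ax"` to `Left (Loc 1)`: both failures go through the same `empty`, yet each reports the location the state had reached at that point.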

I'll try to see if I can come up with a simple example tomorrow.

I wondered about something similar, but figured that since the combinators require their sub-parsers to be in the same monad, the value of 𝕄.∅ would have to be identical between them (i.e. the sub-parsers couldn't return different source locations or their own names as part of 𝕄.∅). I would be excited to learn that this is not the case!

I now have an "instrumented" version of agdarsec on the parameters branch.
The details of the new parser are explained in the (documented!) modules:

So far I've only ported the examples by using a non-instrumented version of
the parser (i.e. the Instruments picked does not do anything interesting) but
you may want to have a look at the design.

I'm quite happy with the fact that the types in Text.Parser.Combinators are
left virtually unchanged. The only changes are:

  • all the parameters are now in a single record of type Parameters
  • there is an extra instance argument {{Instrumented P}} (used in anyTok)

(ping @clayrat)

It might be good to put a link/citation in the documentation for those not familiar with "instrumented" monads (myself 😄). Thanks for looking into this! I look forward to testing it out soon.

I made the term up based on instrumentation and have no idea if instrumented monads are "a thing©".