Multiple errors

Question

jamesdbrock opened this issue 3 years ago · 7 comments

Should we have multiple errors like in Megaparsec? https://hackage.haskell.org/package/megaparsec/docs/Text-Megaparsec-Error.html#t:ParseErrorBundle

Answer 1 · 2023-04-04T12:24:48.000Z

I would start this with an "IMO" ("In My Opinion") but I think this is more than an opinion...

Parse errors without context simply do not carry enough information, especially in recursive parsing, where the nesting depth is arbitrary and a single flat error can represent less and less of the information the user needs.

Also, the systems for bundling, skipping, and recovery do not have to be the same. I propose something that lets the user fully handle multiple-error parsing while keeping the library simple and easy to use:

-- new combinators at the minimum
recoverWith :: ParserT s m a -> ParserT s m Unit -> ParserT s m (Either ParseError a)
bundleErr :: ParserT s m (Either ParseError a) -> ParserT s m a

-- some potentially easier way to combine errors, fail if any not present, list all errors
-- The binding of (<*>) with the error combination of (<|>)
(<*?>) :: Either [e] (a -> b) -> Either [e] a -> Either [e] b

-- type change
data ParseError
    = ParseError String Position
    | ParseContext String Position ParseError -- +
    | ParseBundle String Position [ParseError] -- +

This will only break projects that pattern-match on ParseError. That will be some projects, but even in that case it's not going to be a large refactor.
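To make the intent of `(<*?>)` concrete, here is a minimal sketch, assuming it is meant to behave like `apply` on `Either` except that errors from both sides are kept. The names `apAccum`, `mkPerson`, `checkName`, and `checkAge` are hypothetical and exist only for illustration; none of this is part of the parsing library.

```purescript
module AccumSketch where

import Prelude

import Data.Either (Either(..))

-- Hypothetical helper: like `apply` for Either, but when both sides fail
-- it keeps the errors from both sides instead of only the first.
apAccum
  :: forall e a b
   . Either (Array e) (a -> b)
  -> Either (Array e) a
  -> Either (Array e) b
apAccum (Left es1) (Left es2) = Left (es1 <> es2)
apAccum (Left es) (Right _) = Left es
apAccum (Right _) (Left es) = Left es
apAccum (Right f) (Right x) = Right (f x)

infixl 4 apAccum as <*?>

type Person = { name :: String, age :: Int }

mkPerson :: String -> Int -> Person
mkPerson name age = { name, age }

checkName :: String -> Either (Array String) String
checkName "" = Left [ "name is empty" ]
checkName n = Right n

checkAge :: Int -> Either (Array String) Int
checkAge n
  | n < 0 = Left [ "age is negative" ]
  | otherwise = Right n

-- Both checks run, and both messages are reported:
-- Left [ "name is empty", "age is negative" ]
bothFail :: Either (Array String) Person
bothFail = Right mkPerson <*?> checkName "" <*?> checkAge (-1)

-- With valid inputs the result is the assembled record.
allGood :: Either (Array String) Person
allGood = Right mkPerson <*?> checkName "Ada" <*?> checkAge 36
```

Unlike the ordinary `Either` applicative, which stops at the first `Left`, this version reports every failure in one pass, which is the behavior a multiple-error API needs.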

Answer 2 · 2023-04-04T12:33:53.000Z

I'm all for better error messages. However, don't better errors typically mean slower parsing?

Answer 3 · 2023-04-04T12:45:44.000Z

> I'm all for better error messages. However, don't better errors typically mean slower parsing?

Not at all! ParseContext just wraps all errors in a section, which is an incredibly cheap action;
bundleErr just raises the error if it's present and otherwise succeeds;
and recoverWith will only fire the recovery parser on a failure. As for the recovery parser, these things are usually incredibly simple, aggressively so.

This would not cause any noticeable performance hit. 👍
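The three operations described above can be sketched on a toy parser to show how little work each one does. This is only an illustration under stated assumptions: `Parser`, `State`, `runParser`, `succeed`, `failWith`, and `withContext` are hypothetical stand-ins, positions are plain `Int` offsets rather than the library's `Position`, and recovery restarts from the state where the failed parser began, which a real implementation might not do.

```purescript
module RecoverSketch where

import Prelude

import Data.Either (Either(..))
import Data.Tuple (Tuple(..), fst)

-- Simplified stand-in for the proposed type; Int offsets replace Position.
data ParseError
  = ParseError String Int
  | ParseContext String Int ParseError
  | ParseBundle String Int (Array ParseError)

derive instance eqParseError :: Eq ParseError

type State = { input :: String, pos :: Int }

-- Toy parser for illustration only; this is NOT the library's ParserT.
newtype Parser a = Parser (State -> Either ParseError (Tuple a State))

runParser :: forall a. Parser a -> String -> Either ParseError a
runParser (Parser p) input = map fst (p { input, pos: 0 })

succeed :: forall a. a -> Parser a
succeed a = Parser \s -> Right (Tuple a s)

failWith :: forall a. String -> Parser a
failWith msg = Parser \s -> Left (ParseError msg s.pos)

-- Wrap any failure of `p` in a labelled context: one extra constructor.
withContext :: forall a. String -> Parser a -> Parser a
withContext label (Parser p) = Parser \s ->
  case p s of
    Left err -> Left (ParseContext label s.pos err)
    ok -> ok

-- Run `p`; only if it fails, run the recovery parser and hand the error
-- back as a value so that parsing can continue.
recoverWith :: forall a. Parser a -> Parser Unit -> Parser (Either ParseError a)
recoverWith (Parser p) (Parser skip) = Parser \s ->
  case p s of
    Right (Tuple a s') -> Right (Tuple (Right a) s')
    Left err -> case skip s of
      Right (Tuple _ s') -> Right (Tuple (Left err) s')
      Left skipErr -> Left skipErr

-- Turn a recovered error back into a real failure.
bundleErr :: forall a. Parser (Either ParseError a) -> Parser a
bundleErr (Parser p) = Parser \s ->
  case p s of
    Right (Tuple (Right a) s') -> Right (Tuple a s')
    Right (Tuple (Left err) _) -> Left err
    Left err -> Left err
```

For example, `runParser (bundleErr (recoverWith (failWith "boom") (succeed unit))) ""` evaluates to `Left (ParseError "boom" 0)`, and on the success path `recoverWith` never touches the recovery parser at all.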

Answer 4 · 2023-04-04T12:49:32.000Z

No offense but I'd be more convinced by a benchmark than a claim like above.

Answer 5 · 2023-04-04T12:59:29.000Z

I do not know how better to tell you that these operations, which do simple manipulations on the ParseError type, are trivial from a performance perspective. The only thing that affects performance is recoverWith, however:

It changes the behavior, so if that tradeoff exists it can be justified.
It will only fire if the original parser errors.
These parsers are, by design, incredibly simple, and that usually means performant (e.g. skip to the next ; at the same {} level).
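As a rough illustration of how simple such a recovery step can be, the "next ; at the same {} level" rule can be written as a single linear scan. The function name `skipStatement` is hypothetical, and the sketch deliberately ignores string literals and comments, which a real recovery parser would probably need to handle.

```purescript
module SkipSketch where

import Prelude

import Data.Maybe (Maybe(..))
import Data.String.CodeUnits (charAt)

-- Hypothetical recovery scan: starting at offset `start`, return the offset
-- just past the next ';' at the same brace depth. Returns Nothing if the
-- input ends, or the enclosing block closes, before such a ';' is found.
skipStatement :: String -> Int -> Maybe Int
skipStatement input start = go start 0
  where
  go :: Int -> Int -> Maybe Int
  go i depth = case charAt i input of
    Nothing -> Nothing
    Just ';'
      | depth == 0 -> Just (i + 1)
      | otherwise -> go (i + 1) depth
    Just '{' -> go (i + 1) (depth + 1)
    Just '}'
      | depth == 0 -> Nothing
      | otherwise -> go (i + 1) (depth - 1)
    Just _ -> go (i + 1) depth
```

For the input `"a { b; } c; d"` starting at offset 0, the `;` inside the braces is skipped because it sits one level deeper, and the result is `Just 11`, the offset just past the second `;`.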

Answer 6 · 2023-04-04T13:37:25.000Z

If you want to try to write a multiple-error feature @Violet-Codes then I would like to try to add that feature to parsing. But I don't want to waste your time so let's try to make sure that what you plan to implement has a good chance of getting merged.

Like Jordan says, we have benchmarks for parsing, and we'll probably need to add some more benchmarks to make sure that the new error machinery doesn't cause too much slowdown.

We also would like to make sure that the new error machinery is simple for the simple case, like you said, for

people who are encountering monadic combinator parsing for the first time
people who will have to upgrade to the breaking change

Can you point me to an example of a library (or a paper or a blog post) which you think describes multiple errors correctly? Maybe Megaparsec, or something else?

Another thing to think about... maybe if we're going to improve the ParseError type then we should improve the Indents module at the same time? #172

Answer 7 · 2023-04-06T08:37:06.000Z

Also, if we're going to do multiple errors, then maybe we should do lazy error messages at the same time; see #158.