mrkkrp/megaparsec

Why does `label` act only on the first set of hints?

j-mie6 opened this issue · 2 comments

When you have a parser like:

label name (optional p *> optional q) *> r

If r fails, the hints from p and q are usually reported as possible expected tokens. However, in this case ps hints are relabelled to name, but qs are not. This is clearly the expected behaviour from the library as implemented explicitly by refreshLastHints. What I'm wondering, however, is what the rationale was behind this design choice?

zuqq commented

This is not exactly the same situation as in your example, but I recently also encountered a case where label was less effective than desired.

Running

{-# LANGUAGE OverloadedStrings #-}

import Data.Void (Void)
import Text.Megaparsec (Parsec, errorBundlePretty, label, parse)
import Text.Megaparsec.Char (space)

main :: IO ()
main = do
    let s :: Parsec Void String ()
        s = label "spaces" space

    let p :: Parsec Void String String
        p = label "<p>" ("A" <* s)

    case parse (p *> p) mempty "A B" of
        Left e -> putStr (errorBundlePretty e)
        Right _ -> error "Unexpected `Right _`."

against 2a46074 prints the built-in label of space in the list of hints, even though the definition of s looks like it should override this label:

1:3:
  |
1 | A B
  |   ^
unexpected 'B'
expecting <p> or white space

Not knowing how labels are implemented in megaparsec, I found this behavior surprising; it is, however, explained in the documentation of label:

-- | The parser @'label' name p@ behaves as parser @p@, but whenever the
-- parser @p@ fails /without consuming any input/, it replaces names of
-- “expected” tokens with the name @name@.
label :: String -> m a -> m a

Since s succeeds, its label is disregarded.

Hey @j-mie6, that's a good catch. I think there is really no reason why we shouldn't modify both p and q in this case. I have opened a PR that implements that.