kevinmehall/rust-peg

parse panics when the matched choice is a superstring of another choice

rts-rob opened this issue · 3 comments

I have a grammar with the following (extremely) simplified schema:

monadicVerb = {
  "Cos" |
  "Cosh"
}

For context, proper invocations in the host language look as follows:

  Cos(3.14159)
  // or
  Cosh(0)

Parsing with parse() fails when the input contains Cosh(...) but succeeds when it contains Cos(...). This happens with identical arguments, even when the input is reduced to a single standalone statement with one constant argument.

This behavior also occurs with all other verbs in our grammar that are superstrings of another verb:

  • Collections => Collection
  • ContainsStrRegex => ContainsStr
  • Databases => Database
  • FindStrRegex => FindStr
  • Functions => Function
  • GTE => GT
  • Indexes => Index
  • LTE => LT
  • Logout => Log
  • Roles => Role
  • Singleton/Sinh => Sin
  • Tanh => Tan

I'm matching against the opening parenthesis as a workaround, e.g., Cosh( instead of Cos, so I'm not blocked.

The PEG choice operator (/) is ordered, which means that the order in which you define your alternatives matters. Superstrings should always be defined before substrings.
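To illustrate why the order matters, here is a minimal sketch in plain Rust that mimics ordered choice over string literals. It is not rust-peg's implementation; `ordered_choice` is a hypothetical helper written only for this example.

```rust
/// Hypothetical helper (not part of rust-peg): try each literal in order and
/// return the first one that is a prefix of `input`, plus the unconsumed rest.
/// Like PEG ordered choice, it commits to the first alternative that matches
/// and never tries the later ones.
fn ordered_choice<'a>(input: &'a str, alternatives: &[&'a str]) -> Option<(&'a str, &'a str)> {
    for &alt in alternatives {
        if let Some(rest) = input.strip_prefix(alt) {
            return Some((alt, rest));
        }
    }
    None
}
```

With the alternatives listed as `["Cos", "Cosh"]`, the input `Cosh(0)` matches `Cos` and leaves `h(0)` unconsumed, so whatever rule expects `(` next fails. With `["Cosh", "Cos"]`, the same input matches `Cosh` and leaves `(0)`.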

Could you show the grammar you're using? I assume you mean that it fails to match, not panics (unless you're .unwrap()ing the result, of course).

By default, rust-peg operates over characters without any concept of tokens, so yes, a string literal expression matches the characters of the string without regard for what may follow. If you want to look ahead and check that it's not followed by another identifier character, you have to explicitly do so with the ! operator.
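As a rough sketch of what that lookahead does, the plain-Rust function below matches a literal and then rejects the match if an identifier character follows. The names `keyword` and `is_ident_char`, and the choice of ASCII letters, digits, and `_` as identifier characters, are assumptions made for this example; they are not part of rust-peg.

```rust
/// Assumed definition of an identifier character for this example.
fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Hypothetical helper: match `word` at the start of `input`, but only if the
/// next character is not an identifier character. The check inspects the next
/// character without consuming it, which is what a `!` lookahead does.
/// Returns the unconsumed remainder on success.
fn keyword<'a>(input: &'a str, word: &str) -> Option<&'a str> {
    let rest = input.strip_prefix(word)?;
    match rest.chars().next() {
        Some(c) if is_ident_char(c) => None, // e.g. "Cos" followed by 'h'
        _ => Some(rest),
    }
}
```

With this check in place the order of alternatives no longer matters for these verbs: `keyword("Cosh(0)", "Cos")` is rejected because `h` follows, so a choice would fall through to the `Cosh` alternative.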

@Mingun @kevinmehall I should have mentioned that it was failing when I defined all the superstrings before all the substrings as well, e.g., Cosh before Cos.

I say "was failing" because, even though I saved that output, I can no longer reproduce it. 😮‍💨

May well be some sort of dependency caching issue on my machine. Appreciate the quick responses; I'm marking as closed since I cannot reproduce now. Will back out the workaround and reopen if it pops back up.