Certain PEG patterns don't match
stutonk opened this issue · 2 comments
Janet version 1.34.0-7d3acc0e
Recently noticed that some PEGs that I believe should match do not. Specifically, this happens when a sequence contains an `any` or `some` followed by a literal from the same class. My original motivating use case was matching a number that ends in a particular digit, like `(peg/match '(* (some (range "09")) "3") "123453")`.
Simpler examples that fail to match include:
(peg/match '(* (any "a") "a") "a")
(peg/match '(* (some "a") "a") "aa")
(peg/match '(* (any "a") (some "a")) "a")
(peg/match '(* (some "a") (some "a")) "aa")
This is known and expected behavior. From the docs: "PEGs try to match an input text with a pattern in a greedy manner." So `any` and `some` will always greedily match as far as they're able to, and they never backtrack to give characters back, leaving nothing for your more specific pattern to match on.
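To make the greediness visible, here is a small sketch that captures what `any` consumes before the trailing literal gets a turn. The variable names `consumed` and `result` are only for illustration.

```janet
# Capture everything (any "a") consumes: it takes the whole run of a's.
(def consumed (peg/match '(<- (any "a")) "aaa"))
(pp consumed) # prints @["aaa"]

# With a trailing literal, (any "a") has already eaten every "a",
# so the final "a" has nothing left to match and the whole PEG fails.
(def result (peg/match '(* (any "a") "a") "aaa"))
(pp result) # prints nil
```

Unlike a regex `a*a`, the PEG does not retry `any` with fewer repetitions once the literal fails.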
There are ways to do what you're trying to do, it's just more complicated...
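One possible approach, as a rough sketch rather than the only or recommended way: use the `if` combinator as a lookahead so the loop only consumes a digit when at least one more digit follows it. That leaves the final digit of the run for the literal. The name `ends-in-3` is made up for this example.

```janet
# Hypothetical helper: matches a run of digits whose last digit is "3".
# (if cond patt) matches patt only when cond also matches at the same spot,
# so each iteration consumes one character only if two digits lie ahead.
(def ends-in-3
  (peg/compile
    '(* (any (if (* (range "09") (range "09")) 1)) "3")))

(pp (peg/match ends-in-3 "123453")) # prints @[] (matched, no captures)
(pp (peg/match ends-in-3 "123454")) # prints nil (last digit is not 3)
```

The loop stops one digit short of the end of the run, so `"3"` is tested against exactly the last digit.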
Ah, my mistake from not catching that detail in the docs. Thank you!