janet-lang/janet

Certain PEG patterns don't match

stutonk opened this issue · 2 comments

Janet version 1.34.0-7d3acc0e

I recently noticed that some PEGs that I believe should match do not. Specifically, this happens when a sequence has an any or some followed by a literal from the same class. My original motivating use case was matching a number that ends in a particular digit, like (peg/match '(* (some (range "09")) "3") "123453").

Simpler examples that fail to match include:
(peg/match '(* (any "a") "a") "a")
(peg/match '(* (some "a") "a") "aa")
(peg/match '(* (any "a") (some "a")) "a")
(peg/match '(* (some "a") (some "a")) "aa")

This is known and expected behavior. From the docs: "PEGs try to match an input text with a pattern in a greedy manner."

So any and some always match as far as they can and never give characters back, which leaves nothing for the more specific pattern that follows to match on.
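
To see the greediness directly, here is a small sketch using the digit example from the report. The capture is only added for illustration, to show what the repetition consumes before the trailing literal gets a turn:

```janet
# What does the repetition consume on its own? The whole input.
(peg/match '(capture (some (range "09"))) "123453")
# → @["123453"]

# By the time the literal "3" is tried, no input is left, and the
# PEG does not back up into the repetition to retry with fewer digits.
(peg/match '(* (some (range "09")) "3") "123453")
# → nil
```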

There are ways to do what you're trying to do; it's just more complicated...
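
For instance, here are two rough sketches of possible workarounds for the "number ending in 3" case. Neither is lifted from the docs, and the names ends-in-3 and ends-in-3-rec are made up for illustration. The first guards the repetition with a lookahead so it stops before the final "3"; the second uses a recursive rule, so ordered choice supplies the backtracking that any and some do not.

```janet
# Option 1: consume a digit only when we are NOT sitting on the final "3",
# i.e. a "3" that is not followed by another digit.
(def ends-in-3
  (peg/compile
    '(* (any (if-not (* "3" (not (range "09"))) (range "09")))
        "3")))

(peg/match ends-in-3 "123453") # → @[]  (a match with no captures)
(peg/match ends-in-3 "123454") # → nil

# Option 2: a recursive rule. :num first tries "digit, then more :num";
# only if that fails does it fall back to matching the literal "3" here.
# The -1 in :main requires the match to end at the end of the input.
(def ends-in-3-rec
  (peg/compile
    '{:digit (range "09")
      :num (+ (* :digit :num) "3")
      :main (* :num -1)}))

(peg/match ends-in-3-rec "123453") # → @[]
(peg/match ends-in-3-rec "123454") # → nil
```

Note that both sketches also accept a bare "3", unlike the original (some ...) form, which needs at least one digit before the final one.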

Ah, my mistake from not catching that detail in the docs. Thank you!