kevinmehall/rust-peg

Support binding token literals in rules

cstrahan-blueshift opened this issue · 1 comment

The tokens I'm using in my grammar include info like text range, line, column, etc. It would be nice if I could match some of these tokens via the literal syntax and bind the resulting token to a variable for use within the rule's action expression.

Something like:

rule foo() -> u32
    = t:"blah" { t.line }

but when I try something like the above I get "cannot find value `t` in this scope".

I can work around this in a couple of ways right now:

// I can use a rule like this:
rule nt() -> Token<'input> = ##next_token()

// using the undocumented ##member() syntax, with a suitable implementation on my impl Parser type:

impl<'input> TokenSlice<'input> {
    pub fn next_token(&self, pos: usize) -> RuleResult<Token<'input>> {
        match self.tokens.get(pos) {
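            // Deliberately return `pos` unchanged: this only peeks at the token,
            // so a literal placed after nt() in a rule still consumes it.
            Some(t) => RuleResult::Matched(pos, *t),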
            _ => RuleResult::Failed,
        }
    }
}

// and now I can write:
rule foo() -> u32
    = t:nt() "blah" { t.line }

(the above is what I'm using for now)
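To show why this works, here is a rough, hand-written sketch of what `t:nt() "blah" { t.line }` amounts to. It does not use the peg crate: `RuleResult`, `Token`, and `TokenSlice` are simplified local stand-ins, `match_literal` is a hypothetical helper standing in for whatever the input type does for `"blah"`, and the `text` field is an assumption. The point is that `next_token` leaves the position alone, so the literal that follows is what actually consumes the token.

```rust
// Local stand-ins for illustration only; the real RuleResult lives in the peg crate.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RuleResult<T> {
    Matched(usize, T),
    Failed,
}

// Assumed token shape: just the text and the line it came from.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Token<'input> {
    text: &'input str,
    line: u32,
}

struct TokenSlice<'input> {
    tokens: &'input [Token<'input>],
}

impl<'input> TokenSlice<'input> {
    /// Peek at the token at `pos` without consuming it.
    fn next_token(&self, pos: usize) -> RuleResult<Token<'input>> {
        match self.tokens.get(pos) {
            Some(t) => RuleResult::Matched(pos, *t),
            None => RuleResult::Failed,
        }
    }

    /// Hypothetical literal matcher: consumes one token if its text equals `literal`.
    fn match_literal(&self, pos: usize, literal: &str) -> RuleResult<()> {
        match self.tokens.get(pos) {
            Some(t) if t.text == literal => RuleResult::Matched(pos + 1, ()),
            _ => RuleResult::Failed,
        }
    }
}

/// Hand-written equivalent of `t:nt() "blah" { t.line }`.
fn foo(input: &TokenSlice<'_>, pos: usize) -> RuleResult<u32> {
    // Step 1: bind the token, position unchanged.
    let (pos, t) = match input.next_token(pos) {
        RuleResult::Matched(p, t) => (p, t),
        RuleResult::Failed => return RuleResult::Failed,
    };
    // Step 2: the literal checks the same token and advances past it.
    match input.match_literal(pos, "blah") {
        RuleResult::Matched(p, ()) => RuleResult::Matched(p, t.line),
        RuleResult::Failed => RuleResult::Failed,
    }
}

fn main() {
    let tokens = [
        Token { text: "blah", line: 7 },
        Token { text: "other", line: 8 },
    ];
    let input = TokenSlice { tokens: &tokens };
    assert_eq!(foo(&input, 0), RuleResult::Matched(1, 7));
    assert_eq!(foo(&input, 1), RuleResult::Failed);
}
```

If `next_token` advanced the position instead, the literal would be tested against the following token and the rule would fail or bind the wrong one.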

or

// I could create rules for each token kind:
rule BLAH() -> Token<'input>
    = token:[Token{ kind: TokenKind::Blah, .. }] { token }

rule foo() -> u32
    = t:BLAH() { t.line }

// But that gets tedious as I add new tokens, and I'd rather use the "blah" literal syntax instead of BLAH()

Another option is to use rule arguments:

rule tok(expected_kind: TokenKind) -> Token<'input>
    = token:[Token{ kind, .. } if kind == expected_kind] { token }

rule foo() -> u32
    = t:tok(TokenKind::Blah) { t.line }

But sure, I think it would make sense to add an associated type to the ParseLiteral trait and use it in the return type of parse_str_literal. I'd accept a PR for that for version 0.9 (since it would be a breaking change for anyone who's implemented that trait).
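For anyone picking this up, a minimal sketch of what that change might look like, written without the peg crate so it compiles on its own. Everything here is an assumption for illustration: the trait is a local stand-in rather than peg's actual definition, the associated type name `Literal` is made up, the method is named `parse_str_literal` as in the comment above, and `Token` with its `text` field is hypothetical.

```rust
// Local stand-in for peg's RuleResult, for illustration only.
#[derive(Debug, PartialEq)]
enum RuleResult<T> {
    Matched(usize, T),
    Failed,
}

// Hypothetical shape of the proposed trait: the literal match yields a value
// whose type is chosen by the input type.
trait ParseLiteral {
    type Literal;
    fn parse_str_literal(&self, pos: usize, literal: &str) -> RuleResult<Self::Literal>;
}

// String input keeps the current behaviour: a literal yields `()`.
impl ParseLiteral for str {
    type Literal = ();
    fn parse_str_literal(&self, pos: usize, literal: &str) -> RuleResult<()> {
        let end = pos + literal.len();
        if self.as_bytes().get(pos..end) == Some(literal.as_bytes()) {
            RuleResult::Matched(end, ())
        } else {
            RuleResult::Failed
        }
    }
}

// Assumed token shape for the example.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Token<'a> {
    text: &'a str,
    line: u32,
}

// Token input can hand back the matched token, which is what
// `t:"blah" { t.line }` would bind.
impl<'a> ParseLiteral for [Token<'a>] {
    type Literal = Token<'a>;
    fn parse_str_literal(&self, pos: usize, literal: &str) -> RuleResult<Token<'a>> {
        match self.get(pos) {
            Some(t) if t.text == literal => RuleResult::Matched(pos + 1, *t),
            _ => RuleResult::Failed,
        }
    }
}

fn main() {
    let tokens = [Token { text: "blah", line: 3 }];
    match tokens[..].parse_str_literal(0, "blah") {
        RuleResult::Matched(next, t) => assert_eq!((next, t.line), (1, 3)),
        RuleResult::Failed => unreachable!("the first token is \"blah\""),
    }
    assert_eq!("blah blah".parse_str_literal(5, "blah"), RuleResult::Matched(9, ()));
}
```

Existing implementors would need to add an associated type (for example `()`), which is where the breaking change comes from.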