kevinmehall/rust-peg

Example of parsing with &[T] input


I'm using an external lexer to generate a vec of Token. A Token is a struct, not an enum. I'd like to generate a parser over the tokens.

I went through all of the examples and did not find one that shows how to do this. The closest is https://github.com/kevinmehall/rust-peg/blob/master/tests/run-pass/tokens.rs, but there Token is an enum, so it's easy to add matches like [Token::Open] to a rule.

What syntax do I use if my Token struct looks like this?


pub enum TokenType {
    Word,
    Number,
}

pub struct Token {
    pub token_type: TokenType,
    pub term: String,
    pub other_value_here: u32,
}

I can match based on TokenType alone, but I'm not sure what goes inside the brackets [].

First, the default implementation of the parsing traits for [T] expects T to be Copy, as in the [u8] or simple enum cases. Your struct contains a String, so it isn't Copy, and you need to implement these traits on a wrapper struct that hands the parser &Token instead (a shared reference is always Copy). If you already have some kind of container type for your tokens, you could implement the traits on that instead of adding a new wrapper.
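To make the Copy point concrete, here is a minimal plain-Rust sketch that does not use the peg crate. The names TokenSlice, elem_at and assert_copy are hypothetical and only for illustration; they are not the real peg trait methods, whose exact signatures depend on the peg version you use. The idea is just that a wrapper over a borrowed slice can hand out &Token values, which are Copy even though Token is not.

```rust
// Stand-ins for the types from the question (derives added for convenience).
#[derive(Debug, Clone, PartialEq)]
pub enum TokenType {
    Word,
    Number,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub token_type: TokenType,
    pub term: String,
    pub other_value_here: u32,
}

// Hypothetical wrapper around a borrowed token slice. In real code this is
// the type you would implement the peg parsing traits on.
pub struct TokenSlice<'a>(pub &'a [Token]);

impl<'a> TokenSlice<'a> {
    // Hypothetical helper, NOT a peg API: returns the element at `pos`
    // together with the next position, or None at the end of the input.
    pub fn elem_at(&self, pos: usize) -> Option<(&'a Token, usize)> {
        self.0.get(pos).map(|t| (t, pos + 1))
    }
}

// Accepts only Copy values. `assert_copy(&token)` compiles, while
// `assert_copy(token)` would not, because Token owns a String.
pub fn assert_copy<T: Copy>(_value: T) {}
```

The element type the parser sees here is &'a Token, so the Copy requirement is met without cloning any strings.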

The [] syntax works just like (and expands into) an arm of a match in regular Rust, so you can use a pattern that matches one field and ignores the rest:

[ Token { token_type: TokenType::Word, .. } ]

Or capture the token as a variable and then test it with an if guard, assuming TokenType has a PartialEq impl:

[t if t.token_type == TokenType::Word]
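Since the [] contents behave like a match arm, both forms can be tried out in ordinary Rust first. The sketch below is plain Rust, not peg grammar; the function names is_word_by_pattern and is_word_by_guard are made up for illustration, and the PartialEq derive on TokenType is the assumption mentioned above for the guard form.

```rust
// Stand-ins for the types from the question; PartialEq is needed only
// for the `==` comparison in the guard version.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenType {
    Word,
    Number,
}

pub struct Token {
    pub token_type: TokenType,
    pub term: String,
    pub other_value_here: u32,
}

// Struct pattern: match one field and ignore the rest with `..`.
pub fn is_word_by_pattern(t: &Token) -> bool {
    match t {
        Token { token_type: TokenType::Word, .. } => true,
        _ => false,
    }
}

// Binding plus `if` guard: capture the token, then test a field.
pub fn is_word_by_guard(t: &Token) -> bool {
    match t {
        t if t.token_type == TokenType::Word => true,
        _ => false,
    }
}
```

Whatever you would write before the `=>` in these arms is what goes between the brackets in the grammar.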

You could wrap this in a rule that accepts the TokenType as an argument:

rule tok(ty: TokenType) -> &'input Token = [t if t.token_type == ty]

rule number() = tok(TokenType::Number)
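If it helps to see what these two rules do, here is a rough hand-written equivalent in plain Rust. It is only a sketch of the behaviour under my own assumptions (a slice input, a position index, and Option for failure), not the code that peg generates; the signatures of tok and number below are hypothetical.

```rust
// Stand-ins for the types from the question.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenType {
    Word,
    Number,
}

#[derive(Debug, PartialEq)]
pub struct Token {
    pub token_type: TokenType,
    pub term: String,
    pub other_value_here: u32,
}

// Analogue of `rule tok(ty)`: succeed with the token at `pos` and the next
// position if its type equals `ty`; fail otherwise, including at end of input.
pub fn tok<'input>(
    input: &'input [Token],
    pos: usize,
    ty: TokenType,
) -> Option<(&'input Token, usize)> {
    match input.get(pos) {
        Some(t) if t.token_type == ty => Some((t, pos + 1)),
        _ => None,
    }
}

// Analogue of `rule number()`: same check, but the matched token is
// discarded and only the new position is returned.
pub fn number(input: &[Token], pos: usize) -> Option<usize> {
    tok(input, pos, TokenType::Number).map(|(_, next)| next)
}
```

Passing the TokenType as an argument keeps a single place where the field comparison is written, and each named rule becomes a one-liner.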