This is a simple parser generator based on the Parsing Expression Grammar formalism.
Add to your Cargo.toml:
[dependencies]
peg = "0.1.0"
Add to your crate root:
#![feature(plugin)]
#![plugin(peg_syntax_ext)]
Use peg_file! modname("mygrammarfile.rustpeg"); to include the grammar from an external file. The macro expands into a module called modname with functions corresponding to the #[pub] rules in your grammar.
Or, use
peg! modname(r#"
// grammar rules here
"#);
to embed a short PEG grammar inline in your Rust source file.
Run peg input_file.rustpeg to compile a grammar and generate Rust code on stdout.
use super::name;
The grammar may begin with a series of use declarations, just like in Rust, which are included in the generated module. Since the grammar is in its own module, you must add use super::StructName; to access a structure from the parent module.
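The need for use super:: follows from ordinary Rust module scoping. Here is a minimal hand-written sketch, in which the module grammar merely stands in for the module the macro would generate. The names Point, grammar, and origin are hypothetical and are not part of peg.

```rust
// A structure defined in the parent module (hypothetical name).
#[derive(Debug, PartialEq)]
pub struct Point {
    pub x: i64,
    pub y: i64,
}

// Hand-written stand-in for a generated grammar module (an assumption for
// illustration; the real generated code looks different).
mod grammar {
    // Without this declaration, `Point` is not in scope inside the module.
    use super::Point;

    // Hypothetical function playing the role of a rule's action code.
    pub fn origin() -> Point {
        Point { x: 0, y: 0 }
    }
}
```

Calling grammar::origin() from the parent module returns a Point. Removing the use super::Point; line makes the module fail to compile, which is the situation the paragraph above describes.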
#[pub]
rule_name -> type
= expression
If a rule is marked with #[pub], the generated module has a public function that begins parsing at that rule.
- . - Match any single character
- "literal" - Match a literal string
- [a-z] - Match a single character from a set
- [^a-z] - Match a single character not in a set
- rule - Match a production defined elsewhere in the grammar and return its result
- expression* - Match zero or more repetitions of expression and return the results as a Vec
- expression+ - Match one or more repetitions of expression and return the results as a Vec
- expression? - Match zero or one repetition of expression and return the result as an Option
- &expression - Match only if expression matches at this position, without consuming any characters
- !expression - Match only if expression does not match at this position, without consuming any characters
- expression ** delim - Match zero or more repetitions of expression delimited with delim and return the results as a Vec
- expression ++ delim - Match one or more repetitions of expression delimited with delim and return the results as a Vec
- e1 / e2 / e3 - Try to match e1. If the match succeeds, return its result, otherwise try e2, and so on.
- e1 e2 e3 - Match expressions in sequence
- a:e1 b:e2 c:e3 { rust } - Match e1, e2, e3 in sequence. If they match successfully, run the Rust code in the action and return its result. The variables before the colons in the preceding sequence are bound to the results of the corresponding expressions
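The semantics of a few of these operators can be sketched by hand in plain Rust. The code below is an illustration only, not the code peg generates. The function names (literal, digit, choice, not_literal, numbers) and the convention of passing a byte position and returning the new position are assumptions made for this sketch.

```rust
/// "literal": match a literal string at `pos`, returning the position after it.
fn literal(input: &str, pos: usize, lit: &str) -> Option<usize> {
    if input[pos..].starts_with(lit) {
        Some(pos + lit.len())
    } else {
        None
    }
}

/// [0-9]: match a single character from a set.
fn digit(input: &str, pos: usize) -> Option<usize> {
    match input[pos..].chars().next() {
        Some(c) if c.is_ascii_digit() => Some(pos + c.len_utf8()),
        _ => None,
    }
}

/// e1 / e2: ordered choice. `e2` is tried only if `e1` fails, so the first
/// alternative wins even when a later one would match more input.
fn choice(input: &str, pos: usize, e1: &str, e2: &str) -> Option<usize> {
    literal(input, pos, e1).or_else(|| literal(input, pos, e2))
}

/// !expression: succeed only if `lit` does not match; never consumes input.
fn not_literal(input: &str, pos: usize, lit: &str) -> Option<usize> {
    match literal(input, pos, lit) {
        Some(_) => None,
        None => Some(pos),
    }
}

/// [0-9]+ ** ",": zero or more digit runs delimited by commas, as a Vec.
fn numbers(input: &str, mut pos: usize) -> (Vec<&str>, usize) {
    let mut results = Vec::new();
    loop {
        // A delimiter is required before every element except the first.
        let start = if results.is_empty() {
            pos
        } else {
            match literal(input, pos, ",") {
                Some(p) => p,
                None => break,
            }
        };
        // [0-9]+: consume as many digits as possible.
        let mut end = start;
        while let Some(p) = digit(input, end) {
            end = p;
        }
        if end == start {
            break; // no element here; `pos` stays before any dangling delimiter
        }
        results.push(&input[start..end]);
        pos = end;
    }
    (results, pos)
}
```

In this sketch, choice("ab", 0, "a", "ab") stops after the first character because the first alternative already matched, and numbers("1,", 0) returns one element while leaving the trailing comma unconsumed.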
Match actions can extract data from the match using these variables:
- match_str - the matched string, as a &str slice. Examples:
name -> String
= [a-zA-Z0-9_]+ { match_str.to_string() }
number -> u64
= [0-9]+ { match_str.parse::<u64>().unwrap() }
- start_pos - the index into the string at which the match starts, inclusive
- pos - the index into the string at which the match ends, exclusive
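How the three variables relate can be shown with a hand-written matcher for a rule like [0-9]+. This is only a sketch: the function number and its signature are hypothetical, and the code peg generates is not shown here.

```rust
/// Match one or more ASCII digits beginning at `start_pos`.
/// Returns the parsed value and `pos`, the exclusive end of the match.
fn number(input: &str, start_pos: usize) -> Option<(u64, usize)> {
    // Count the digits that follow `start_pos`.
    let digits = input[start_pos..]
        .bytes()
        .take_while(|b| b.is_ascii_digit())
        .count();
    if digits == 0 {
        return None; // `[0-9]+` requires at least one digit
    }
    // `start_pos` is inclusive and `pos` is exclusive, so the matched
    // text is exactly the slice between them.
    let pos = start_pos + digits;
    let match_str = &input[start_pos..pos];
    // Plays the role of the action block: convert the matched text to a
    // number. Returns None instead of panicking if the value overflows u64.
    let value = match_str.parse::<u64>().ok()?;
    Some((value, pos))
}
```

For the input "ab123cd" with start_pos 2, match_str is "123" and pos is 5, so pos - start_pos equals the length of the match.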
To do:
- Improve parse error reporting
- Memoization
- Support passing user-specified objects (e.g. filename for source mapping, string interner) into action code