Peg, Parsing Expression Grammar, is an implementation of a Packrat parser generator. A Packrat parser is a descent recursive parser capable of backtracking. The generated parser searches for the correct parsing of the input.
For more information see:
This Go implementation is based on:
-inline
Tells the parser generator to inline parser rules.
-switch
Reduces the number of rules that have to be tried for some pegs.
If statements are replaced with switch statements.
First declare the package name:
package <package name>
Then declare the parser:
type <parser name> Peg {
<parser state variables>
}
Next declare the rules. The first rule is the entry point into the parser:
<rule name> <- <rule body>
The first rule should probably end with '!.' to indicate no more input follows:
first <- . !.
'.' means any character matches. For zero or more character matches use:
repetition <- .*
For one or more character matches use:
oneOrMore <- .+
For an optional character match use:
optional <- .?
If specific charaters are to be matched use single quotes:
specific <- 'a'* 'bc'+ 'de'?
will match the string "aaabcbcde".
For choosing between differnt inputs use alternates:
prioritized <- 'a' 'a'* / 'bc'+ / 'de'?
will match "aaaa" or "bcbc" or "de" or "". The matches are attempted in order.
If the characters are case insensitive use double quotes:
insensitive <- "abc"
will match "abc" or "Abc" or "ABc" etc...
For matching a set of characters use a character class:
class <- [a-z]
will watch "a" or "b" or all the way to "z".
For an inverse character class start with a tilde:
inverse <- [~a-z]
will match anything but "a" or "b" or all the way to "z"
If the character class is case insensitive use double brackets:
insensitive <- [[A-Z]]
Use parentheses for grouping:
grouping <- (rule1 / rule2) rule3
For looking ahead for a match (predicate) use:
lookAhead <- &rule1 rule2
For inverse look ahead use:
inverse <- !rule1 rule2
Use curly braces for Go code:
gocode <- { fmt.Println("hello world") }
For string captures use less than greater than:
capture <- <'capture'> { fmt.Println(buffer[begin:end]) }
Will print out "capture". The captured string is stored in buffer[begin:end].
- bootstrap/main.go: bootstrap syntax tree of peg
- peg.go: syntax tree and code generator
- main.go: bootstrap main
- peg.peg: peg in its own language
There should be no differences between the bootstrap and self compiled:
./peg -inline -switch peg.peg
diff bootstrap.peg.go peg.peg.go
Andrew Snodgrass