Beware!
All the important features of this project (and many other useful ones) are now supported by Marpa::R2's Scanless interface (SLIF). The rationale and todos are completely outdated/irrelevant; the code, however, should still work thanks to Marpa::R2's excellent backward compatibility.
So this repo is left here for purely illustrative/archive purposes. What exactly it illustrates is up to the reader. :) The author explicitly declares this code public domain for all intents and purposes.
MarpaX-Parse, MarpaX::Tool-to-be
Parts of this module will be refactored out into individual modules and probably distros as MarpaX::Tool::*, if they prove to be useful enough.
What It Is
This module aims at serving as a simple and powerful parsing interface to Marpa::R2 so that a user can:
- set the 'rules' argument of Marpa::R2::Grammar to a string containing a BNF or EBNF grammar (which may embed %{ ... %} actions),
- call the parse method on the input and receive the value produced by the Marpa::R2 evaluator based on embedded %{ ... %} actions in the textual grammar or on closures (sub { ... }, rather than semantic action names) set in Marpa::R2 rules,
- have literals extracted from the textual grammar or Marpa::R2 rules and set up as regexes for lexer rules to tokenize the input for the recognizer,
- set default_action to 'tree', 'xml', 'sexpr', 'AoA', or 'HoA' to have parse return a parse tree (Tree::Simple, XML string, S-expression string, array of arrays, or hash of arrays, respectively),
- call show_parse_tree($format) to view the parse tree as a text dump, HTML or formatted XML,
- use Tree::Simple::traverse, Tree::Simple::Visitor or XML::Twig to traverse the relevant parse trees and gain results.
Input can be a string or a reference to an array of tokens ([ $type, $value ] refs).
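To make the token-array input shape concrete, here is a minimal sketch of a regex-driven lexer that turns a string into a reference to an array of [ $type, $value ] refs. It is not this module's own lexer: the tokenize function and the rule table passed to it are hypothetical names used for illustration only, and the rules are hard-coded rather than extracted from a grammar.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a string into a reference to an array of
# [ $type, $value ] tokens, trying each [ $type, $regex ] rule in order.
sub tokenize {
    my ( $input, $rules ) = @_;
    my @tokens;
    pos($input) = 0;
    TOKEN: while ( pos($input) < length($input) ) {
        next TOKEN if $input =~ /\G\s+/gc;    # skip whitespace
        for my $rule ( @{$rules} ) {
            my ( $type, $regex ) = @{$rule};
            if ( $input =~ /\G($regex)/gc ) {
                push @tokens, [ $type, $1 ];
                next TOKEN;
            }
        }
        die 'No lexer rule matches at position ' . pos($input) . "\n";
    }
    return \@tokens;
}

# Illustrative rule table (assumed, not taken from the module).
my $lexer_rules = [
    [ NUMBER => qr/\d+/ ],
    [ PLUS   => qr/\+/ ],
    [ STAR   => qr/\*/ ],
];

my $tokens = tokenize( '2 + 3*4', $lexer_rules );
print join( ' ', map { "$_->[0]($_->[1])" } @{$tokens} ), "\n";
```

The resulting array reference has the same shape as the token input described above, so it could be handed over in place of the raw string.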
Ambiguous tokens can be defined by setting the input array item(s) to [ [ $type1, $value ], [ $type2, $value ] ] ... and will be handled with the alternative()/earleme_complete() input model.
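The sketch below shows one way such an input array could be walked, assuming a recognizer object that provides alternative() and earleme_complete() as Marpa::R2::Recognizer does. The feed_tokens function and the Recording::Recce stand-in class are hypothetical; the stand-in only records the calls so that the example runs without Marpa::R2 installed, and this is not necessarily how the module itself feeds tokens.

```perl
use strict;
use warnings;

# Hypothetical helper: feed a token array to a recognizer, one earleme
# per input item. $recce is assumed to offer alternative() and
# earleme_complete() in the style of Marpa::R2::Recognizer.
sub feed_tokens {
    my ( $recce, $tokens ) = @_;
    for my $item ( @{$tokens} ) {
        # An ambiguous item is an array of [ $type, $value ] pairs;
        # an unambiguous item is a single [ $type, $value ] pair.
        my @alternatives = ref( $item->[0] ) eq 'ARRAY' ? @{$item} : ($item);
        for my $alt (@alternatives) {
            my ( $type, $value ) = @{$alt};
            $recce->alternative( $type, \$value, 1 );    # value by reference, length 1
        }
        $recce->earleme_complete();    # close the current input position
    }
    return;
}

# Stand-in recognizer (hypothetical) that just records the calls made.
package Recording::Recce;

sub new {
    my ($class) = @_;
    return bless( { calls => [] }, $class );
}

sub alternative {
    my ( $self, $type, $value_ref, $length ) = @_;
    push @{ $self->{calls} }, "alternative($type, ${$value_ref})";
    return 1;
}

sub earleme_complete {
    my ($self) = @_;
    push @{ $self->{calls} }, 'earleme_complete';
    return 1;
}

package main;

my $recce = Recording::Recce->new;
feed_tokens(
    $recce,
    [
        [ N => 'fruit' ],
        [ [ N => 'flies' ], [ V => 'flies' ] ],    # ambiguous: noun or verb
    ]
);
print "$_\n" for @{ $recce->{calls} };
```

All readings of an ambiguous item are offered at the same input position before earleme_complete() moves on, which is what lets the parser keep every interpretation alive.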
Feature => Test(s)
Marpa::R2::Grammar rules transforms to handle quantified (?|*|+) symbols
Extraction of closures and lexer regexes from Marpa::R2::Grammar rules
An example from the Parse::RecDescent tutorial, done the Marpa way
A BNF grammar with actions that can parse a possibly signed decimal number
A BNF grammar that can parse a BNF grammar that can parse a decimal number
An example from the Parse::RecDescent tutorial done in textual BNF with embedded actions
Parse trees generation and traversal
Comparison of parse tree evaluation
Parsing the sentence 'time flies like an arrow, but fruit flies like a banana', getting part-of-speech data from WordNet::QueryData (if installed) or a pre-set hash ref (otherwise)
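As an illustration of the first feature in the list, the following sketch rewrites a rule whose right-hand side carries ?, * or + suffixes into plain Marpa::R2 rule descriptors. It is one possible approach, not necessarily the transform this module applies: optional symbols are expanded into alternative rules, while * and + become helper sequence rules using the min key that Marpa::R2::Grammar accepts. The function name expand_quantified and the _star/_plus helper symbol names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: expand ($lhs, @rhs) with quantified RHS symbols
# into a list of Marpa::R2-style rule descriptors.
sub expand_quantified {
    my ( $lhs, @rhs ) = @_;
    my %helper;                 # helper sequence rules for * and +
    my @variants = ( [] );      # RHS variants built so far
    for my $symbol (@rhs) {
        my ( $name, $quant ) = $symbol =~ /\A(.+?)([?*+]?)\z/;
        if ( $quant eq '?' ) {
            # Optional symbol: every variant forks into with/without.
            @variants = map { ( [ @{$_}, $name ], [ @{$_} ] ) } @variants;
            next;
        }
        if ( $quant eq '*' or $quant eq '+' ) {
            # Repetition: delegate to a sequence rule with min 0 or 1.
            my $min = $quant eq '*' ? 0 : 1;
            my $seq = $name . ( $min ? '_plus' : '_star' );
            $helper{$seq} = { lhs => $seq, rhs => [$name], min => $min };
            $name = $seq;
        }
        push @{$_}, $name for @variants;
    }
    return ( ( map { [ $lhs => $_ ] } @variants ),
        @helper{ sort keys %helper } );
}

# number ::= sign? digit+
for my $rule ( expand_quantified( 'number', 'sign?', 'digit+' ) ) {
    if ( ref($rule) eq 'ARRAY' ) {
        print "$rule->[0] ::= @{ $rule->[1] }\n";
    }
    else {
        print "$rule->{lhs} ::= $rule->{rhs}[0] (min => $rule->{min})\n";
    }
}
```

Expanding ? this way doubles the number of rules for each optional symbol, so it is only practical for rules with a handful of optional symbols.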
Pre-requisites:
Core (closures/lexer regexes extraction, quantified symbols, textual BNF with embedded actions, see test cases 02-07, 08 for details)
Marpa::R2
Clone
Eval::Closure
Math::Combinatorics
Parse Trees (set default_action to 'xml', 'tree', 'sexpr' or 'AoA' to have XML string, Tree::Simple, S-expression or array of arrays parse trees, respectively; use show_parse_tree("text" or "html") to view Tree::Simple parse trees as text or HTML, see test cases 10, 11 and 13 for details)
Data::TreeDumper
Tree::Simple
Tree::Simple::Visitor
Tree::Simple::View
XML::Twig