/re-dfa-pm

Perl module re::DFA

Primary LanguagePerl

Description

    This is a study project for regular expression implementation.
    Only the most basic regular expression syntax is supported, i.e.,
    the closure (*) operator, the alternation (|) operator, and the
    concatenation operator.

Installation

    make -f meta.mk
    perl Makefile.PL
    make

Usage:

* As an NFA generator:

    perl -Ilib script/re2nfa.pl '(a|b)*(aa|bb)(a|b)*'

  It will generate one image file, NFA.png:

    http://agentzh.org/misc/re/NFA.png

* As a DFA generator:

    perl -Ilib script/re2dfa.pl '(a|b)*(aa|bb)(a|b)*'

  It will generate two image files, DFA.png and DFA.min.png,
  showing the DFA automata and the minimized DFA automata:

    http://agentzh.org/misc/re/DFA.png

    http://agentzh.org/misc/re/DFA.min.png

* As a regular expression compiler:

    perl -Ilib script/re2pl.pl '(a|b)*(aa|bb)(a|b)*'

  will generate a Perl subroutine's source code to implement
  the corresponding minized DFA automata:

    http://agentzh.org/misc/re/sample.pl

  On the other hand, we can also compile regular expressions
  down to C code:

    perl -Ilib script/re2c.pl '(a|b)*(aa|bb)(a|b)*'

  The generated C code can be seen here:

    http://agentzh.org/misc/re/sample.c

* As a regular expression parser:

    perl -Ilib script/re2xml.pl '(a|b)*(aa|bb)(a|b)*'

  It will output the XML representation for the parsed regular
  expression's AST (Abstract Syntax Tree) structure:

    http://agentzh.org/misc/re/sample.xml

* As a regular expression interpreter:

    perl -Ilib script/evalre.pl -c '(a|b)*(aa|bb)(a|b)*' "abbbf"

  This command will run the regular expression against the text "abbbf"
  using the C backend (specified by the -c option). It will output

    match: 'abbb'

  The evalre.pl script uses the Inline::C Perl module to run the C code
  generated by the regular expression compiler with the C backend.

  Also, it's possible to use the Perl backend to run the regular expression,
  by specifying the -p option:

    perl -Ilib script/evalre.pl -p '(a|b)*(aa|bb)(a|b)*' "abbbf"