
Slaw is a lightweight library for rendering and generating Akoma Ntoso acts from plain text and PDF documents.

Slaw is a lightweight library for generating Akoma Ntoso 3.0 Act XML from plain text documents. It is used to power Indigo and uses grammars developed for the legal tradition in South Africa, although other traditions are supported.

Slaw allows you to:

  1. parse plain text and transform it into an Akoma Ntoso Act XML document
  2. unparse Akoma Ntoso XML into a plain-text format suitable for re-parsing

Slaw is lightweight because it wraps around a Nokogiri XML representation of the parsed document. It provides some support methods for manipulating these documents, but for anything more advanced you must manipulate the XML directly.


Add this line to your application's Gemfile:

gem 'slaw'

And then execute:

$ bundle

Or install it with:

$ gem install slaw

The simplest way to use Slaw is via the command line:

$ slaw parse myfile.text --grammar za


Slaw generates Acts in the Akoma Ntoso 2.0 XML standard for legislative documents. It first parses plain text using a grammar and then generates XML from the resulting syntax tree.

Most by-laws in South Africa are available as PDF documents. You will therefore need to extract the text from the PDF first, using a tool like pdftotext. PDFs can produce oddities (such as oddly wrapped lines) and Slaw has a number of rules of thumb for correcting these. These rules are based on South African by-laws and may not be suitable for all regions.
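
To give a feel for what such a rule of thumb might look like, here is a minimal sketch that re-joins lines broken mid-sentence. The `unwrap_lines` helper and its heuristic (join when the previous line does not end in `.`, `;` or `:` and the next line starts with a lowercase letter) are assumptions made for illustration only; they are not Slaw's actual cleanup rules.

```ruby
# Hypothetical helper, NOT part of Slaw: joins a line onto the previous one
# when it looks like the continuation of a wrapped sentence.
def unwrap_lines(text)
  result = []
  text.each_line(chomp: true) do |line|
    stripped = line.strip
    previous = result.last
    continuation =
      previous && !previous.empty? && !stripped.empty? &&
      previous !~ /[.;:]\z/ &&   # previous line did not end a sentence or clause
      stripped =~ /\A[a-z]/      # this line starts with a lowercase letter
    if continuation
      result[-1] = "#{previous} #{stripped}"
    else
      result << stripped
    end
  end
  result.join("\n")
end

sample = "1. No person may keep a dog\nwithout a licence.\n(a) This is an item.\n"
puts unwrap_lines(sample)
```

A heuristic like this will misfire on some inputs (for example, headings that legitimately start with a lowercase word), which is why region-specific tuning is usually needed.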

The grammar is expressed as a Treetop grammar and has been developed specifically for the format of South African acts and by-laws. Grammars for other regions could be developed depending on the complexity of a region's formats.

The grammar cannot catch some subtleties of an act or by-law -- such as nested list numbering -- so Slaw performs some post-processing on the XML produced by the parser. In particular, it nests lists correctly.
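
The list-nesting step can be pictured with a small sketch. It works on plain Ruby structs rather than XML, and the `Item` struct, `numbering_style` and `nest_items` are hypothetical names invented for this example; Slaw's real post-processing operates on the generated XML and may use different rules. The assumed heuristic is that a change in numbering style, such as `(a)` followed by `(i)`, opens a sublist, and a return to an earlier style closes it.

```ruby
# Hypothetical data structure for illustration; not part of Slaw.
Item = Struct.new(:num, :text, :children)

# Guess the numbering style of a label such as "a", "ii" or "3".
# `previous` is the label of the last item in the innermost open list.
def numbering_style(label, previous)
  return :numeric if label.match?(/\A\d+\z/)
  # "(i)" straight after "(h)" is the next letter, not a roman numeral.
  return :alpha if previous && previous.length == 1 && previous.next == label
  return :roman if label.match?(/\A[ivx]+\z/)
  :alpha
end

# Turn a flat array of items into a nested one based on numbering style.
def nest_items(items)
  root = []
  stack = [] # open lists, outermost first, as [style, list] pairs
  items.each do |item|
    label = item.num.delete("()")
    previous = stack.empty? ? nil : stack.last[1].last.num.delete("()")
    style = numbering_style(label, previous)

    if stack.empty?
      stack << [style, root]
    elsif stack.any? { |open_style, _| open_style == style }
      # Close inner lists until we are back in the list with this style.
      stack.pop until stack.last[0] == style
    else
      # A new style starts a sublist under the previous item.
      stack << [style, stack.last[1].last.children]
    end
    stack.last[1] << item
  end
  root
end

flat = [["(a)", "first"], ["(i)", "sub one"], ["(ii)", "sub two"], ["(b)", "second"]]
nested = nest_items(flat.map { |num, text| Item.new(num, text, []) })
nested.each do |item|
  puts item.num
  item.children.each { |child| puts "  #{child.num}" }
end
```

The special case for `(i)` following `(h)` shows why this cannot be handled by the grammar alone: whether `(i)` is a letter or a roman numeral depends on the items around it.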


Slaw uses Treetop to compile a grammar into a backtracking parser. The parser builds a parse tree, the nodes of which know how to serialize themselves in XML format.
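
The idea of nodes that serialize themselves can be sketched in plain Ruby. The `Paragraph` and `Section` classes below are hypothetical stand-ins for Treetop syntax nodes, and the element names are simplified for illustration rather than taken from Slaw's output or the full Akoma Ntoso schema.

```ruby
# Hypothetical stand-ins for parse tree nodes; not Slaw's actual classes.
module XmlHelpers
  # Escape the characters that are significant in XML text content.
  def escape(text)
    text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
  end
end

class Paragraph
  include XmlHelpers

  def initialize(text)
    @text = text
  end

  def to_xml
    "<p>#{escape(@text)}</p>"
  end
end

class Section
  include XmlHelpers

  def initialize(num, heading, children)
    @num = num
    @heading = heading
    @children = children
  end

  # Each node emits its own element and asks its children to do the same.
  def to_xml
    inner = @children.map(&:to_xml).join
    %(<section id="section-#{@num}"><num>#{escape(@num)}.</num>) +
      "<heading>#{escape(@heading)}</heading>#{inner}</section>"
  end
end

tree = Section.new("1", "Licensing", [Paragraph.new("Dogs & cats must be licensed.")])
puts tree.to_xml
```

Because serialization is recursive, supporting a new structural element mostly means adding a node type that knows how to write itself.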

Supporting formats from other countries' legal traditions probably requires creating a new grammar and parser.

Adding your own grammar

Slaw can dynamically load your custom Treetop grammars. When called with --grammar xy, Slaw tries to require slaw/grammars/xy/act and instantiate the parser class Slaw::Grammars::XY::ActParser. Slaw always uses the rule act as the root of the parser.
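
The naming convention can be sketched as follows. The helper names `grammar_require_path`, `grammar_parser_class_name` and `load_parser` are hypothetical and only illustrate the mapping described above; they are not Slaw's actual loader code.

```ruby
# Sketch of the naming convention only; NOT Slaw's actual implementation.

# File that is required for a grammar name such as "xy".
def grammar_require_path(name)
  "slaw/grammars/#{name}/act"
end

# Fully qualified parser class name for a grammar name such as "xy".
def grammar_parser_class_name(name)
  "Slaw::Grammars::#{name.upcase}::ActParser"
end

# Require the grammar file and instantiate its parser class.
# Raises LoadError if no installed gem provides the file.
def load_parser(name)
  require grammar_require_path(name)
  Object.const_get(grammar_parser_class_name(name)).new
end

puts grammar_require_path("xy")
puts grammar_parser_class_name("xy")
```

A gem that adds a grammar named `xy` would therefore need to ship a file at that require path which defines that class.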

You can create your own grammar by creating a gem that provides these files and classes.


  1. Fork it at http://github.com/longhotsummer/slaw/fork
  2. Install dependencies: bundle install
  3. Create your feature branch: git checkout -b my-new-feature
  4. Write great code!
  5. Run tests: rspec
  6. Commit your changes: git commit -am 'Add some feature'
  7. Push to the branch: git push origin my-new-feature
  8. Create a new Pull Request


  1. Update lib/slaw/version.rb
  2. Run rake release


