AsciiDoc Parsing Lab

This project is a laboratory for researching and developing a formal grammar (i.e., grammar formalism) for the AsciiDoc Language. The source code in this repository is highly experimental and incomplete. It should not be considered a standalone AsciiDoc processor implementation. Rather, it’s meant to serve as a reference for proofs. The grammars and grammar-related helpers formulated in this repository will be contributed to the AsciiDoc Language Specification and TCK.

What’s this about?

The development of the AsciiDoc Language Specification is now in full swing. The centerpiece of this specification is the normative definition of the AsciiDoc Language.

Up until now, the AsciiDoc syntax rules have only been informally described through pre-spec implementation code (Asciidoctor) and user-oriented documentation (https://docs.asciidoctor.org/asciidoc/latest/). Developing a specification for the AsciiDoc Language necessitates formalizing the syntax into a grammar.

A formal grammar describes the sequences of characters (i.e., markup) that are valid according to the syntax using a set of rules. Establishing a formal grammar is a major step forward for the AsciiDoc Language and its specification. It will help root out well-known inconsistencies, ambiguities, and idiosyncrasies. However, bridging this gap while retaining reasonable compatibility is a major challenge of the specification that requires substantial and open-ended research, hence the need for this project.

This lab is primarily focused on exploring a PEG grammar for AsciiDoc.
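
To make the idea of a PEG rule more concrete, here is a minimal sketch in plain JavaScript of the two traits that set PEGs apart: ordered choice and greedy repetition. It is not taken from the grammars in this repository and it does not use peggy. The helper names (literal, charMatching, choice, sequence, oneOrMore, parseHeading) and the toy heading rule are assumptions made purely for illustration.

```javascript
// Each parser takes (input, pos) and returns { value, pos } on success or null on failure.

// Match an exact string at the current position.
const literal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? { value: text, pos: pos + text.length } : null;

// Match a single character that satisfies a regular expression.
const charMatching = (re) => (input, pos) =>
  pos < input.length && re.test(input[pos]) ? { value: input[pos], pos: pos + 1 } : null;

// Ordered choice: try the alternatives left to right and commit to the first that succeeds.
const choice = (...parsers) => (input, pos) => {
  for (const parser of parsers) {
    const result = parser(input, pos);
    if (result) return result;
  }
  return null;
};

// Sequence: every parser must match, one after the other.
const sequence = (...parsers) => (input, pos) => {
  const values = [];
  for (const parser of parsers) {
    const result = parser(input, pos);
    if (!result) return null;
    values.push(result.value);
    pos = result.pos;
  }
  return { value: values, pos };
};

// Greedy repetition: match as many times as possible, but at least once.
const oneOrMore = (parser) => (input, pos) => {
  const values = [];
  let result;
  while ((result = parser(input, pos))) {
    values.push(result.value);
    pos = result.pos;
  }
  return values.length ? { value: values, pos } : null;
};

// Toy rule (hypothetical): heading <- ("==" / "=") " " [^\n]+
// The longer marker comes first because, once an alternative succeeds,
// ordered choice never goes back to try a later one.
const headingMarker = choice(literal('=='), literal('='));
const heading = sequence(headingMarker, literal(' '), oneOrMore(charMatching(/[^\n]/)));

// Parse a single line; the whole line must be consumed for the match to count.
function parseHeading(line) {
  const result = heading(line, 0);
  if (!result || result.pos !== line.length) return null;
  const [marker, , titleChars] = result.value;
  return { level: marker.length - 1, title: titleChars.join('') };
}
```

In this sketch, parseHeading('== Section Title') yields a level of 1 and the title text, while a line such as 'plain text' yields null. Because each choice is deterministic, a PEG cannot be ambiguous in the way a context-free grammar can, which is part of its appeal for pinning down a syntax that has so far been described informally.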

How do I run it?

This repository is structured as a Node.js project with a Mocha test suite. Thus, in order to work with it, you first need to have Node.js installed.

The best way to install Node.js is to use nvm (Node Version Manager).

$ nvm install 19

Once Node.js 19 is installed, switch to it:

$ nvm use

Next, install the dependencies of the project using npm:

$ npm i

At the core of this repository is a collection of parsers for AsciiDoc. The parsers are generated from grammar files located in the grammar folder. The grammars, which end in .pegjs, are written for peggy, the parser generator this project uses to generate the parsers.

The code in this repository is intended to be run by way of the test suite. But in order to run the tests, you first need to use the npm script gen to generate the parsers.

$ npm run gen

Now you can run the tests using the npm test script:

$ npm t

Most of the tests are data-driven. These tests are located in the test/tests folder. Each test consists of at least an input file (ending in -input.adoc) and an output file (ending in -output.json). The input file (i.e., the test file) is an AsciiDoc file. The output file is the expected ASG (Abstract Semantic Graph) that a compliant AsciiDoc processor should produce from it. Some tests also have a configuration file (ending in -config.yml).
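
The naming convention above can be sketched as a small grouping function. This is not the test harness used by this repository; groupTestCases and the file names passed to it are hypothetical, and the sketch works on a plain list of file names rather than reading the test/tests folder.

```javascript
// Hypothetical helper: group file names into test cases using the
// suffix conventions described above.
function groupTestCases(fileNames) {
  const suffixes = {
    '-input.adoc': 'input',
    '-output.json': 'output',
    '-config.yml': 'config',
  };
  const cases = new Map();
  for (const fileName of fileNames) {
    for (const [suffix, role] of Object.entries(suffixes)) {
      if (!fileName.endsWith(suffix)) continue;
      // The part before the suffix identifies the test case.
      const name = fileName.slice(0, -suffix.length);
      if (!cases.has(name)) cases.set(name, { name });
      cases.get(name)[role] = fileName;
    }
  }
  // A case is only usable if it has both an input and an expected output.
  return [...cases.values()].filter((testCase) => testCase.input && testCase.output);
}

// Example file names (made up for illustration).
const cases = groupTestCases([
  'paragraph-input.adoc',
  'paragraph-output.json',
  'header-input.adoc',
  'header-output.json',
  'header-config.yml',
  'orphan-input.adoc',
]);
```

Here cases holds two entries, paragraph and header, and only header carries a config file. The orphan input is dropped because it has no matching expected output.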

These data-driven tests are a blueprint of the tests that will be included in the AsciiDoc TCK. Once the tests are contributed to the AsciiDoc TCK, this test suite will be updated to use the AsciiDoc TCK directly.

Copyright and License

Use of this software is granted under the terms of the Eclipse Public License v 2.0 (EPL-2.0).

Trademarks

AsciiDoc® and AsciiDoc Language™ are trademarks of the Eclipse Foundation, Inc.
