This project is a laboratory for researching and developing a formal grammar (i.e., grammar formalism) for the AsciiDoc Language. The source code in this repository is highly experimental and incomplete. It should not be considered a standalone AsciiDoc processor implementation. Rather, it’s meant to serve as a reference for proofs. The grammars and grammar-related helpers formulated in this repository will be contributed to the AsciiDoc Language Specification and TCK.
The development of the AsciiDoc Language Specification is now in full swing. The centerpiece of this specification is the normative definition of the AsciiDoc Language.
Up until now, the AsciiDoc syntax rules have only been informally described through pre-spec implementation code (Asciidoctor) and user-oriented documentation (https://docs.asciidoctor.org/asciidoc/latest/). Developing a specification for the AsciiDoc Language necessitates formalizing the syntax into a grammar.
A formal grammar describes the sequences of characters (i.e., markup) that are valid according to the syntax using a set of rules. Establishing a formal grammar is a major step forward for the AsciiDoc Language and its specification. It will help root out well-known inconsistencies, ambiguities, and idiosyncrasies. However, bridging this gap while retaining reasonable compatibility is a major challenge of the specification that requires substantial and open-ended research, hence the need for this project.
This lab is primarily focused on exploring a PEG grammar for AsciiDoc.
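To give a rough feel for how a PEG recognizes markup, the following sketch hand-writes two PEG operators, sequence and ordered choice, in plain JavaScript. It is not peggy and it is not taken from this project's grammars. The rule shown (a span of text enclosed in asterisks, with a fallback to plain text), the helper names, and the shape of the returned nodes are all hypothetical and chosen only for illustration.

```javascript
'use strict'

// Each parser is a function (input, pos) that returns { value, pos } on
// success or null on failure. All names here are illustrative assumptions.

// Match a literal string at the current position.
const literal = (text) => (input, pos) =>
  input.startsWith(text, pos) ? { value: text, pos: pos + text.length } : null

// Match one or more characters that satisfy a single-character regexp.
const oneOrMore = (charRx) => (input, pos) => {
  let end = pos
  while (end < input.length && charRx.test(input[end])) end++
  return end > pos ? { value: input.slice(pos, end), pos: end } : null
}

// Sequence: every part must match, one after another.
const sequence = (...parts) => (input, pos) => {
  const values = []
  for (const part of parts) {
    const result = part(input, pos)
    if (!result) return null
    values.push(result.value)
    pos = result.pos
  }
  return { value: values, pos }
}

// Ordered choice: try the alternatives in order; the first match wins.
const choice = (...alternatives) => (input, pos) => {
  for (const alternative of alternatives) {
    const result = alternative(input, pos)
    if (result) return result
  }
  return null
}

// Transform the value of a successful match.
const map = (parser, fn) => (input, pos) => {
  const result = parser(input, pos)
  return result ? { value: fn(result.value), pos: result.pos } : null
}

// Hypothetical rule, roughly: inline = "*" [^*]+ "*" / .+
const strong = map(
  sequence(literal('*'), oneOrMore(/[^*]/), literal('*')),
  ([, text]) => ({ type: 'strong', value: text })
)
const plainText = map(oneOrMore(/[\s\S]/), (value) => ({ type: 'text', value }))
const inline = choice(strong, plainText)
```

Calling `inline('*bold*', 0)` matches the first alternative, whereas `inline('*bold', 0)` fails on the missing closing asterisk and falls through to the plain-text alternative. Because the choice is ordered, a given input has at most one parse, which is the property that makes PEG attractive for pinning down ambiguous markup.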
This repository is structured as a Node.js project with a Mocha test suite. Thus, in order to work with it, you first need to have Node.js installed.
The best way to install Node.js is to use nvm (Node Version Manager).
$ nvm install 19
Once Node.js 19 is installed, switch to it:
$ nvm use
Next, install the dependencies of the project using npm:
$ npm i
At the core of this repository is a collection of parsers for AsciiDoc. The parsers are generated from grammar files located in the grammar folder. The grammars, which end in .pegjs, are written for peggy, the parser generator this project uses to generate the parsers.
The code in this repository is intended to be run by way of the test suite.
Before you can run the tests, you first need to generate the parsers using the npm script named gen.
$ npm run gen
Now you can run the tests using the npm test script:
$ npm t
Most of the tests are data-driven. These tests are located in the test/tests folder. Each test consists of at least an input file (ending in -input.adoc) and an output file (ending in -output.json). The input file (i.e., the test file) is an AsciiDoc file. The output file is the expected ASG that a compliant AsciiDoc processor should produce from it. Some tests also have a configuration file (ending in -config.yml).
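The file naming convention described above can be sketched as a small discovery routine. This is a minimal illustration that uses only Node.js built-ins, not the project's actual test harness. The function names `collectTestCases` and `loadTestCase` and the shape of the returned objects are assumptions made for this example.

```javascript
'use strict'

const fs = require('node:fs')
const path = require('node:path')

// Recursively collect data-driven test cases under a directory.
// A case is any file ending in -input.adoc that has a sibling -output.json.
// A sibling -config.yml is recorded when present, but it is optional.
function collectTestCases (dir) {
  const cases = []
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const entryPath = path.join(dir, entry.name)
    if (entry.isDirectory()) {
      cases.push(...collectTestCases(entryPath))
    } else if (entry.name.endsWith('-input.adoc')) {
      const base = entryPath.slice(0, -'-input.adoc'.length)
      const outputPath = base + '-output.json'
      if (!fs.existsSync(outputPath)) continue // incomplete case, skip it
      const configPath = base + '-config.yml'
      cases.push({
        name: path.basename(base),
        inputPath: entryPath,
        outputPath,
        configPath: fs.existsSync(configPath) ? configPath : undefined,
      })
    }
  }
  return cases.sort((a, b) => a.inputPath.localeCompare(b.inputPath))
}

// Load the AsciiDoc input and the expected ASG (parsed JSON) for one case.
function loadTestCase ({ inputPath, outputPath }) {
  return {
    input: fs.readFileSync(inputPath, 'utf8'),
    expected: JSON.parse(fs.readFileSync(outputPath, 'utf8')),
  }
}
```

A test runner could then iterate over `collectTestCases('test/tests')`, parse each `input` with a generated parser, and compare the result against `expected` using a deep equality assertion.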
These data-driven tests are a blueprint of the tests that will be included in the AsciiDoc TCK. Once the tests are contributed to the AsciiDoc TCK, this test suite will be updated to use the AsciiDoc TCK directly.
Copyright © 2023-present Dan Allen and Sarah White (OpenDevise Inc.) and the individual contributors to this project.
Use of this software is granted under the terms of the Eclipse Public License v 2.0 (EPL-2.0).