/colada-parser

A parser for Colada, a controlled natural language for the calculus of inductive constructions

Primary LanguageHaskellApache License 2.0Apache-2.0

colada-parser

About

colada-parser is a parser for Colada, a controlled natural language (CNL) for the Calculus of Inductive Constructions (CIC). It is still experimental and under heavy development.

This work was carried out as part of the Formal Abstracts project.

Project organization

The project is contained inside src/Colada. src/Colada/Basic/ (so modules in the Colada.Basic namespace) contains low-level parser combinators (Core.hs), utilities for managing the parser state (State.hs and ParserState.hs), and low-level parsers for the basic terminals of the grammar (literals, tokens, punctuation, delimiters) (Token.hs). These are all re-exported by Colada.Basic.Basic.

The topmost file in the project is Colada.Tests. This file imports and re-exports every other file in the project, and so should be loaded in a REPL when interactively developing the parser. The highest-level non-testing file is Colada.ProgramText. This imports all other parts of the grammar. The basic unit of a Colada document is a TextItem. The entire document must be parsed as a ProgramText, which wraps a list of TextItem s.

Colada.PhraseList contains utilities for dynamically extending the colada-parser using a simple markup language (tokens postfixed with a question mark can be optionally parsed, and (tk1 | tk2 | ... | tkn) will parse any of the |-separated tokens ). Marked-up input is parsed into the ParserMarkUp datatype, from which colada-parser generates a parser for later use in the text. The parser which handles the conversion to ParserMarkUp ignores OCaml-style line and block comments.

Build instructions

To build colada-parser, you must have the Haskell tool stack installed (see here). Navigate to the project directory and run stack build. After building the parser, you can test it on a CNL-compliant text file with stack exec colada path_to/my_file. If the parse is successful, the parse tree will be written to my_file.parsed in the same directory as my_file.

For example, stack exec colada test/test.txt should succeed.

Note: in the output, error messages may appear twice in a row with the same offset.

REPL instructions

From Emacs, enter C-c C-z from a Haskell-mode buffer with intero-mode enabled to open the intero REPL.

Otherwise, run the command

stack ghci --with-ghc ghci "--docker-run-args=--interactive=true --tty=false" --no-build --no-load colada-parser

in a shell from inside the colada-parser directory.

Ensure that the OverloadedStrings extension is enabled by e.g. entering :set -XOverloadedStrings. Load all files in the project with :load Colada.Tests. To test a parser myParser, enter

test myParser "my input string here"

at the prompt.

Author(s)