
[WIP] Hopefully, a PEG-compliant implementation of org-mode syntax, and a parser using LPeg in Lua


LuaOrgParser

Purpose

This is a simple org-mode file parser written in Lua. The hidden objective is to later embed a library like this in Neovim, to help everyone work with org-mode files in the editor.

Implementation

Since the org-mode grammar is largely context-sensitive but can still mostly be read “left-to-right”, this implementation aims to parse an org buffer/file using a state machine that will:

  • have various states corresponding to the current context (with associated relevant values),
  • take each new line of the buffer being parsed as an “event”, and
  • update the current tree and advance the state machine accordingly.
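The three points above can be sketched as follows. This is a minimal illustration and not the project's actual code: the names `new_parser`, `feed`, and `parse`, the two states `"section"` and `"block"`, and the node fields are all hypothetical. It uses plain Lua string patterns rather than LPeg to stay self-contained, and it attaches every headline directly to the root instead of nesting headlines by level.

```lua
-- Hypothetical sketch of a line-driven state machine (Lua 5.1 compatible).
-- None of these names come from the LuaOrgParser code base.
local function new_parser()
  local root = { type = "document", children = {} }
  -- state: current context; current: node receiving new children;
  -- block: value associated with the "block" state.
  return { state = "section", root = root, current = root, block = nil }
end

-- Feed one line (the "event"): update the tree and advance the state.
local function feed(p, line)
  if p.state == "block" then
    -- Inside a #+BEGIN_ block, every line is raw content until the matching #+END_.
    if line:upper():match("^%s*#%+END_" .. p.block.name .. "%s*$") then
      p.state, p.block = "section", nil
    else
      table.insert(p.block.lines, line)
    end
    return
  end

  local stars, title = line:match("^(%*+)%s+(.*)$")
  if stars then
    -- Simplification: headlines are not nested by level here.
    local node = { type = "headline", level = #stars, title = title, children = {} }
    table.insert(p.root.children, node)
    p.current = node
    return
  end

  local name = line:match("^%s*#%+[Bb][Ee][Gg][Ii][Nn]_(%a+)")
  if name then
    local node = { type = "block", name = name:upper(), lines = {} }
    table.insert(p.current.children, node)
    p.state, p.block = "block", node
    return
  end

  table.insert(p.current.children, { type = "text", value = line })
end

-- Split the input into lines and feed them one by one.
local function parse(text)
  if text:sub(-1) ~= "\n" then text = text .. "\n" end
  local p = new_parser()
  for line in text:gmatch("(.-)\n") do feed(p, line) end
  return p.root
end

local tree = parse("* Title\n#+BEGIN_SRC lua\n* not a headline\n#+END_SRC\nafter")
```

In this example the line `* not a headline` is stored as raw block content because the machine is in the `"block"` state when it arrives, which is the kind of context sensitivity the state is meant to capture.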

Dependencies

  • Lua 5.1, since the end goal is to embed the parser in Neovim, and Neovim bundles Lua 5.1 for all its integrations.
  • LuaUnit, which is bundled in the tests folder.

Specification used

The spec used has been extracted from worg and can be found here.

References in test files

Each test file has an associated .el version containing only the result of parsing that buffer with org-mode. The code used to generate these files is:

(require 'org-element)

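(defun parse-file-content (file &optional noerror)
  "Return the org-element parse tree of the contents of FILE.
If FILE cannot be read, signal a user error, or, when NOERROR is
non-nil, only log a message and return nil."
  (with-temp-buffer
    (condition-case nil
        (progn
          (insert-file-contents file)
          (org-element-parse-buffer))
      (file-error
       ;; NOERROR selects between a logged message and a signaled error.
       (funcall (if noerror #'message #'user-error)
                "Unable to read file %S"
                file)
       nil))))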

Roadmap [1/13]

Unit test framework [1/2]

Install Lua Unit

Add test files with increasing amount of features

Add AST types

All node types should be added (headline, section, block, etc.), with their attributes correctly translated.

The global tree structure that provides traversal should also be added, with accessors for parent, children, etc. to manipulate the tree.
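As one possible shape for such a tree, here is a minimal sketch of a node type with parent/children links and a traversal helper. The `Node` table, its fields (`type`, `attrs`, `parent`, `children`), and its methods are assumptions made for illustration; they are not the types the project defines.

```lua
-- Hypothetical AST node sketch (Lua 5.1 compatible); all names are illustrative.
local Node = {}
Node.__index = Node

-- Create a node of the given type with an optional attribute table.
function Node.new(node_type, attrs)
  return setmetatable({
    type = node_type,
    attrs = attrs or {},
    parent = nil,
    children = {},
  }, Node)
end

-- Append a child, set its parent link, and return the child for chaining.
function Node:add_child(child)
  child.parent = self
  self.children[#self.children + 1] = child
  return child
end

-- Depth-first, pre-order traversal calling fn(node, depth) on every node.
function Node:walk(fn, depth)
  depth = depth or 0
  fn(self, depth)
  for _, child in ipairs(self.children) do
    child:walk(fn, depth + 1)
  end
end

local doc = Node.new("document")
local headline = doc:add_child(Node.new("headline", { level = 1, title = "Intro" }))
local section = headline:add_child(Node.new("section"))

local seen = {}
doc:walk(function(node, depth)
  seen[#seen + 1] = depth .. ":" .. node.type
end)
```

Keeping an explicit `parent` field makes upward navigation cheap, at the cost of having to maintain it whenever nodes are moved; `add_child` is the only place that sets it in this sketch.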

Bland implementation

Parse all syntax from “spec” [0/6]

Parse headlines

Parse #+ blocks

Parse tables

Parse properties and other drawers

Parse timestamps

Parse text markups

Make the grammar dynamic

A few keywords in the grammar should be customizable, at least up until the point where a string or file is parsed.
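One way to read this is that the parser is built from a configuration table, so the keywords are fixed only once parsing starts. The sketch below assumes that TODO keywords are among the customizable ones (org-mode itself lets users change them); `make_headline_parser`, `default_config`, and the `todo_keywords` field are hypothetical names, and plain Lua patterns stand in for an LPeg grammar.

```lua
-- Hypothetical sketch of a configurable headline parser (Lua 5.1 compatible).
-- The config shape and function names are assumptions, not the project's API.
local default_config = { todo_keywords = { "TODO", "DONE" } }

-- Build a headline parser whose TODO keywords come from the given config.
local function make_headline_parser(config)
  config = config or default_config
  local is_keyword = {}
  for _, kw in ipairs(config.todo_keywords) do
    is_keyword[kw] = true
  end

  -- Return a table describing the headline, or nil if the line is not one.
  return function(line)
    local stars, rest = line:match("^(%*+)%s+(.*)$")
    if not stars then return nil end
    local first, tail = rest:match("^(%S+)%s*(.*)$")
    local keyword = nil
    if first and is_keyword[first] then
      keyword, rest = first, tail
    end
    return { level = #stars, keyword = keyword, title = rest }
  end
end

local parse_default = make_headline_parser()
local parse_custom = make_headline_parser({ todo_keywords = { "TODO", "WAITING", "DONE" } })

local a = parse_default("** WAITING call Bob")
local b = parse_custom("** WAITING call Bob")
```

With the default configuration, `WAITING` is treated as part of the title; with the custom one, it is recognized as the headline's keyword.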