Fast and Automatic Formatting of Context-Free Languages. Inspired by the automatic formatting of Golang.
The goal of padd is to act as a generic language formatter, capable of formatting any language that can be represented with a context-free grammar. A single specification file is used to specify how a language should be lexed, parsed, and then formatted during reconstruction.
padd was initially written as a whitespace formatter for programming languages, but it can be used for any language described by a CFG.
The main library source is located under src/core, and the CLI-specific source is located under src/cli.
Integration tests and example specifications are located under tests.
The padd formatter uses a specification language (defined here) to specify the alphabet of a language, a compressed DFA (CDFA) to lex the language, a grammar to parse it, and optional formatter patterns inside the grammar to indicate how the finished parse tree should be reconstructed. Example specifications can be found here, and more information about specifications can be found here.
The padd CLI can be used to format files or directories in place, overwriting the existing files if formatting is successful. For more advanced usage information, see the docs.
$ ./padd fmt <specification file> -t <target path>
Format all *.java files in ~/some-java-project on 4 worker threads using the java8 specification:
$ ./padd fmt tests/spec/java8 -t ~/some-java-project --threads 4 -m ".*\.java"
extern crate padd;
use padd::{FormatJobRunner, FormatJob};
fn main() {
    // Specification String
    let spec = "
alphabet 'ab'
cdfa {
start
'a' -> ^A
'b' -> ^B;
}
grammar {
s `{} {}`
| s A
| s B
| `SEPARATED:`;
}
".to_string();
    let input = "abbaba".to_string();
    // Formatter Creation
    let fjr = FormatJobRunner::build(&spec).unwrap();
    // Format Input
    let res = fjr.format(FormatJob::from_text(input)).unwrap();
    // Verify Output
    assert_eq!(res, "SEPARATED: a b b a b a");
}
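To make the effect of this specification concrete, the following is a hand-written stand-in that produces the same output shape without using padd at all. It is only a sketch: the function name `separate` is hypothetical, and the reading of the grammar (the empty production emits `SEPARATED:`, and each `s A` or `s B` step appends a space and the token, as the expected output above suggests) is an interpretation of the example, not a description of padd's internals.

```rust
/// Hand-written stand-in for the example specification; this is not padd's code.
/// `separate` is a hypothetical name used only for this illustration.
fn separate(input: &str) -> Result<String, String> {
    // The empty production of `s` contributes the literal `SEPARATED:`.
    let mut out = String::from("SEPARATED:");
    for c in input.chars() {
        match c {
            // Each `s A` / `s B` step follows `{} {}`:
            // everything produced so far, one space, then the token text.
            'a' | 'b' => {
                out.push(' ');
                out.push(c);
            }
            // Anything outside the alphabet 'ab' is rejected.
            other => {
                return Err(format!("character {:?} is not in the alphabet 'ab'", other));
            }
        }
    }
    Ok(out)
}
```

Calling `separate("abbaba")` yields `Ok("SEPARATED: a b b a b a")`, matching the assertion in the library example. Unlike this sketch, padd derives that behavior from the specification string, so changing the language means editing the specification rather than the code.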
The specification file:
alphabet ' \t\n{}'
cdfa {
    start
        ' ' | '\t' | '\n' -> ^_
        '{' -> ^LBRACKET
        '}' -> ^RBRACKET;
}
grammar {
    s
        | s b
        |;
    b
        | LBRACKET s RBRACKET `[prefix]{}\n\n{;prefix=[prefix]\t}[prefix]{}\n\n`;
}
The input:
{ { {{{ }}}
{} } } { {}
}
The output:
{

	{

		{

			{

				{

				}

			}

		}

		{

		}

	}

}

{

	{

	}

}
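The formatter pattern in this specification writes the current prefix, the opening bracket, and two newlines, then formats the nested content with a tab appended to the prefix, and finally writes the prefix, the closing bracket, and two newlines. As a rough illustration of that rule, the sketch below reproduces it by hand with a depth counter. The function name `format_braces` is hypothetical, and this is not how padd works internally, since padd drives the same result from the grammar and its pattern.

```rust
/// Hand-written approximation of the pattern
/// `[prefix]{}\n\n{;prefix=[prefix]\t}[prefix]{}\n\n`.
/// `format_braces` is a hypothetical name; this is not padd's implementation.
fn format_braces(input: &str) -> Result<String, String> {
    let mut out = String::new();
    // Number of tabs in the current prefix.
    let mut depth: usize = 0;
    for c in input.chars() {
        match c {
            '{' => {
                // `[prefix]{}\n\n`: prefix, opening bracket, two newlines.
                out.push_str(&"\t".repeat(depth));
                out.push_str("{\n\n");
                // `{;prefix=[prefix]\t}`: nested content gets one more tab.
                depth += 1;
            }
            '}' => {
                if depth == 0 {
                    return Err("unbalanced '}'".to_string());
                }
                // The closing bracket uses the prefix of its opening bracket.
                depth -= 1;
                out.push_str(&"\t".repeat(depth));
                out.push_str("}\n\n");
            }
            // Whitespace is lexed but contributes nothing to the output.
            ' ' | '\t' | '\n' => {}
            other => {
                return Err(format!("character {:?} is not in the alphabet", other));
            }
        }
    }
    if depth != 0 {
        return Err("unbalanced '{'".to_string());
    }
    Ok(out)
}
```

For example, `format_braces("{ {} }")` returns `Ok("{\n\n\t{\n\n\t}\n\n}\n\n")`, and unbalanced input such as `"{"` returns an error. The hard-coded version only handles this one bracket language, which is the limitation a specification-driven formatter is meant to remove.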