The name parsnix comes from parse and nix, or alternatively from parsnip.
A simple implementation of parser combinators in Nix. Depends only on `builtins` and a bit of `pkgs.lib`.
Because writing parsers is fun; and writing parser combinators is fun; and because writing a parser combinator in a functional language poses additional, fun challenges.
- Don't use this! This code is written to be read. It copies strings all over the place for a nicer implementation, which makes it very slow.
- Instead, use a derivation and a parser written in another language to parse what you need; use IFD (import from derivation) if you want the value in pure Nix.
- If you cannot do that, try a regex.
- If you cannot do that, try another parsing library.
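
The first two alternatives above can be sketched roughly as follows. This is an illustration under assumptions and not part of this project: the names `parseWithRegex` and `parseWithIFD` are hypothetical, the regex is a deliberately crude URL pattern, and the IFD variant assumes `pkgs.jq` from nixpkgs stands in for "a parser written in another language" and that import-from-derivation is allowed by your evaluator.

```nix
# Sketch only: `parseWithRegex` and `parseWithIFD` are hypothetical names,
# not functions provided by this project.
{ pkgs ? import <nixpkgs> { } }:
{
  # Regex route: builtins.match must match the WHOLE string. It returns null
  # on failure, otherwise the list of capture groups (null for groups that
  # did not take part in the match).
  parseWithRegex = str:
    builtins.match "([a-z]+)://([^/:]+)(:([0-9]+))?(/.*)?" str;

  # IFD route: do the parsing inside a derivation with an external tool
  # (jq here, splitting the text into non-empty lines), write JSON to $out,
  # then read the result back into the evaluator.
  parseWithIFD = text:
    let
      parsed = pkgs.runCommand "parsed.json"
        {
          nativeBuildInputs = [ pkgs.jq ];
          inherit text;
          # Pass `text` through a file; its path is exposed as $textPath.
          passAsFile = [ "text" ];
        } ''
        jq --raw-input --slurp 'split("\n") | map(select(length > 0))' "$textPath" > "$out"
      '';
    in
    builtins.fromJSON (builtins.readFile parsed);
}
```

With this sketch, `parseWithRegex "https://example.com:80/a"` evaluates to `[ "https" "example.com" ":80" "80" "/a" ]`. The IFD variant forces a build during evaluation, which is the price paid for getting the parsed value back as a plain Nix value.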
- The structure of the top-level attrset is `{ utils = ...; parsers = ...; demo = ...; }`. `parsers` contains the basic building blocks on top of which you can implement fancier combinators.
- The code is meant to be read in a top-down manner, starting from `tag` and `takeWhile`.
- Every parser is a function of the type `parser = arg0: arg1: argn: str: definition`, where `arg0` through `argn` are zero or more parser-specific arguments of unspecified types, and `str` is the remaining string to parse.
- Parsers always return an attrset.
- A parser which was successful returns an attrset in the following format:

      {
        remaining = ...; # string, a substring of the argument `str`
        results = [];    # list of dynamic types
      }

  Results are always a list because this makes implementing `seq` nicer. This additionally hinders performance.
- A parser which failed returns an attrset in the following format: `{ remaining = ...; error = ""; }`.
  The first failure is always returned[^1]; there is no error trace kept.
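
To make the contract above concrete, here is a minimal sketch of what a `tag`-style parser and a `seq`-style combinator could look like. It is a hypothetical re-implementation written for illustration, not the code in `parsnix.nix`: the argument order, the error message text, and the value of `remaining` on failure are all assumptions.

```nix
# Hypothetical re-implementation for illustration; the real `tag` and `seq`
# in this project may differ in signature and behaviour.
let
  inherit (builtins) stringLength substring;

  # tag: succeed if `str` starts with the literal `t`.
  tag = t: str:
    let n = stringLength t; in
    if substring 0 n str == t
    then { remaining = substring n (stringLength str - n) str; results = [ t ]; }
    else { remaining = str; error = "expected '${t}'"; };

  # seq: run the parsers in order on whatever the previous one left over.
  # Because every parser returns a list, results are combined with plain `++`.
  # The first failure is passed through unchanged; no error trace is built.
  seq = parsers: str:
    builtins.foldl'
      (acc: p:
        if acc ? error then acc
        else
          let r = p acc.remaining; in
          if r ? error then r
          else { inherit (r) remaining; results = acc.results ++ r.results; })
      { remaining = str; results = [ ]; }
      parsers;
in
{
  ok = seq [ (tag "foo") (tag "bar") ] "foobar!";
  # => { remaining = "!"; results = [ "foo" "bar" ]; }

  bad = seq [ (tag "foo") (tag "bar") ] "fooqux";
  # => { error = "expected 'bar'"; remaining = "qux"; }
}
```

The repeated `substring` calls are exactly the string copying that the warning at the top refers to: each successful parser allocates a fresh copy of the rest of the input.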
The project includes a demo of a basic (incomplete and not standards-compliant) URL parser. You can run it like this:

    $ nix repl -f 'parsnix.nix'
    nix-repl> :p demo.url3
    https://someone:password@subdomain.example.com:80/a/b?k1=v1&k2=v2#fragment
    nix-repl> :p demo.parseUrl demo.url3
    {
      remaining = "";
      results = [
        {
          fragment = "fragment";
          host = "subdomain.example.com";
          params = {
            k1 = "v1";
            k2 = "v2";
          };
          path = "/a/b";
          port = 80;
          scheme = "https";
          userinfo = {
            password = "password";
            username = "someone";
          };
        }
      ];
    }

[^1]: Except if you explicitly ignore it via some parser like `opt` or `alt`.