/toker

Patches the Erlang compiler with pluggable token transformers and parsers

Primary language: Erlang. License: Apache License 2.0.

The toker application

Authors: Ulf Wiger (ulf@wiger.net).

Pluggable parsers for the Erlang compiler

This project started as an experiment with runtime code injection using the parse_trans library.

(Note: 'Toker' is the Swedish name for Dopey the dwarf, but it is also, obviously, wordplay referring to token transformation.)

This library installs itself into the Erlang compiler, allowing modules to switch parser modules as well as install token transformers.

Example: the toker_test module uses a custom syntax:

-module(toker_test).

-export([double/1, i2l/1]).

-toker_parser(toker_erl_parse).

double(L) ->
    lists:map(`(X) -> X*2`, L).

i2l(L) ->
    lists:map(`integer_to_list/1, L).

This module will not compile with the standard Erlang parser, but toker's own build chain bootstraps itself and installs a hook in the erl_parse module, which then detects the instruction -toker_parser(toker_erl_parse).

Demonstration:

Eshell V5.10.3  (abort with ^G)
1> toker_test:double([1,2,3]).
[2,4,6]
2> toker_test:i2l([1,2,3]).
["1","2","3"]

In the above example, we assume that toker_test has been compiled with rebar compile. The rebar.config file in toker makes use of the erl_first_files option and a parse transform to bootstrap the compiler patch. Specifically:

  • toker_c.erl contains the basic erl_parse modifications, and is compiled first.
  • toker_pt.erl is a parse transform which leaves the forms untouched, but ensures that toker is initialized.
  • toker_bootstrap.erl is an empty module which only exists to trigger the toker_pt parse transform.
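
The compile order described above could be expressed with rebar's erl_first_files option roughly as follows. This is only a sketch: the actual rebar.config shipped with toker is not reproduced here, and the src/ paths are an assumption based on the location of src/toker_test.erl.

```erlang
%% rebar.config (sketch; the real file in toker may differ)
%% Compile the compiler patch first, then the parse transform, then the
%% empty module that triggers the transform, before any other source file.
{erl_first_files, ["src/toker_c.erl",
                   "src/toker_pt.erl",
                   "src/toker_bootstrap.erl"]}.
```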

Other applications could throw in a reference to the toker_pt parse transform in order to activate toker, but remember that parse transforms are only called after the module has been parsed, so if a module contains unconventional grammar, the parse transform must be called in a preceding module.
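
As an illustration of the point above, an application could add a small module of its own that does nothing except invoke the parse transform, and make sure it is compiled before any module with custom syntax. The module name my_app_toker_init is hypothetical, and toker is assumed to be compiled already and in the code path when this module is built.

```erlang
%% Hypothetical module: it has no functions of its own and exists only so
%% that compiling it runs the toker_pt parse transform, which initializes
%% toker. It must be compiled before any module that uses custom syntax.
-module(my_app_toker_init).
-compile({parse_transform, toker_pt}).
```

With rebar, listing this file under erl_first_files is one way to get it compiled ahead of the rest.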

Instructions recognized by toker are:

  • -toker_parser(Mod) - where Mod must export Mod:parse_form(Tokens), which must return a valid Erlang abstract form.
  • -toker_token_transform(Mod) - where Mod must export Mod:transform_tokens(Tokens), which must return a list of tokens. The function toker_c:transform_tokens/1 returns the tokens unchanged.
  • -toker_reset(Type) - where Type is either parser, token_transform or all, restores the relevant settings to the default.

Note that a token transform must return a list of tokens corresponding to a valid Erlang form (possibly after being processed by another parser). The Erlang parser has no support for skipping a part of the token stream.
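
To make the token transform contract concrete, here is a minimal sketch of a transform module. The module name const_tt and the substitution it performs are made up for illustration. It assumes the tokens have the shape produced by erl_scan and that the function is called with the tokens of one form at a time.

```erlang
%% Hypothetical token transform: replaces every occurrence of the
%% variable token PI with a float literal, a crude constant "macro".
-module(const_tt).
-export([transform_tokens/1]).

%% Takes a list of erl_scan tokens and returns a list of tokens.
transform_tokens(Tokens) ->
    [subst(T) || T <- Tokens].

%% Keep the original annotation (line information) on the new token.
subst({var, Anno, 'PI'}) -> {float, Anno, 3.14159};
subst(Token) -> Token.
```

A module would opt in by placing -toker_token_transform(const_tt). before the forms that use PI. The result is still an ordinary Erlang form, so the default parser can handle it.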

The toker application

The toker compiler patch can also be installed by starting the toker application.

Example:

Eshell V5.10.3  (abort with ^G)
1> compile:file("src/toker_test", [{outdir,"ebin"},report]).
src/toker_test.erl:8: syntax error before: '`'
src/toker_test.erl:11: syntax error before: '`'
src/toker_test.erl:3: function double/1 undefined
src/toker_test.erl:3: function i2l/1 undefined
error
2> application:start(toker).
ok
3> compile:file("src/toker_test", [{outdir,"ebin"},report]).
{ok,toker_test}
4> toker_test:double([1,2,3]).
[2,4,6]

Using toker with erlc

To get erlc to pick up the toker functionality, ensure that toker has been compiled and is in the Erlang code path, then set the environment variable ERLC_EMULATOR to "erl -s toker".

Example:


toker uwiger$ erlc -o ebin src/toker_test.erl
src/toker_test.erl:8: syntax error before: '`'
src/toker_test.erl:11: syntax error before: '`'
src/toker_test.erl:3: function double/1 undefined
src/toker_test.erl:3: function i2l/1 undefined

toker uwiger$ ERLC_EMULATOR="erl -s toker" erlc -o ebin src/toker_test.erl
toker uwiger$ erl -pa ebin
Erlang R16B02 (erts-5.10.3) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.3  (abort with ^G)
1> toker_test:double([1,2,3]).
[2,4,6]

Rebar plugin

A rebar plugin can be found in toker/util/toker_rebar_plugin.erl. It bootstraps the toker functionality in pre_compile and pre_eunit for any application that has toker in its 'deps' list.

Examples

Apart from the src/toker_test.erl module, the examples/ directory contains further examples, such as a token transform that implements a very simple macro pre-processor (tt1.erl, used by m1.erl).

TODO

  • Currently, it isn't possible to replace the scanner. Among other things, this means that forms must be terminated by '.'. Replacing the scanner would require an additional patch (which is probably doable).

Modules

toker
toker_app
toker_bootstrap
toker_c
toker_erl_parse
toker_pt
toker_server
toker_sup