
Patches Erlang compiler with pluggable token transformers and parsers


The toker application

Authors: Ulf Wiger (ulf@wiger.net).

Pluggable parsers for the Erlang compiler

This project started as an experiment with runtime code injection using the parse_trans library.

(Note: 'Toker' is the Swedish name for Dopey the dwarf, and also a play on words referring to token transformation.)

This library installs itself into the Erlang compiler, allowing modules to switch parser modules as well as install token transformers.

Example: the toker_test module uses a custom syntax:


-module(toker_test).

-export([double/1, i2l/1]).

-toker_parser(toker_erl_parse).


double(L) ->
    lists:map(`(X) -> X*2`, L).

i2l(L) ->
    lists:map(`integer_to_list/1, L).

This module will not compile with the standard Erlang parser. However, toker's own build chain bootstraps itself and installs a hook in the erl_parse module; the hook detects the -toker_parser(toker_erl_parse) instruction and switches to that parser module.
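For comparison, here is a sketch of what the two functions would presumably look like in standard Erlang. It assumes, based on how the expressions are used as arguments to lists:map/2, that the backtick forms are shorthand for fun expressions; the module name toker_test_plain is hypothetical and not part of toker.

```erlang
%% Hypothetical plain-Erlang counterpart of toker_test (not part of toker).
%% Assumption: `(X) -> X*2` abbreviates fun(X) -> X*2 end, and
%% `integer_to_list/1 abbreviates a fun reference to that function.
-module(toker_test_plain).

-export([double/1, i2l/1]).

%% Multiply every element of the list by two.
double(L) ->
    lists:map(fun(X) -> X * 2 end, L).

%% Convert every integer in the list to its string representation.
i2l(L) ->
    lists:map(fun erlang:integer_to_list/1, L).
```

This version compiles with the unpatched compiler, which makes it a convenient reference when checking what a custom parser is expected to produce.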


Eshell V5.10.3  (abort with ^G)
1> toker_test:double([1,2,3]).
[2,4,6]
2> toker_test:i2l([1,2,3]).
["1","2","3"]

In the above example, we assume that toker_test has been compiled with rebar compile. The rebar.config file in toker makes use of the erl_first_files option and a parse transform to bootstrap the compiler patch. Specifically:

  • toker_c.erl contains the basic erl_parse modifications, and is compiled first.
  • toker_pt.erl is a parse transform which leaves the forms untouched, but ensures that toker is initialized.
  • toker_bootstrap.erl is an empty module which only exists to trigger the toker_pt parse transform.

Other applications could throw in a reference to the toker_pt parse transform in order to activate toker, but remember that parse transforms are only called after the module has been parsed, so if a module contains unconventional grammar, the parse transform must be called in a preceding module.
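As a rough sketch of that last point, another application could mirror toker's own setup: add an otherwise empty module that names the toker_pt parse transform, and list it in erl_first_files so that it is compiled before any module with custom syntax. The module name my_app_bootstrap is hypothetical, and the fragment assumes toker is already compiled and on the code path.

```erlang
%% rebar.config of a hypothetical application that depends on toker.
%% src/my_app_bootstrap.erl is an assumed, otherwise empty module containing
%%   -compile({parse_transform, toker_pt}).
%% Compiling it first runs the parse transform, which initializes toker.
{erl_first_files, ["src/my_app_bootstrap.erl"]}.
```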

Instructions recognized by toker are:

  • -toker_parser(Mod) - where Mod must export Mod:parse_form(Tokens), which must return a valid Erlang abstract form.
  • -toker_token_transform(Mod) - where Mod must export Mod:transform_tokens(Tokens), which must return a list of tokens. The function toker_c:transform_tokens/1 returns the tokens unchanged.
  • -toker_reset(Type) - where Type is either parser, token_transform or all, restores the relevant settings to the default.

Note that a token transform must return a list of tokens corresponding to a valid Erlang form (possibly after being processed by another parser). The Erlang parser has no support for skipping a part of the token stream.
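To make the token transform contract more concrete, here is a minimal sketch of such a module. The module name tt_answer and the rewrite rule are made up for illustration; the only thing taken from the description above is the exported transform_tokens/1 function. The demo/0 helper is likewise an addition: it scans and parses a string with the standard erl_scan and erl_parse modules so the transform can be tried without toker.

```erlang
%% Hypothetical token transform (not part of toker).
%% A module would select it with: -toker_token_transform(tt_answer).
-module(tt_answer).

-export([transform_tokens/1, demo/0]).

%% Receives the tokens of one form and returns a new token list.
%% The result must still be a valid form for whichever parser runs next.
transform_tokens(Tokens) ->
    [rewrite(T) || T <- Tokens].

%% Replace the atom 'answer' with the integer 42; keep all other tokens.
rewrite({atom, Anno, answer}) -> {integer, Anno, 42};
rewrite(Token) -> Token.

%% Standalone illustration: scan one form, transform it, parse the result
%% with the standard parser. Returns {ok, Form} on success.
demo() ->
    {ok, Tokens, _EndLoc} = erl_scan:string("f() -> answer."),
    erl_parse:parse_form(transform_tokens(Tokens)).
```

Because the transform maps tokens one to one, the output is a complete form whenever the input is, which keeps it within the constraint noted above.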

The toker application

The toker compiler patch can also be installed by starting the toker application.


Eshell V5.10.3  (abort with ^G)
1> compile:file("src/toker_test", [{outdir,"ebin"},report]).
src/toker_test.erl:8: syntax error before: '`'
src/toker_test.erl:11: syntax error before: '`'
src/toker_test.erl:3: function double/1 undefined
src/toker_test.erl:3: function i2l/1 undefined
error
2> application:start(toker).
ok
3> compile:file("src/toker_test", [{outdir,"ebin"},report]).
{ok,toker_test}
4> toker_test:double([1,2,3]).
[2,4,6]

Using toker with erlc

In order to get erlc to pick up the toker functionality, ensure that toker has been compiled and is in the code path, then set ERLC_EMULATOR="erl -s toker":


toker uwiger$ erlc -o ebin src/toker_test.erl
src/toker_test.erl:8: syntax error before: '`'
src/toker_test.erl:11: syntax error before: '`'
src/toker_test.erl:3: function double/1 undefined
src/toker_test.erl:3: function i2l/1 undefined

toker uwiger$ ERLC_EMULATOR="erl -s toker" erlc -o ebin src/toker_test.erl
toker uwiger$ erl -pa ebin
Erlang R16B02 (erts-5.10.3) [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V5.10.3  (abort with ^G)
1> toker_test:double([1,2,3]).
[2,4,6]

Rebar plugin

A rebar plugin can be found in toker/util/toker_rebar_plugin.erl. It bootstraps the toker functionality in pre_compile and pre_eunit for any application that has toker in its 'deps' list.
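A configuration along the following lines might enable the plugin in a dependent application. This is only a sketch: it assumes rebar 2.x, uses its plugin_dir and plugins options, and assumes toker has been fetched into deps/toker; none of these paths come from the toker documentation.

```erlang
%% rebar.config of a hypothetical application using the toker rebar plugin.
%% A real deps entry would normally also carry a version pattern and a source.
{deps, [toker]}.

%% Assumed location of the plugin source once toker is under deps/.
{plugin_dir, "deps/toker/util"}.
{plugins, [toker_rebar_plugin]}.
```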


Apart from the src/toker_test.erl module, the examples/ directory contains further examples, such as a token transform: tt1.erl implements a very simple macro preprocessor, which is used by m1.erl.


  • Currently, it isn't possible to replace the scanner. Among other things, this means that forms must be terminated by '.'. Replacing the scanner would require an additional patch (which is probably doable).
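The dot requirement can be observed directly with the standard scanner. The sketch below is independent of toker; the module name scan_dot_demo is hypothetical. It uses erl_scan:tokens/3, which keeps asking for more input until it has seen a terminating dot followed by whitespace.

```erlang
%% Hypothetical helper (not part of toker) showing how the standard
%% scanner decides that a form is complete.
-module(scan_dot_demo).

-export([complete/1]).

%% Returns true if the scanner finds a complete, dot-terminated form in
%% Text, false if it would need more input or reports a scan error.
complete(Text) ->
    case erl_scan:tokens([], Text, 1) of
        {done, {ok, _Tokens, _EndLoc}, _Rest} -> true;
        {done, _ErrorOrEof, _Rest} -> false;
        {more, _Continuation} -> false
    end.
```

For example, complete("f() -> ok.\n") is true, while complete("f() -> ok") is false because the scanner is still waiting for the dot.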

