A Lua library for encoding and decoding the Erlang External Term Format.
Tested on Lua 5.1 - 5.4 and LuaJIT.
By default, this decodes and encodes with logic similar to erlpack, but decoders and encoders have options to customize how values are processed.
MIT (see file LICENSE).
Some files are third-party and contain their own licensing:
- csrc/thirdparty/bigint/bigint.h: Zero-clause BSD, details in source file.
- csrc/thirdparty/miniz/miniz.h: MIT license, details in csrc/thirdparty/miniz/LICENSE.
local decoder = etf.decoder(options) -- create a decoder
local decoded = decoder:decode('\131\116\0\0\0\2\100\0\1\97\97\1\100\0\1\98\97\2')
-- decoded will be a table like:
{ a = 1, b = 2 }

options is an optional table with the following keys, all optional:

- use_integer - set to true to decode all integers as etf.integer userdata.
- use_float - set to true to decode all floats as etf.float userdata.
- version - specify the Erlang Term Format version you wish to decode. As far as I can tell, 131 is the only version in existence.
- atom_map - customize how Atom types are decoded. This can be a table, or a function that accepts a string (representing the atom name) and a boolean (true if the atom is a map key, false otherwise).
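To make the byte string in the decoding example concrete, here is a minimal hand-rolled reader for just the three tags it contains (MAP_EXT, ATOM_EXT, SMALL_INTEGER_EXT). This is only an illustration of the wire format, not the library's decoder; the helper names (u8, u16, u32, term) are made up for this sketch.

```lua
-- The sample term from the decoding example: #{ a => 1, b => 2 }
local data = '\131\116\0\0\0\2\100\0\1\97\97\1\100\0\1\98\97\2'
local pos = 1

-- Read one unsigned byte and advance.
local function u8()
  local b = data:byte(pos)
  pos = pos + 1
  return b
end

-- Read a big-endian 16-bit unsigned integer.
local function u16()
  local hi = u8()
  local lo = u8()
  return hi * 256 + lo
end

-- Read a big-endian 32-bit unsigned integer.
local function u32()
  local hi = u16()
  local lo = u16()
  return hi * 65536 + lo
end

local function term()
  local tag = u8()
  if tag == 97 then        -- SMALL_INTEGER_EXT: one unsigned byte
    return u8()
  elseif tag == 100 then   -- ATOM_EXT: 2-byte length, then the atom name
    local len = u16()
    local name = data:sub(pos, pos + len - 1)
    pos = pos + len
    return name
  elseif tag == 116 then   -- MAP_EXT: 4-byte pair count, then key/value terms
    local arity = u32()
    local map = {}
    for _ = 1, arity do
      local key = term()
      map[key] = term()
    end
    return map
  end
  error('unsupported tag ' .. tostring(tag))
end

assert(u8() == 131)        -- every term starts with the version byte
local decoded = term()
print(decoded.a, decoded.b)
```

Reading the bytes this way shows why the version option exists: the first byte of the payload is the format version, and everything after it is a tagged term.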
Here's how various Erlang types are mapped to Lua by default:
| Supported | Erlang Type | Lua Type |
|---|---|---|
| [ ] | ATOM_CACHE_REF | |
| [x] | ZLIB | (automatically decompressed and decoded) |
| [x] | SMALL_INTEGER_EXT | number |
| [x] | INTEGER_EXT | number or etf.integer (based on value) |
| [x] | FLOAT_EXT | number |
| [x] | PORT_EXT | table |
| [x] | NEW_PORT_EXT | table |
| [x] | V4_PORT_EXT | table |
| [x] | PID_EXT | table |
| [x] | NEW_PID_EXT | table |
| [x] | SMALL_TUPLE_EXT | table |
| [x] | LARGE_TUPLE_EXT | table |
| [x] | MAP_EXT | table |
| [x] | NIL_EXT | table (empty) |
| [x] | STRING_EXT | string |
| [x] | LIST_EXT | table |
| [x] | BINARY_EXT | string |
| [x] | SMALL_BIG_EXT | number or etf.integer |
| [x] | LARGE_BIG_EXT | number or etf.integer |
| [x] | REFERENCE_EXT | table |
| [x] | NEW_REFERENCE_EXT | table |
| [x] | NEWER_REFERENCE_EXT | table |
| [x] | FUN_EXT | table |
| [x] | NEW_FUN_EXT | table |
| [x] | EXPORT_EXT | table |
| [x] | BIT_BINARY_EXT | string |
| [x] | NEW_FLOAT_EXT | number |
| [x] | ATOM_UTF8_EXT | string or boolean or etf.null |
| [x] | SMALL_ATOM_UTF8_EXT | string or boolean or etf.null |
| [x] | ATOM_EXT | string or boolean or etf.null |
| [x] | SMALL_ATOM_EXT | string or boolean or etf.null |

etf will figure out the maximum and minimum integer values that can be
safely handled by Lua at run-time. When any integer is decoded, it will
use Lua's number type if possible, and an etf.integer userdata if it's
outside the safe range.
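As a rough illustration of what a run-time safe range means, the sketch below computes one in plain Lua. It is an assumption about the general idea, not the library's actual detection logic; safe_integer_range and fits_native are hypothetical names, and the 2^53 limit assumes Lua numbers are doubles on versions without an integer subtype.

```lua
-- Hypothetical helper: the range of integers a Lua number can hold exactly.
local function safe_integer_range()
  if math.maxinteger then
    -- Lua 5.3+: numbers have a native integer subtype.
    return math.mininteger, math.maxinteger
  end
  -- Lua 5.1, 5.2, LuaJIT: assume doubles, which are exact up to 2^53.
  local limit = 2^53
  return -limit, limit
end

local min_safe, max_safe = safe_integer_range()

-- True when n is a whole number inside the safe range.
local function fits_native(n)
  return n >= min_safe and n <= max_safe and n % 1 == 0
end

print(fits_native(1), fits_native(2^70))
```

A value for which a check like fits_native fails is the kind of value that would come back as an etf.integer userdata instead of a plain number.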
You can opt to have all integers be returned as etf.integer userdatas. The
benefit of this is all values will use the same type. On Lua 5.2 and later,
the etf.integer userdatas can be compared to regular Lua numbers, but on
Lua 5.1 you can only compare etf.integer values with other etf.integer
values.
To enable this, create the decoder with the use_integer option set to true:
local etf = require'etf'
local decoder = etf.decoder({use_integer = true })
local val = decoder:decode('\131\97\1') -- returns an etf.integer userdata
print(debug.getmetatable(val).__name)
-- prints "etf.integer"

Erlang supports a concept of "atoms" which doesn't completely translate to Lua.
In Erlang, one can create a map like:
Map = #{ a => 1, b => false, c => hello }

In that example, a, b, c, false, and hello are all atoms. They're
essentially small strings that can be used for map keys, enums, etc.
Note that Erlang doesn't have a boolean type. false is just another atom.
By default, atoms are decoded with the following logic:
- If the atom is a map key (like a and b in the example), it's decoded as a string.
- If the atom is a value (like false and hello in the example), then:
  - Atom true is decoded as Lua's boolean true.
  - Atom false is decoded as Lua's boolean false.
  - Atom nil is decoded as etf.null, which is an atom userdata.
  - Anything else is decoded as a string.

This is meant to be compatible with erlpack, and to make decoded data as easy to handle as possible.
If your application has other atoms that need to be translated into values,
you can specify the atom_map parameter. This can be a table with string keys,
or a function. The function should accept a string parameter representing
the atom name, and a boolean representing if the atom is a map key or not.
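As one possible shape for such a function, the sketch below builds an atom_map from a lookup table of value atoms, leaving map keys and unknown atoms as strings. The make_atom_map helper and the extra undefined mapping are hypothetical, and NULL is a stand-in for etf.null so the snippet runs without the library.

```lua
-- Stand-in sentinel; in real code this would be etf.null.
local NULL = {}

-- Hypothetical factory: returns a function with the (str, is_key) signature
-- described above, driven by a table of atom name -> Lua value.
local function make_atom_map(values)
  return function(str, is_key)
    if is_key then return str end          -- map keys stay strings
    local mapped = values[str]
    if mapped ~= nil then return mapped end -- note: false is a valid mapping
    return str                              -- unknown atoms stay strings
  end
end

local atom_map = make_atom_map({
  ['true'] = true,
  ['false'] = false,
  ['nil'] = NULL,
  undefined = NULL, -- treat Erlang's conventional undefined atom like nil
})

print(atom_map('undefined', false) == NULL, atom_map('hello', false))
```

The resulting function would then be passed as the atom_map option when creating a decoder.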
The default logic can be represented as:
local function atom_map(str, is_key)
if is_key then return str end
if str == 'true' then
return true
elseif str == 'false' then
return false
elseif str == 'nil' then
return etf.null
end
return str
end

If, for example, you wanted to keep string keys but decode all values as atoms:
local function atom_map(str, is_key)
if is_key then return str end
return etf.atom(str)
end

SMALL_TUPLE_EXT, LARGE_TUPLE_EXT, LIST_EXT, and NIL_EXT
will be decoded into array-like tables (all keys are integers, they're consecutive,
and they start at 1).
The table will have a metatable set to indicate the original type - etf.tuple_mt for tuples, and etf.list_mt for lists.
MAP_EXT will be decoded into a Lua table. By default, the keys are (probably) strings,
see above about how atoms are mapped. Values are mapped into the appropriate Lua type
according to the above table.
The table will have a metatable set to indicate it was a map - etf.map_mt.
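Because tuples, lists, and maps all decode to plain tables, the metatable is the way to tell them apart. The sketch below shows one way to do that; term_kind is a hypothetical helper, and the etf table here is a stand-in with only the three metatable fields so the snippet runs without the library.

```lua
-- Stand-in for the real module; in real code: local etf = require'etf'
local etf = { tuple_mt = {}, list_mt = {}, map_mt = {} }

-- Hypothetical helper: report the original Erlang container type.
local function term_kind(t)
  local mt = getmetatable(t)
  if mt == etf.tuple_mt then return 'tuple' end
  if mt == etf.list_mt then return 'list' end
  if mt == etf.map_mt then return 'map' end
  return 'other'
end

-- Simulate decoded values for {ok, 1}, [1, 2], and #{ a => 1 }.
local tuple = setmetatable({ 'ok', 1 }, etf.tuple_mt)
local list = setmetatable({ 1, 2 }, etf.list_mt)
local map = setmetatable({ a = 1 }, etf.map_mt)

print(term_kind(tuple), term_kind(list), term_kind(map), term_kind({}))
```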
The various PORT types (PORT_EXT, NEW_PORT_EXT, V4_PORT_EXT) will be decoded into
a table with the following fields:
- node - a string.
- id - a number or integer.
- creation - a number or integer.

The table will have a metatable set to etf.port_mt.
The PID types (PID_EXT, NEW_PID_EXT) will be decoded into a table with
the following fields:
- node - a string.
- id - a number or integer.
- serial - a number or integer.
- creation - a number or integer.

The table will have a metatable set to etf.pid_mt.
FUN_EXT will be decoded into a table with the following fields:
- numfree - a number or integer.
- pid - the previously-mentioned PID type.
- module - a string.
- index - a number or integer.
- uniq - a number or integer.
- free_vars - an array-like table of terms.

The table will have a metatable set to etf.fun_mt.
NEW_FUN_EXT will be decoded into a table with the following fields:
- size - a number or integer.
- arity - a number.
- uniq - a string.
- index - a number or integer.
- numfree - a number or integer.
- module - a string.
- oldindex - a number or integer.
- olduniq - a number or integer.
- pid - the previously-mentioned PID type.
- free_vars - an array-like table of terms.

The table will have a metatable set to etf.new_fun_mt.
EXPORT_EXT will be decoded into a table with the following fields:
- module - a string.
- function - a string.
- arity - a number or integer.

The table will have a metatable set to etf.export_mt.
The REFERENCE types (REFERENCE_EXT, NEW_REFERENCE_EXT, NEWER_REFERENCE_EXT) will
be decoded into a table with the following fields:
- node - a string.
- creation - a number or integer.
- id - an array-like table of numbers or integers.

The table will have a metatable set to etf.reference_mt.
local encoder = etf.encoder(options) -- create an encoder
local encoded = encoder:encode({ a = 1, b = 2 })
-- encoded will be a MAP_EXT with BINARY_EXT keys and SMALL_INTEGER_EXT values

options is an optional table with the following keys, all optional:

- version - specify the Erlang Term Format version you wish to encode. As far as I can tell, 131 is the only version in existence.
- compress - set to true to enable compression at the default level, or 0 through 9 to specify a compression level.
- value_map - customize how values are encoded. This can be a table, or a function that accepts the value to be encoded and a boolean indicating if the value is a table key.

Here's how various Lua types are mapped to Erlang Term Format by default:
| Supported | Lua Type | Erlang Type |
|---|---|---|
| [x] | nil | a nil SMALL_ATOM_UTF8_EXT |
| [x] | number | NEW_FLOAT_EXT, SMALL_INTEGER_EXT, INTEGER_EXT, SMALL_BIG_EXT, LARGE_BIG_EXT (as appropriate) |
| [x] | boolean | SMALL_ATOM_UTF8_EXT |
| [x] | string | BINARY_EXT |
| [x] | table | NIL_EXT, LIST_EXT, or MAP_EXT |
| [x] | userdata | (see details below) |

A table is determined to either be map-like or list-like. If a table
has integer keys starting at 1, with no gaps, it's considered to be
list-like and will be encoded as a LIST_EXT.
If a table has no keys at all, it will be treated as a list-type
with zero items and encoded as a NIL_EXT (Erlang's version of an
empty list).
Otherwise, the table is considered map-like, and will be encoded
as a MAP_EXT. All table keys will be encoded as strings (specifically
BINARY_EXT). This is meant to be compatible with erlpack.
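The classification rules above can be sketched in plain Lua as follows. This is an illustration of the described behavior, not the library's implementation; classify is a hypothetical name.

```lua
-- Hypothetical helper: decide which Erlang type a Lua table would map to,
-- following the rules described above.
local function classify(t)
  -- Count every key in the table, integer or otherwise.
  local count = 0
  for _ in pairs(t) do
    count = count + 1
  end

  if count == 0 then
    return 'NIL_EXT'     -- no keys at all: the empty list
  end

  -- If keys 1..count are all present, there is no room for any other key,
  -- so the table is exactly a gap-free sequence starting at 1.
  for i = 1, count do
    if t[i] == nil then
      return 'MAP_EXT'
    end
  end
  return 'LIST_EXT'
end

print(classify({}), classify({ 1, 2, 3 }), classify({ a = 1 }))
```

Note that a table with a gap, such as one with only keys 1 and 3, falls through to MAP_EXT under these rules, so its integer keys would be encoded as strings.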
etf allows creating various userdata to force a specific encoding:
| Userdata | Erlang Type |
|---|---|
| etf.integer | SMALL_INTEGER_EXT, INTEGER_EXT, SMALL_BIG_EXT, LARGE_BIG_EXT as appropriate |
| etf.float | NEW_FLOAT_EXT |
| etf.string | STRING_EXT |
| etf.binary | BINARY_EXT |
| etf.atom | SMALL_ATOM_UTF8_EXT or ATOM_UTF8_EXT |
| etf.tuple | TUPLE_EXT |
| etf.list | LIST_EXT |
| etf.map | MAP_EXT |
| etf.port | NEW_PORT_EXT or V4_PORT_EXT |
| etf.pid | NEW_PID_EXT |
| etf.export | EXPORT_EXT |
| etf.reference | NEWER_REFERENCE_EXT |

The etf.integer type will be encoded as the smallest possible integer type. So, an integer
in the range of an 8-bit unsigned integer will be encoded as a SMALL_INTEGER_EXT value,
an integer in the range of a 32-bit signed integer will be encoded as an INTEGER_EXT value,
and so on.
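A small sketch of that selection, for values that fit in a native Lua number. The ranges come from the External Term Format (SMALL_INTEGER_EXT holds one unsigned byte, INTEGER_EXT a signed 32-bit value); integer_tag is a hypothetical name, and LARGE_BIG_EXT is left out because it is only needed for magnitudes longer than 255 bytes, which a native number cannot reach.

```lua
-- Hypothetical helper: name the smallest integer encoding for a whole number n.
local function integer_tag(n)
  if n >= 0 and n <= 255 then
    return 'SMALL_INTEGER_EXT'  -- unsigned 8-bit
  elseif n >= -2147483648 and n <= 2147483647 then
    return 'INTEGER_EXT'        -- signed 32-bit
  end
  return 'SMALL_BIG_EXT'        -- sign byte plus up to 255 magnitude bytes
end

print(integer_tag(1), integer_tag(-1), integer_tag(2147483648))
```

One consequence worth noticing is that negative values never use SMALL_INTEGER_EXT, since that type is unsigned.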
Using these userdata with a custom value_map function allows precise control over
mapping. For example, if you want to use Atom types for all table keys, you could do:
local function value_map(val, is_key)
if is_key then
return etf.atom(val)
end
return val
end
local encoder = etf.encoder({value_map = value_map })
local binary = encoder:encode({ a = 1, b = 2 })
-- will return a MAP_EXT with atom keys and integer values

- decoder - function that returns a decoder userdata.
- decode - convenience function to decode without creating a decoder.
- encoder - function that returns an encoder userdata.
- encode - convenience function to encode without creating an encoder.
- atom - function that returns an atom userdata (requires a string).
- binary - function that returns a binary userdata (requires a string).
- string - function that returns a string userdata (requires a string).
- integer - function that returns an integer userdata (accepts a number, string, or none).
- float - function that returns a float userdata (accepts a number, string, or none).
- list - function that returns a list userdata (optionally accepts a table).
- map - function that returns a map userdata (optionally accepts a table).
- tuple - function that returns a tuple userdata (optionally accepts a table).
- export - function that returns an export userdata (requires a table matching EXPORT_EXT above).
- pid - function that returns a pid userdata (requires a table matching PID_EXT above).
- port - function that returns a port userdata (requires a table matching PORT_EXT above).
- reference - function that returns a reference userdata (requires a table matching REFERENCE_EXT above).

- maxinteger - an integer value representing the maximum integer that can be represented by Lua natively.
- mininteger - an integer value representing the minimum integer that can be represented by Lua natively.
- null - an atom that represents a nil atom.

- atom_mt - the atom userdata's metatable.
- integer_mt - the integer userdata's metatable.
- float_mt - the float userdata's metatable.
- binary_mt - the binary userdata's metatable.
- decoder_131_mt - the decoder userdata's metatable.
- encoder_131_mt - the encoder userdata's metatable.
- export_mt - the export userdata's metatable.
- fun_mt - the fun userdata's metatable.
- list_mt - the list userdata's metatable.
- map_mt - the map userdata's metatable.
- new_fun_mt - the new_fun userdata's metatable.
- pid_mt - the pid userdata's metatable.
- port_mt - the port userdata's metatable.
- reference_mt - the reference userdata's metatable.
- string_mt - the string userdata's metatable.
- tuple_mt - the tuple userdata's metatable.

- _VERSION - the module version as a string.
- _VERSION_MAJOR - the module's major version as a number.
- _VERSION_MINOR - the module's minor version as a number.
- _VERSION_PATCH - the module's patch version as a number.
- numsize - the size of a Lua number, in bytes.