MsgPack.jl
MsgPack.jl is a MessagePack implementation in pure Julia, inspired by JSON3.jl. This package supports:
- (de)serialization of Julia values to/from MessagePack (see
pack
andunpack
) - overloadable pre-(de)serialization transformations (see
from_msgpack
andto_msgpack
) - automatic type construction/destruction (see
msgpack_type
,construct
, andStructType
) - some basic immutable "views" over MsgPack-formatted byte buffers (see
ArrayView
andMapView
). - native
Serialization.serialize
support via MessagePack Extensions (seeExtension
,extserialize
, andextdeserialize
)
pack
/unpack
Use pack
to serialize Julia values to MessagePack bytes, and unpack
to deserialize MessagePack bytes to Julia values:
julia> bytes = pack(["hello", Dict(:this => 1, ['i', 's'] => 3.14, "messagepack!" => nothing)])
42-element Array{UInt8,1}:
0x92
0xa5
0x68
⋮
julia> unpack(bytes)
2-element Array{Any,1}:
"hello"
Dict{Any,Any}("messagepack!" => nothing,"this" => 0x01,Any["i", "s"] => 3.14)
pack
and unpack
also accept IO streams as arguments:
julia> io = IOBuffer();
julia> pack(io, "see it really does take an IO stream");
julia> unpack(seekstart(io))
"see it really does take an IO stream"
Translating between Julia and MessagePack types
By default, MsgPack defines (de)serialization between the following Julia and MessagePack types:
MessagePack Type | AbstractMsgPackType Subtype |
Julia Types |
---|---|---|
Integer | IntegerType |
UInt8 , UInt16 , UInt32 , UInt64 , Int8 , Int16 , Int32 , Int64 |
Nil | NilType |
Nothing , Missing |
Boolean | BooleanType |
Bool |
Float | FloatType |
Float32 , Float64 |
String | StringType |
AbstractString , Char , Symbol |
Array | ArrayType |
AbstractArray , AbstractSet , Tuple |
Map | MapType |
AbstractDict , NamedTuple |
Binary | BinaryType |
(no defaults) |
Extension | ExtensionType |
(no defaults) |
To support additional Julia types, we can define that type's "translation" to its corresponding AbstractMsgPackType
via the following methods:
julia> using MsgPack, UUIDs
# declare `UUID`'s correspondence to the MessagePack String type
julia> MsgPack.msgpack_type(::Type{UUID}) = MsgPack.StringType()
# convert UUIDs to a MessagePack String-compatible representation for serialization
julia> MsgPack.to_msgpack(::MsgPack.StringType, uuid::UUID) = string(uuid)
# convert values deserialized as MessagePack Strings to UUIDs
julia> MsgPack.from_msgpack(::Type{UUID}, uuid::AbstractString) = UUID(uuid)
julia> unpack(pack(uuid4()))
"df416048-e513-41c5-aa49-32623d5d7e1f"
julia> unpack(pack(uuid4()), UUID)
UUID("4812d96f-bc7b-434b-ac54-1985a1263882")
Note that each subtype of AbstractMsgPackType
makes its own assumptions about the return values of to_msgpack
and from_msgpack
; these assumptions are documented in the subtype's docstring. For additional details, see the docstrings for AbstractMsgPackType
, msgpack_type
, to_msgpack
, and from_msgpack
.
struct
(de)serialization
Automatic MsgPack provides an interface that facilitates automatic, performant (de)serialization of MessagePack Maps to/from Julia struct
s. Like JSON3.jl, MsgPack's interface supports two different possibilities: a slower approach that doesn't depend on field ordering during deserialization, and a faster approach that does:
julia> using MsgPack
julia> struct MyMessage
a::Int
b::String
c::Bool
end
julia> MsgPack.msgpack_type(::Type{MyMessage}) = MsgPack.StructType()
julia> messages = [MyMessage(rand(Int), join(rand('a':'z', 10)), rand(Bool)) for _ in 1:3]
3-element Array{MyMessage,1}:
MyMessage(4625239811981161650, "whosayfsvb", true)
MyMessage(4988660392033153177, "mazsmrsawu", false)
MyMessage(7955638288702558596, "gueytzhjvy", true)
julia> bytes = pack(messages);
# slower, but does not assume struct field ordering
julia> unpack(bytes, Vector{MyMessage})
3-element Array{MyMessage,1}:
MyMessage(4625239811981161650, "whosayfsvb", true)
MyMessage(4988660392033153177, "mazsmrsawu", false)
MyMessage(7955638288702558596, "gueytzhjvy", true)
# faster, but assumes incoming struct fields are ordered
julia> unpack(bytes, Vector{MyMessage}; strict=(MyMessage,))
3-element Array{MyMessage,1}:
MyMessage(4625239811981161650, "whosayfsvb", true)
MyMessage(4988660392033153177, "mazsmrsawu", false)
MyMessage(7955638288702558596, "gueytzhjvy", true)
Do not use strict=(T,)
unless you can ensure that all MessagePack Maps corresponding to T
maintain the exact key-value pairs corresponding to T
's fields in the exact same order as specified by T
's Julia definition. This property generally cannot be assumed unless you, yourself, were the original serializer of the message.
For additional details, see the docstrings for StructType
, unpack
, and construct
.
Immutable, lazy Julia views over MessagePack bytes
Often, one will want to delay full deserialization of a MessagePack collection, and instead only deserialize elements upon access. To facilitate this approach, MsgPack provides the ArrayView
and MapView
types. Reusing the toy MyMessage
from the earlier example:
julia> using BenchmarkTools
julia> bytes = pack([MyMessage(rand(Int), join(rand('a':'z', 10)), rand(Bool)) for _ in 1:10_000_000]);
# deserialize the whole thing in one go
julia> @time x = unpack(bytes, Vector{MyMessage});
3.547294 seconds (20.00 M allocations: 686.646 MiB, 13.42% gc time)
# scan bytes to tag object positions, but don't fully deserialize
julia> @time v = unpack(bytes, MsgPack.ArrayView{MyMessage});
0.462374 seconds (14 allocations: 76.295 MiB)
# has normal `Vector` access performance, since it's a normal `Vector`
julia> @btime $x[1]
1.824 ns (0 allocations: 0 bytes)
MyMessage(-5988715016767300083, "anrcvpbqge", true)
# access time is much slower, since element is deserialized upon access
julia> @btime $v[1]
274.990 ns (4 allocations: 176 bytes)
MyMessage(-5988715016767300083, "anrcvpbqge", true)
For additional details, see the docstrings for ArrayView
and MapView
.
Should I use JSON or MessagePack?
Use JSON by default (with the lovely JSON3 package!), and only switch to MessagePack if you actually measure a significant performance benefit from doing so. In my experience, the main potential advantage of MessagePack is improved (de)serialization performance for certain kinds of structures. If you merely seek to reduce message size, MessagePack has little advantage over JSON, as general-purpose compression seems to achieve similar sizes when applied to either format.