msgpack/msgpack-python

Support for set Serialization

Closed this issue · 4 comments

🚀 Feature Request: Add Support for Python set Serialization

🧩 Summary

Currently, msgpack-python raises a TypeError when attempting to serialize Python set objects. While this behavior is consistent with the MessagePack spec (which does not define a native set type), it would be very useful for Python developers to have optional support for set serialization - either via a built-in fallback or a configurable default function.


🤔 Why Is This Useful?

In Python, set is a commonly used data structure for enforcing uniqueness, fast lookups, and expressing unordered collections.
Manually converting sets to lists (and back) can be repetitive, error-prone, and easy to overlook.

Enabling graceful handling of set types would improve:

  • Developer experience
  • Code cleanliness
  • Compatibility in Python-native applications using msgpack

🔧 Suggested Implementation

✅ Option 1: Allow set via default=... Handler

Let users serialize sets by converting them to lists:

import msgpack

def default(obj):
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f"Unsupported type: {type(obj)}")

data = {"tags": {"python", "msgpack"}}
packed = msgpack.packb(data, default=default, use_bin_type=True)

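The problem with that is the ambiguity when unpacking: the set comes back as a plain list, and nothing in the payload says it was ever a set. See also lists vs. tuples and the related unpacking options (such as use_list).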
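To make the ambiguity concrete, here is a minimal sketch that packs a set through the default handler from Option 1 and unpacks it twice. It assumes msgpack-python is installed; raw=False is passed explicitly so that map keys come back as str on older releases too.

```python
import msgpack

def default(obj):
    # Same fallback as in Option 1: flatten a set into a plain list.
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f"Unsupported type: {type(obj)}")

packed = msgpack.packb({"tags": {"python", "msgpack"}}, default=default, use_bin_type=True)

# The payload only contains a MessagePack array, so the set comes back as a list.
as_list = msgpack.unpackb(packed, raw=False)

# use_list=False turns every array into a tuple, genuine lists included.
as_tuple = msgpack.unpackb(packed, raw=False, use_list=False)

print(type(as_list["tags"]).__name__)   # list
print(type(as_tuple["tags"]).__name__)  # tuple
```

Neither unpacking option can recover the set, because the type information was dropped at pack time.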

Thanks!
I understand your point about ambiguity during unpacking, especially since set, like tuple, doesn't have a native representation in MessagePack.

So I assume Python users can handle set serialization with explicit type tagging, something like this:

def default(obj):
    if isinstance(obj, set):
        return {"__type__": "set", "items": list(obj)}
    raise TypeError("Unsupported type")

def object_hook(obj):
    if isinstance(obj, dict) and obj.get("__type__") == "set":
        return set(obj["items"])
    return obj

This way, users who need to preserve sets can do it cleanly and explicitly, without affecting the core behavior.
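For completeness, a round trip with these two hooks might look like the sketch below. The "__type__" and "items" keys are just the convention used in this comment, not something msgpack defines, and the sample data is made up for illustration.

```python
import msgpack

def default(obj):
    # Wrap a set in a tagged map so the type survives packing.
    if isinstance(obj, set):
        return {"__type__": "set", "items": list(obj)}
    raise TypeError(f"Unsupported type: {type(obj)!r}")

def object_hook(obj):
    # Called for every unpacked map; rebuild sets from tagged maps.
    if obj.get("__type__") == "set":
        return set(obj["items"])
    return obj

data = {"tags": {"python", "msgpack"}, "ids": [1, 2, 3]}

packed = msgpack.packb(data, default=default, use_bin_type=True)
restored = msgpack.unpackb(packed, object_hook=object_hook, raw=False)

print(restored == data)                   # True
print(type(restored["tags"]).__name__)    # set
print(type(restored["ids"]).__name__)     # list
```

One limitation to keep in mind: set elements that unpack as unhashable values (for example tuples, which come back as lists by default) will make set(...) fail in object_hook, so this sketch only covers sets of simple hashable items such as strings and numbers.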

Good idea - that would also be nice for automatic handling of lists vs. tuples.
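A possible sketch of the same tagging idea extended to tuples follows. By default the packer writes tuples as arrays without ever calling default, so this relies on packb's strict_types=True option, which forwards tuples (and subclasses of the built-in types) to the default handler. The "__type__" tag names are again only an illustrative convention.

```python
import msgpack

def default(obj):
    # With strict_types=True, tuples reach this handler instead of being
    # packed as plain arrays.
    if isinstance(obj, tuple):
        return {"__type__": "tuple", "items": list(obj)}
    if isinstance(obj, set):
        return {"__type__": "set", "items": list(obj)}
    raise TypeError(f"Unsupported type: {type(obj)!r}")

def object_hook(obj):
    # Rebuild the original container from its tag, if there is one.
    tag = obj.get("__type__")
    if tag == "tuple":
        return tuple(obj["items"])
    if tag == "set":
        return set(obj["items"])
    return obj

data = {"point": (1, 2), "tags": {"a", "b"}, "plain": [1, 2]}

packed = msgpack.packb(data, default=default, strict_types=True, use_bin_type=True)
restored = msgpack.unpackb(packed, object_hook=object_hook, raw=False)

print(type(restored["point"]).__name__)  # tuple
print(type(restored["plain"]).__name__)  # list
```

Note that strict_types=True also stops subclasses such as OrderedDict from being packed natively, so a real handler would need to cover those as well.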

But I guess that, for best compatibility, this is something the msgpack user needs to do rather than something that needs to change in msgpack?

Yeah, it could optionally go in the README 'Notes' section, but that's not strictly necessary.
Anyone looking for set support or running into similar behavior will likely come across this issue anyway.