jcrist/msgspec

Nested Structs how to.

goodboy opened this issue · 8 comments

Sorry if I missed this in the docs (or tests which I tried to go through as well) but is it possible to create and decode a Struct with child structs that can be recursively decoded?

It seems the encoding works just fine but decoding will only work by decoding to a dict (the default):

[ins] In [2]: import msgspec

[nav] In [3]: class c(msgspec.Struct):
         ...:     s: msgspec.Struct

[nav] In [4]: class b(msgspec.Struct):
         ...:     val: int

[ins] In [5]: msgspec.encode(c(b(10)))
Out[5]: b'\x81\xa1s\x81\xa3val\n'

[ins] In [6]: msg = msgspec.encode(c(b(10)))

[ins] In [7]: msgspec.decode(msg)
Out[7]: {'s': {'val': 10}}

[ins] In [8]: msgspec.decode(msg, type=c)
Out[8]: c(s=Struct())

[ins] In [10]: msgspec.decode(msg, type=msgspec.Struct)
Out[10]: Struct()

If there is no plan to support nested structs natively, might there be a recommended recipe for this kind of thing?

For my purposes it would be lovely to have an IPC message format where payloads in certain types of messages could also be structs (whose schema could potentially be introspected from the responding function's annotations).

Cheers and thanks for the super sweet lib!

Also, just as a follow-up: looking forward to Python 3.10's new structural pattern matching, what would be super ideal is support for stuff like:

from importlib import import_module
import msgspec


class Target(msgspec.Struct):
    module: str
    type: str
    func: str
    kwargs: dict


class Msg(msgspec.Struct):
    cmd: str
    ipc_id: str
    payload: msgspec.Struct


# recv_stream, _invoke, _invoke_sync, lookup_mem_chan and cancel_task are
# application-side helpers that are not shown here; the loop is assumed to
# run inside an async function.
async for msg in recv_stream:
    match msg:
        case Msg(cmd='cmd', payload=Target(type='async_func', module=m, func=f)):
            mod = import_module(m)
            await _invoke(getattr(mod, f), msg.payload.kwargs)

        case Msg(cmd='cmd', payload=Target(type='sync_func', module=m, func=f)):
            mod = import_module(m)
            _invoke_sync(getattr(mod, f), msg.payload.kwargs)

        case Msg(cmd='yield'):
            lookup_mem_chan(msg.ipc_id).put_nowait(msg.payload)

        case Msg(cmd='error'):
            await cancel_task(msg.ipc_id)

Support for Struct decomposition, and going further for constants like Enum, would really up Python's ability to create extremely terse event loops and state machines 🏄🏼

Apologies for missing this. Nested structs are fully supported, but the types must be fully known on decode. We don't support deserializing as subclasses of a type (or into a Union or "oneof" message). For example, the following works:

import msgspec
from msgspec import Struct

class Item(Struct):
    name: str
    count: int

class Cart(Struct):
    items: list[Item]

# both serialization and deserialization work fine
data = msgspec.encode(Cart([Item("banana", 2)])) 
msg = msgspec.Decoder(Cart).decode(data)

But this doesn't:

import msgspec
from msgspec import Struct

class Item(Struct):
    pass

class Apple(Item):
    count: int

class Banana(Item):
    count: int

class Cart(Struct):
    items: list[Item]

# serialization would work fine
data = msgspec.encode(Cart([Banana(2)])) 

# this would error, since there's no way to know from the serialized msg what `Item` subtype to use
msgspec.Decoder(Cart).decode(data)

Also just as a follow up, what would be super ideal looking forward to python 3.10's new structural pattern matching is supporting stuff like

From reading the PEP, I believe pattern matching will work out of the box for keyword arguments in structs, but not for positional arguments (a small patch would fix this; I'll open an issue). Serializing enums already works fine.

@jcrist sweet thanks for the explanation!

Nested structs are fully supported, but the types must be fully known on decode.

This was the part I was missing. Of course, if you're going to decode a struct you need to tell the decoder how, through a type definition 🤦🏼.

For my purposes it does seem to work well 🏄🏼:

[nav] In [14]: class Data(Struct):
          ...:     key1: int
          ...:     key2: float
          ...:     key3: str
          ...:

[nav] In [15]: class Msg(Struct):
          ...:     msg_type: str
          ...:     value: Data
          ...:     cid: str

[nav] In [16]: msg = Msg('cmd', Data(10, 10.0, '10'), 'blah')

[nav] In [17]: data = msgspec.encode(msg)

[ins] In [18]: data
Out[18]: b'\x83\xa8msg_type\xa3cmd\xa5value\x83\xa4key1\n\xa4key2\xcb@$\x00\x00\x00\x00\x00\x00\xa4key3\xa210\xa3cid\xa4blah'

[ins] In [19]: msgspec.Decoder(Msg).decode(data)
Out[19]: Msg(msg_type='cmd', value=Data(key1=10, key2=10.0, key3='10'), cid='blah')

From reading the PEP, I believe pattern matching will work out of the box for keyword arguments

So slick!
Really stoked to get to try this out 😎

I think I can close this now since it was just my misunderstanding of the decoding requirements.
Would you take a small PR to update the docs with an example such as this?

PS: for those interested in the 3.10 pattern matching support it's #28

Would you take a small PR to update the docs with an example such as this?

Sure thing. Would you want to submit a PR? Otherwise I'll take care of it.

Would you want to submit a PR?

Yeah, more than happy to, since I already have a real-world use case: IPC messages which can describe their "value" as a Struct.

I've updated the docs with more examples, I believe this can be closed.

😿 It was on my list forever but I never got to it.

Great to see the docs updated and thanks again for the great support on this project 🏄🏼