Could we chain muon on-the-wire data?
vshymanskyy opened this issue · 16 comments
Could we plainly chain (concatenate as `cat` does) muon on-the-wire data and feed them into a muon reader/parser?
(In the past I had fun thinking about a somewhat similar idea, but without a very convincing result.)
Originally posted by @dumblob in #5 (comment)
@dumblob Yes, it should be easy. I envision it like this:
For communication protocols, a streaming parser should be used, and the stream should begin with `0x90` (start list). Then a bunch of objects can simply be concatenated. `0xFF` (padding) can be used for keep-alive. `0x91` (list end) can signal the end of the stream, after which the connection should be cleanly terminated.
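The sender side of the stream layout described above can be sketched roughly as follows. This is only an illustration: `stream_chunks` and its `events` argument are hypothetical names, and each message is assumed to be an already-encoded Muon object supplied as `bytes`, with `None` standing for an idle tick on which a keep-alive byte is sent.

```python
LIST_START = b"\x90"  # start list: opens the stream
LIST_END = b"\x91"    # list end: signals end of stream
PADDING = b"\xff"     # padding: used here as keep-alive


def stream_chunks(events):
    """Yield the byte chunks to write to the connection.

    `events` is an iterable whose items are either an encoded Muon
    object (bytes) or None, meaning "nothing to send right now".
    """
    yield LIST_START
    for event in events:
        # An idle tick becomes a single padding byte; anything else
        # is written as-is, so objects end up simply concatenated.
        yield PADDING if event is None else event
    yield LIST_END


if __name__ == "__main__":
    empty_dict = b"\x92\x93"
    wire = b"".join(stream_chunks([empty_dict, None, empty_dict]))
    print(wire.hex(" "))
```

After `LIST_END` has been written, the sender would close the connection.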
For file storage, I'm not sure what the expected semantics of such a structure are.
Overall, if multiple Muons are concatenated, you should just call the parser in a loop until the end of the data is reached.
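A minimal sketch of such a loop, under loud assumptions: `parse_one_toy` is a hypothetical stand-in for a real Muon parser and understands only lists (`0x90` ... `0x91`), empty dicts (`0x92 0x93`) and padding (`0xFF`). The `(data, pos) -> (value, new_pos)` interface is likewise an assumption, not the API of any existing Muon library.

```python
LIST_START, LIST_END = 0x90, 0x91
PADDING = 0xFF
EMPTY_DICT = b"\x92\x93"


def parse_one_toy(data, pos):
    """Toy parser: returns (value, position just past the value)."""
    if data[pos:pos + 2] == EMPTY_DICT:
        return {}, pos + 2
    if data[pos] == LIST_START:
        items = []
        pos += 1
        while pos < len(data):
            if data[pos] == LIST_END:
                return items, pos + 1
            if data[pos] == PADDING:
                pos += 1  # padding between items is skipped
                continue
            item, pos = parse_one_toy(data, pos)
            items.append(item)
        raise ValueError("unterminated list")
    raise ValueError(f"unsupported tag 0x{data[pos]:02x} at offset {pos}")


def parse_all(data, parse_one=parse_one_toy):
    """Call the parser in a loop until the end of data is reached."""
    pos = 0
    while pos < len(data):
        if data[pos] == PADDING:
            pos += 1  # padding between root objects is skipped
            continue
        value, pos = parse_one(data, pos)
        yield value


if __name__ == "__main__":
    # Two empty dicts in a row, without an enclosing object.
    print(list(parse_all(bytes([0x92, 0x93, 0x92, 0x93]))))
```

A real parser would slot in through the `parse_one` argument; the loop itself does not depend on the toy.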
Ok, sounds promising.
The main thing is, it has to be specified as non-optional (i.e. mandated) behavior that every parser has to support chaining. Otherwise chaining would not work.
Could you add it to the specification (and the example implementation)? Then we can close this.
Thanks a lot!
Mandating this will effectively turn every Muon document into a list.
Correct. Not mandating this though means the exact opposite - zero support for chaining. I admit this is a tough decision.
OTOH having each document as a list does not bring any disadvantages, does it?
Maybe I'm dumb, but I don't really get why mandatory support is necessary. Doesn't the way muon is designed mean that you can always just prepend a `0x90` byte before you `cat` your muon files together, and finish it with a `0x91`, after which everything already behaves as required, no matter which parser we're talking about?
Yes, that's my reasoning as well.
The issue with this request is that:
- All muons become Lists, which is undesirable
- Semantics of concatenated Muon are unclear in most contexts (except communication protocols, where each muon represents a message)
- There's no way to understand if multiple Muons are stored, when you start reading the file.
Tangentially: this does make me wonder how a muon parser should respond if I ask it to parse `92 93 92 93` (that is: two empty dictionaries in a row without an enclosing object). Should it throw an error, or just return the first empty dictionary?
Currently, the parser should stop parsing at the end of the first dict. You can call your parser in a loop to fetch all the data.
I'd throw an error if any data is found after the first object in a file.
Semantics of the second dict is undefined:
- Should this be treated as a list? Probably not.
- Should the second dict patch (overlay) the first dict? Probably not.
- What if we have dict and list, or dict and string?
Let's do it this way: the chaining possibility will be described in the specification docs, in line with my previous comments like #17 (comment)
I think this is a misunderstanding. I meant exactly what you documented, except that the word "recommended" needs to be swapped for "necessary"/"mandatory". That is it.
Could we agree on changing the word "recommended"? Because that is exactly what I pointed out above: any parser which wants to support chaining (not every parser has to, thanks to the wire format definition) is obliged to follow this "extended chaining format" (which also includes the notion of time, i.e. keep-alive etc.).
Thoughts?
@dumblob added the following lines:
Whenever possible, tools and libraries should provide ways of working with concatenated objects. If for any reason it makes no sense in a specific application context, any data (except padding tag `0xFF`) that follows the first root object should be treated as an error.
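For the strict case described in these lines, a hedged sketch: `parse_single_root` is a hypothetical helper, and `parse_one` is assumed to be any callable with the interface `(data, pos) -> (value, new_pos)`, which is an illustrative convention rather than the API of an existing Muon library.

```python
PADDING = 0xFF


def parse_single_root(data, parse_one):
    """Parse exactly one root object and reject anything after it.

    Trailing padding bytes (0xFF) are tolerated; any other trailing
    byte is reported as an error.
    """
    value, pos = parse_one(data, 0)
    for offset in range(pos, len(data)):
        if data[offset] != PADDING:
            raise ValueError(
                f"unexpected byte 0x{data[offset]:02x} at offset {offset} "
                "after the first root object"
            )
    return value
```

A tool that does support concatenated objects would instead keep calling `parse_one` from `pos` onwards rather than raising.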
Yeah, that clarifies the "not every parser has to support chaining" bit. The chaining definition itself though still contains:
For communication protocols, the following encoding is recommended:
which I am certain needs to be changed to this:
For communication protocols, the following encoding is required:
Shall I make a PR with the change of "recommended" -> "required"?