mransan/ocaml-protoc

problem with repeated fields

Opened this issue · 3 comments

I have a protobuf definition:

message Batch {
                ...
		repeated uint64 sequence = 2 [packed=true];
                ...
	}

which gets compiled into

type batch_mutable = {
  mutable sequence : int64 list;
}

We can fill it with an empty list, but we cannot create a batch message where the sequence field is absent. I think this should be possible

   repeated: this field can be repeated any number of times (including zero) 
   in a well-formed message. The order of the repeated values will be preserved.

Now, for us, the server side (which is an embedded device with a c++ server, and out or our control), is lenient enough to parse it, and treat both options the same, but other users might not be so lucky.

when you say cannot create a batch message where the sequence field is absent you mean at the binary level? In other word the empty list does make it to the binary somehow ?

indeed: at binary level, it's always encoded as a list of things, the smallest possible encoding is an empty list here. In essence, it might be more prudent to compile these to something like:

type batch_mutable = {
   mutable sequence : int64 list option;
}

or to at least allow the possibility to have something like this. (a compile time flag to select flavour or something like that). You can argue that protocol buffers allow silly things (and indeed, from an OCaml perspective, that seems to be true, but that's another discussion entirely)

(I think it was designed for languages like Python

   class BatchMutable:
       def __init__(self, sequence = None):
            self.sequence = sequence

)

ok so looking at this again

On the ocaml-protoc generated, If the list is empty then it could eithe be ommited in the binary or encoded as empty list. Both are valid binary representation. proto3 makes this a lot more explicit since all fields are optional and have defined defaults.

Furthermore when decoding ocaml-protoc will also support having a empty list for both binary representation.

This only thing not supported (but I would argue like proto3 that it is not needed) is the fact that we don't capture whether or not the field was present in the binary. (diffrentiate (Some[] and None) in your message.