brown/protobuf

I suspect a possible bug

eschulte opened this issue · 13 comments

It appears that parsing is getting thrown off by the attached hello.gtirb.zip protobuf file (zipped for upload).

Specifically, using:

The CL parser parses the top-level IR, then down into the containing Module, into the Section, and the last message parsed is the SectionFlag. After that point it is thrown off: it signals an error in skip-field when it should have finished that Section and moved on to the next one (the SectionFlag was the last message in that Section). The same file parses successfully with the C++ and Python protobuf implementations.

This should be reproducible by:

  1. cloning the above gtirb project into your quicklisp local-projects
  2. (ql:quickload :gtirb)
  3. (gtirb:read-gtirb "/path/to/hello.gtirb")

Any advice?

FWIW, the following change (delimited with FIX HACK FIX) allows the attached file to parse successfully. I don't know why :)

(cl:defmethod pb:merge-from-array ((self section) buffer start limit)
  (cl:declare (cl:type com.google.base:octet-vector buffer)
              (cl:type com.google.base:vector-index start limit))
  (cl:do ((index start index))
      ((cl:>= index limit) index)
    (cl:declare (cl:type com.google.base:vector-index index))
    (cl:multiple-value-bind (tag new-index)
        (varint:parse-uint32-carefully buffer index limit)
      (cl:setf index new-index)
      (cl:case tag
        ;; bytes uuid = 1[json_name = "uuid"];
        ((10)
          (cl:multiple-value-bind (value new-index)
              (wire-format:read-octets-carefully buffer index limit)
            (cl:setf (cl:slot-value self 'uuid) value)
            (cl:setf (cl:ldb (cl:byte 1 0) (cl:slot-value self '%has-bits%)) 1)
            (cl:setf index new-index)))
        ;; string name = 2[json_name = "name"];
        ((18)
          (cl:multiple-value-bind (value new-index)
              (wire-format:read-octets-carefully buffer index limit)
            (cl:setf (cl:slot-value self 'name) (pb:string-field value))
            (cl:setf (cl:ldb (cl:byte 1 1) (cl:slot-value self '%has-bits%)) 1)
            (cl:setf index new-index)))
        ;; repeated .proto.ByteInterval byte_intervals = 5[json_name = "byteIntervals"];
        ((42)
          (cl:multiple-value-bind (length new-index)
              (varint:parse-uint31-carefully buffer index limit)
            (cl:when (cl:> (cl:+ new-index length) limit)
              (cl:error "buffer overflow"))
            (cl:let ((message (cl:make-instance 'proto::byte-interval)))
              (cl:setf index (pb:merge-from-array message buffer new-index (cl:+ new-index length)))
              (cl:when (cl:not (cl:= index (cl:+ new-index length)))
                (cl:error "buffer overflow"))
              (cl:vector-push-extend message (cl:slot-value self 'byte-intervals)))))
        ;; repeated .proto.SectionFlag section_flags = 6[json_name = "sectionFlags"];
        ((50)
          (cl:multiple-value-bind (value new-index)
              (varint:parse-int32-carefully buffer index limit)
            ;; XXXXX: when valid, set field, else add to unknown fields
            ;; FIX HACK FIX
            (cl:case value
              (3 (cl:incf new-index 3))
              (4 (cl:incf new-index 4)))
            ;; FIX HACK FIX
            (cl:vector-push-extend value (cl:slot-value self 'section-flags))
            (cl:setf index new-index)))
        (cl:t
          (cl:when (cl:= (cl:logand tag 7) 4)
            (cl:return-from pb:merge-from-array index))
          (cl:setf index (wire-format:skip-field buffer index limit tag)))))))
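For context on the literal tags the generated method dispatches on: in the protobuf wire format, each field key is a varint equal to (field_number << 3) | wire_type. A minimal sketch (a hypothetical helper, not part of the generated code) showing how the tags above decode:

```python
def decode_key(key: int) -> tuple[int, int]:
    """Split a protobuf field key into (field_number, wire_type)."""
    return key >> 3, key & 7

# Tags dispatched on in the generated Section reader:
assert decode_key(10) == (1, 2)   # uuid: field 1, length-delimited
assert decode_key(18) == (2, 2)   # name: field 2, length-delimited
assert decode_key(42) == (5, 2)   # byte_intervals: field 5, length-delimited
assert decode_key(50) == (6, 2)   # section_flags: field 6, length-delimited
```

Note that tag 50 carries wire type 2 (length-delimited), yet the generated branch for it reads a single varint with parse-int32-carefully.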
brown commented

The problem looks like lack of support for unknown fields. The proto file says that tags 3 and 4 are reserved. Most likely the encoded data you're trying to read contains those fields. The generated protobuf code is supposed to either skip over unknown fields or (much better) retain them. Unfortunately, my code implements neither of those. A workaround for now is to define fields with the unknown tag numbers. Look back in the git history to see what types they were before they were removed, then add back deprecated fields with the same tag numbers and types:

int32 DEPRECATED_FOO = 3;
string DEPRECATED_BAR = 4;

Ah, thanks for the explanation and suggested temporary workaround.
(In this case I'm actually happy to know these fields are still being generated, because that should change upstream.)

I'll leave this issue open to track the lack of support for unknown fields.

Oh, actually the field numbers 3 and 4 in this case are not the deprecated IDs 3 and 4 on the Section message, but the valid IDs 3 and 4 on the SectionFlag message. Given that, I'm not sure what the problem is here.

Yeah, there is definitely something wrong with (at least) repeated enums. See this minimal attached example (run (ql:quickload 'pbtest) and (pbtest:run) in a REPL, then run make at the shell).

pbtest.zip

Oh, after looking into this a little more, the problem is that proto3 switched to a packed encoding for repeated enums by default, and it looks like that is not currently supported.
https://developers.google.com/protocol-buffers/docs/encoding#optional
https://developers.google.com/protocol-buffers/docs/encoding#packed
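A sketch of the two encodings for a repeated varint field (hypothetical helper names; the flag values 3 and 4 are chosen to mirror the SectionFlag values the hack special-cased). In the unpacked form each element carries its own key with wire type 0; in the proto3-default packed form there is one key with wire type 2, a byte length, then the raw values:

```python
def encode_varint(n: int) -> bytes:
    """Base-128 varint encoding used throughout the protobuf wire format."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

values = [3, 4]          # e.g. two SectionFlag enum values
field_number = 6         # section_flags

# proto2-style unpacked: one key/value pair per element (wire type 0)
unpacked = b"".join(encode_varint((field_number << 3) | 0) + encode_varint(v)
                    for v in values)
assert unpacked == bytes([0x30, 0x03, 0x30, 0x04])

# proto3 default packed: one key (wire type 2), a byte length, then the values
payload = b"".join(encode_varint(v) for v in values)
packed = (encode_varint((field_number << 3) | 2)
          + encode_varint(len(payload)) + payload)
assert packed == bytes([0x32, 0x02, 0x03, 0x04])
```

This would also explain why the FIX HACK FIX above resynchronized the parser: the varint the generated code read at tag 50 is actually the packed payload's byte length, so skipping that many bytes lands the reader on the next field key.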

This is a fairly serious bug which is fixed by my pull request. Do you anticipate having time to review this in the next couple of weeks?

brown commented

As you have discovered, the problem has to do with the handling of packed repeated fields.

Unfortunately, your pull request is not a complete fix. For repeated fields of varint type, the protobuf reading code needs to be able to handle both packed and unpacked encodings. Currently, the generated protobuf message reading code dispatches on a combination of field tag and encoding method. Instead, it should dispatch only on field tag, and the code for a packable field should then look at the encoding method and handle both the packed and unpacked case. I have some code on a branch that's closer to what's required ... I'll try to commit it soonish.
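The dispatch structure described here can be sketched in Python (a hedged illustration with hypothetical names, not the actual Lisp generator output): the reader loop dispatches on the field number alone, and the handler for a packable varint field then branches on the wire type to accept both encodings:

```python
def read_varint(buf: bytes, i: int) -> tuple[int, int]:
    """Parse a base-128 varint at buf[i]; return (value, next_index)."""
    result = shift = 0
    while True:
        b = buf[i]
        i += 1
        result |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            return result, i

def read_packable_varint_field(buf: bytes, i: int,
                               wire_type: int, out: list) -> int:
    """Handle a packable varint field in either encoding."""
    if wire_type == 2:                 # packed: length-delimited run of varints
        length, i = read_varint(buf, i)
        end = i + length
        while i < end:
            value, i = read_varint(buf, i)
            out.append(value)
    else:                              # unpacked: a single varint value
        value, i = read_varint(buf, i)
        out.append(value)
    return i

def parse_section_flags(buf: bytes) -> list:
    """Toy reader loop: dispatch on field number, not on the full key."""
    flags, i = [], 0
    while i < len(buf):
        key, i = read_varint(buf, i)
        field_number, wire_type = key >> 3, key & 7
        if field_number == 6:          # section_flags
            i = read_packable_varint_field(buf, i, wire_type, flags)
        else:
            raise ValueError("unknown field")  # real code would skip/retain it
    return flags
```

With this shape, both parse_section_flags(bytes([0x32, 0x02, 0x03, 0x04])) (packed) and parse_section_flags(bytes([0x30, 0x03, 0x30, 0x04])) (unpacked) yield [3, 4].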

brown commented

I have implemented packed repeated fields. I have not tested your specific problem, but it's likely to be fixed. Please give it a try and let me know.

Confirmed, this is now fixed with your changes on the master branch. Thanks!

brown commented

That's great news. I will update this week--thanks for the note.