LoadMessageDescriptorForMessage bug
Shukla-Ankur opened this issue · 6 comments
I have encounetered following bug while using Confluent-kafka-go v1.9.2 which uses this package.
jhump/protoreflect v1.12.0
Confluent-kafka-go v1.9.2
I am running a grpc server in go language. The APIs are defined using protobuf.
Clients send data in a generic API where input object is defined as:
message Document {
DocumentMeta meta = 1;
bytes data = 2;
google.protobuf.Any details = 3;
}
The details field uses Any type so that clients can send any type of message using this API. On server-side, the .proto files are used to register Message types
protoFile, err := ioutil.ReadFile(tmp_file)
pb_set := new(descriptorpb.FileDescriptorSet)
if err := proto.Unmarshal(protoFile, pb_set); err != nil {
return err
}
pb := pb_set.GetFile()[0]
fd, err := protodesc.NewFile(pb, protoregistry.GlobalFiles)
msgType = dynamicpb.NewMessageType(fd.Messages().Get(0))
protoregistry.GlobalTypes.RegisterMessage(msgType)
The types get registered successfully and I can check them.
desc.LoadMessageDescriptorForMessage(msg)
step fails with error
error interface conversion: *dynamicpb.Message is not desc.protoMessage: missing method Descriptor
msg of type protoV1.MessageV1 is derived from a protoreflect.ProtoMessage.
How to reproduce
Run the above steps, it can be deterministically reproduced.
-
Create an API with Any type field
-
Create a proto file on client-side,
-
server will use the proto file from client to register message descriptors
-
Client will use proto file to send data via above API
-
On server-side, Unmarshal the data from Any field like anypb.UnmarshalNew(doc.GetDetails(), proto.UnmarshalOptions{})
-
run desc.LoadMessageDescriptorForMessage(msg)
@Shukla-Ankur, FYI, there is a big "Note" on the main README for this repo that this was implemented against the "v1" of the protobuf runtime (in "github.com/golang/protobuf" import paths).
The dynamicpb
package is in the "v1" of the runtime (in "google.golang.org/protobuf" import paths). The two are not entirely compatible.
You might try using proto.MessageV1
(in the "v1" APIs), but I don't know if that will work. (Even if the function returns something, vs. blowing up at the sight of a dynamic message, it still may not work with this repo.)
The desc.LoadMessageDescriptorForMessage
is expecting a generated message struct, not a dynamic message (though it will work correctly if you instead using the dynamic message implementation in this repo, in "github.com/jhump/protoreflect/dynamic").
To rectify the issue, I tried the following changed approach:
- Use msgregistry.MessageRegistry to register descriptors since this uses V1 Message type.
ty := fds[i].GetMessageTypes()[0]
url := msgR.ComputeURL(ty)
msgR.AddMessage(url, ty)
- Unmarshal using above registry. This should give MessageV1 types
msgR.UnmarshalAny(doc.GetDetails())
This resolves the above error.
- However, if I convert this V1 message to V2 type, and pull out parent descriptor using
protoMsg.ProtoReflect().Descriptor().Parent()
parent is nil. During registration, the parent of the message was NOT nil.
During registration, I captured the following details.
url : type.googleapis.com/com.test.abcd.pb.cat.Cat
,
MessageDescriptor: name:"Cat" field:{name:"name" number:1 label:LABEL_OPTIONAL type:TYPE_STRING json_name:"name"} field:{name:"id" number:2 label:LABEL_OPTIONAL type:TYPE_INT64 json_name:"id"} field:{name:"breed" number:3 label:LABEL_OPTIONAL type:TYPE_STRING json_name:"breed"}
parent: name:"/Users/testuser/Go/src/github.com/test/EventBus/protos/events/EventV1.proto" package:"com.test.abcd.pb.cat" message_type:{name:"Cat" field:{name:"name" number:1 label:LABEL_OPTIONAL type:TYPE_STRING json_name:"name"} field:{name:"id" number:2 label:LABEL_OPTIONAL type:TYPE_INT64 json_name:"id"} field:{name:"breed" number:3 label:LABEL_OPTIONAL type:TYPE_STRING json_name:"breed"}} options:{go_package:"github.com/abcd/abcd/pb/cat"} syntax:"proto3"
I have 2 ways to unmarshal Any- using protoregistry or msgregistry.MessageRegistry. 1st one uses V2 types and 2nd uses V1 types. jhump/protoreflect likes to use V1 type, so msgregistry should be preferred.
- With so many types floating around, it has become really confusing. I am not sure what should be done differently in message registration or type conversion (v1 ->v2) to correct this.
- Regarding your suggestion on using github.com/jhump/protoreflect/dynamic, I believe I need to be able to use this during message descriptor registration itself. Can you please point to some example usages? As I mentioned above, during unmarshal the parent FD is nil with the approach I am taking
Regarding your suggestion on using github.com/jhump/protoreflect/dynamic
@Shukla-Ankur, the msgregistry
package should automatically do this (use the dynamic message implementation in this repo) if the given type URL is not a compile-time known message type.
However, if I convert this V1 message to V2 type, and pull out parent descriptor ...
I have no idea about this. The actual descriptor you get back from desc.LoadMessageDescriptorForMessage
for the resulting message certainly does have a file descriptor parent. I'm not sure if this is a bug with the MessageV2
function or perhaps you are assuming the conversion will try to retain more than it does. Why do you need the file?
Maybe we can back up a little: why do you need to use LoadMessageDescriptorForMessage
at all? If you are using the v2 APIs, just use their descriptor functionality as well. This repo was designed before the v2 API existed and is meant to be used with the v2 API. Is there something else in this repo you are trying to use that requires the use of desc.Descriptor
instead of protoreflect.Descriptor
?
While there is a plan to migrate this repo to a v2, which will directly support and interop with the new APIs, that v2 will not include the descriptor and dynamic message functionality (since that is now handled by the protobuf runtime API).
Can you help me understand how can I convert desc.MessageDescriptor to protoreflect.Descriptor? That should solve the issue here?
If that can not be done directly, protoreflect.Descriptor has Index() method. Is there a way to find corresponding Index using desc.MessageDescriptor ?
As for your suggestion on using descriptor of V2, I think confluent used V1 since your library provides rich descriptors, making it easy to derive schema.
This is the implementation within Confluent-kafka-go. I am attaching the below snippet from that repo. If you look at the switch case, this code expects protoreflect.ProtoMessage type, then inside toProtobufSchema() uses desc.LoadMessageDescriptorForMessage() by converting the protoMsg to v1.
func (s *Serializer) Serialize(topic string, msg interface{}) ([]byte, error) {
if msg == nil {
return nil, nil
}
var protoMsg proto.Message
switch t := msg.(type) {
case proto.Message:
protoMsg = t
default:
return nil, fmt.Errorf("serialization target must be a protobuf message. Got '%v'", t)
}
autoRegister := s.Conf.AutoRegisterSchemas
normalize := s.Conf.NormalizeSchemas
fileDesc, deps, err := s.toProtobufSchema(protoMsg)
if err != nil {
return nil, err
}
metadata, err := s.resolveDependencies(fileDesc, deps, "", autoRegister, normalize)
if err != nil {
return nil, err
}
info := schemaregistry.SchemaInfo{
Schema: metadata.Schema,
SchemaType: metadata.SchemaType,
References: metadata.References,
}
id, err := s.GetID(topic, protoMsg, info)
if err != nil {
return nil, err
}
msgIndexBytes := toMessageIndexBytes(protoMsg.ProtoReflect().Descriptor())
msgBytes, err := proto.Marshal(protoMsg)
if err != nil {
return nil, err
}
payload, err := s.WriteBytes(id, append(msgIndexBytes, msgBytes...))
if err != nil {
return nil, err
}
return payload, nil
}
This code does not work with protoregistry.GlobalTypes.RegisterMessage(msgType) since it converts Any to dynamicpb.Message, but does work until GetID() with messageRepository.AddMessage() since this converts Any to MessageV1.
The issue comes in toMessageIndexBytes() which does a .parent() on descriptor and this parent is always nil with V1 underlying message.
@Shukla-Ankur, it might be better off to expand the switch
to support both the V1 and V2 versions of proto.Message
and convert the other direction: use proto.MessageV2
. With the V2 API, there is no need to use desc.LoadMessageDescriptorForMessage
at all; use msg.ProtoReflect().Descriptor()
instead. Then you're working with the V2 APIs model of a descriptor instead of the one in this repo, but it should not be difficult to modify logic to work with that model instead of this one. I think that will be the most robust way to deal with this.
ok, so I need to update the confluent's SDK and basically understand the usage for V2 type's descriptors. Thanks