protocolbuffers/protobuf

TextFormat for C#

Oceania2018 opened this issue · 10 comments

What language does this apply to?
proto3
C#

Currently the C# library has no TextFormat functionality like the one available for Java:
https://developers.google.com/protocol-buffers/docs/reference/java/

I've found a version, but it does not target .NET Standard:
https://github.com/jskeet/protobuf-csharp-port/blob/master/src/ProtocolBuffers/TextFormat.cs

Not supporting text-format in the Google.Protobuf C# implementation is deliberate.
Google.Protobuf supports the canonical JSON mapping, which should be good enough for your needs, as it provides mostly the same functionality as TextFormat. Text format support was more useful before the canonical JSON mapping was added in protobuf 3 (which is also when Google.Protobuf was created; we didn't want to bloat the codebase, so we never added text format support to Google.Protobuf).

Btw, "protobuf-csharp-port" is very old; I'd recommend using Google.Protobuf (basically its successor) and the JSON format instead of the text format.
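
For readers unfamiliar with the two encodings, here is a minimal sketch of how the same data looks in each. It is written in Python with the standard library only and does not use Google.Protobuf or any protobuf API. The `to_text_lines` and `render_text` helpers and the field names (`model`, `num_classes`, `input_path`, `shuffle`) are hypothetical, and the string escaping only approximates the real text format.

```python
import json


def to_text_lines(message, indent=0):
    """Render a nested dict as lines in a protobuf-text-format-like layout.

    Illustrative only: a real formatter is driven by message descriptors.
    """
    pad = "  " * indent
    lines = []
    for name, value in message.items():
        # A list stands in for a repeated field: one entry per element.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # Nested message: "name { ... }"
                lines.append(f"{pad}{name} {{")
                lines.extend(to_text_lines(item, indent + 1))
                lines.append(f"{pad}}}")
            elif isinstance(item, bool):
                # Check bool before numbers, since bool is a subclass of int.
                lines.append(f"{pad}{name}: {'true' if item else 'false'}")
            elif isinstance(item, str):
                # json.dumps gives a quoted, escaped string (an approximation).
                lines.append(f"{pad}{name}: {json.dumps(item)}")
            else:
                lines.append(f"{pad}{name}: {item}")
    return lines


def render_text(message):
    """Join the rendered lines into a single text-format-like string."""
    return "\n".join(to_text_lines(message))


if __name__ == "__main__":
    # Hypothetical pipeline-style config, not taken from any real project.
    config = {
        "model": {"name": "ssd", "num_classes": 90},
        "input_path": ["a.record", "b.record"],
        "shuffle": True,
    }
    print(render_text(config))
    print(json.dumps(config, indent=2))
```

The plain `json.dumps` call only shows the general shape of a JSON rendering; it does not implement the rules of the canonical protobuf JSON mapping.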

The reason we need this feature is that the TensorFlow models repository uses TextFormat to define its pipelines, and TensorFlow checkpoints also use this format. https://github.com/tensorflow/models.

This issue blocks our work: my team is building a .NET binding for TensorFlow. https://github.com/SciSharp/TensorFlow.NET.

Our solution is to fork the old implementation and migrate it to protobuf: https://github.com/SciSharp/protobuf/commits/master. Would it be possible to merge it if we complete this feature?

How much code and complexity would it take to add TextFormat support to the current Google.Protobuf implementation? I think we can accept a contribution, assuming it's not overly complex and it's tested accordingly. Ideally I'd like to see a design (what APIs are going to be added) and a list of the work that needs to be done to make TextFormat support in C# fully functional.

@jtattermusch We have a preview version for our own project. We will refactor the code and try to clean it up before we open a PR.

https://www.nuget.org/packages/Protobuf.Text/0.1.0

@jtattermusch Yes, the text format support in Protobuf.Text currently just follows the design of protobuf's existing JsonFormat implementation, with TextTokenizer, TextParser and TextFormatter. We duplicated the test cases from JsonFormat and are working on making all of them pass.
In general, the two parsers should differ only slightly, because they just merge a tree of document nodes into the model. The two tokenizers, which load the node tree from a string, differ more. For now, Protobuf.Text is an add-on to the protobuf C# library. Later, once we have done some refactoring and the code reaches a higher quality level, we will send a PR to protobuf.

BTW, most of the TextFormatter code comes from protobuf-csharp-port.
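
The tokenizer/parser split described in this comment can be sketched roughly as follows. This is a standard-library Python illustration for a small subset of the text format, not the Protobuf.Text or Google.Protobuf code. The names `tokenize`, `parse_message` and `parse_text` are hypothetical, and because there are no descriptors in this sketch, every field is collected into a list instead of being merged into a typed message.

```python
import ast
import re

# Tokens for a small subset of the protobuf text format.
TOKEN_RE = re.compile(r"""
    (?P<skip>\s+|\#[^\n]*)              # whitespace and comments
  | (?P<string>"(?:\\.|[^"\\\n])*")     # double-quoted string with escapes
  | (?P<number>-?\d+(?:\.\d+)?)         # integer or simple decimal
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)   # field name, enum name, true/false
  | (?P<punct>[{}:])                    # structural punctuation
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, value) pairs, dropping whitespace and comments."""
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
    return tokens


def convert_scalar(kind, value):
    """Turn a scalar token into a Python value."""
    if kind == "string":
        # Python literal escapes approximate the text format's C-style escapes.
        return ast.literal_eval(value)
    if kind == "number":
        return float(value) if "." in value else int(value)
    if value in ("true", "false"):
        return value == "true"
    return value  # treat any other identifier as an enum name


def parse_message(tokens, index=0, nested=False):
    """Merge tokens into a dict of lists; returns (message, next_index)."""
    message = {}
    while index < len(tokens):
        kind, value = tokens[index]
        if nested and (kind, value) == ("punct", "}"):
            return message, index + 1
        if kind != "ident":
            raise ValueError(f"expected a field name, got {value!r}")
        name, index = value, index + 1
        # The colon is optional before a nested message, required for scalars.
        has_colon = index < len(tokens) and tokens[index] == ("punct", ":")
        if has_colon:
            index += 1
        if index >= len(tokens):
            raise ValueError(f"unexpected end of input after {name!r}")
        kind, value = tokens[index]
        if (kind, value) == ("punct", "{"):
            field, index = parse_message(tokens, index + 1, nested=True)
        elif has_colon and kind != "punct":
            field, index = convert_scalar(kind, value), index + 1
        else:
            raise ValueError(f"expected ':' or '{{' after {name!r}")
        # Without a schema we cannot tell repeated fields apart, so use lists.
        message.setdefault(name, []).append(field)
    if nested:
        raise ValueError("missing closing '}'")
    return message, index


def parse_text(text):
    """Tokenize and parse a text-format-like document."""
    message, _ = parse_message(tokenize(text))
    return message
```

Only `tokenize` knows about the surface syntax (comments, quoting, punctuation). `parse_message` just walks tokens and merges values, which is the part a JSON-based parser would largely share.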

@jtattermusch Hello Jan, are you the main contributor to the C# version of protobuf? We have made some progress on the project https://github.com/SciSharp/protobuf.Text, and I would like to discuss the design of JsonParser and TextParser with the main contributors.

@kerryjiang I can review the design but it might take a while because TextFormat support is strictly speaking not a necessity - just a nice to have. Other folks to involve are @anandolee (knowledgeable about implementation in other languages) and @jskeet (the original author of most of Google.Protobuf code, but most likely won't have cycles to review in detail).