streamdal/plumber

Support to specify multiple directories via `--protobuf-dir`

arunk-s opened this issue ยท 14 comments

Hey,

Thanks for this great tool !

I'm trying to do the read messages in protobuf format from google pubsub, however I noticed that since some my protobuf definitions import extra things like http annotations which are either are fetched from a vendor folder or from the system library.
So I see something like this

FATA[0000] Unable to complete command: unable to find root message descriptor: unable to read file descriptors: unable to parse files: myservice/service.proto:27:8: google/api/annotations.proto: file does not exist 

Would it be possible to specify these directories separately by just repeating the same protobuf-dir option ?

Hey there! I've never tried using .proto defs that are defined elsewhere/outside my project. What you're asking for sounds doable though - I'll play around with it a bit and let you know.

In the meantime, could you provide a sample .proto which references outside schemas? Also, in the case of the google http annotations - do you have those vendored locally in your project or do they reside elsewhere?

Yeah, I think multiple directories is probably an odd use case to support ๐Ÿ˜… .
But I think, this can be thought of as providing a functionality similar to -I option of protoc.
The google http annotations are vendored locally in my project and I pass them via -I option of protoc.

Let's consider the project directory with the following two protobuf files:

service.proto is defined as:

syntax = "proto3";
package example;

import "google/api/annotations.proto";
import "message.proto";

option go_package = "github.com/arunk-s/sample_protobuf";

service YourService {
 rpc Echo(StringMessage) returns (StringMessage) {
   option (google.api.http) = {
     post: "/v1/example/echo"
     body: "*"
   };
 }
}

message.proto is defined as:

syntax = "proto3";
package example;

option go_package = "github.com/arunk-s/sample_protobuf";

message StringMessage {
  string value = 1;
}

Now, I've the http annotations vendored as

โ”œโ”€โ”€ message.proto
โ”œโ”€โ”€ service.proto
โ””โ”€โ”€ vendor
    โ””โ”€โ”€ github.com
        โ””โ”€โ”€ grpc-ecosystem
            โ””โ”€โ”€ grpc-gateway
                โ””โ”€โ”€ third_party
                    โ””โ”€โ”€ googleapis
                        โ”œโ”€โ”€ LICENSE
                        โ”œโ”€โ”€ README.grpc-gateway
                        โ””โ”€โ”€ google
                            โ”œโ”€โ”€ api
                            โ”‚   โ”œโ”€โ”€ annotations.proto
                            โ”‚   โ””โ”€โ”€ http.proto
                            โ””โ”€โ”€ rpc
                                โ”œโ”€โ”€ code.proto
                                โ”œโ”€โ”€ error_details.proto
                                โ””โ”€โ”€ status.proto

One copy of vendored files are available here.

Now the usual flow of generation looks like:

protoc -I/usr/local/include -I. \
  -Ivendor/github.com/grpc-ecosystem/grpc-gateway/third_party/googleapis \
  --go_out=plugins=grpc:. \
  service.proto

protoc -I/usr/local/include -I. \
  -Ivendor/github.com/grpc-ecosystem/grpc-gateway/third_party/googleapis \
  --go_out=plugins=grpc:. \
  message.proto

Right now, the vendored definitions are not picked by plumber and we see the error as mentioned above.

I hope this helps :)

Thank you for the detailed report - this is super helpful. I'm going to work on this and provide an update soon.

Hey there @arunk-s - PR is open; should get it merged sometime tomorrow after review. Feel free to try out the branch until then.

EDIT: BTW, super easy update - glad we have it!

This is great! ๐Ÿš€

Thanks for the swift update :)

This has been added in v.0.5.0

@arunk-s can you test out the release and see if it addresses the issue you had?

Yeah, I tested the new release and the feature works for sure. ๐Ÿš€

It somehow doesn't quite capture my particular use case but I think that's perfectly fine. ๐Ÿ˜…
I looked around the code that reads the proto files and I think that the cause of the problem my current structure faces is due to the fact that, the import paths/file paths are keyed on their relative path within the directory[--protobuf-dir](which makes sense).

But the imports in my source files are specified like import "google/api/annotations.proto"; and not import "vendor/github.com/grpc-ecosystem/grpc-gateway/third_party/googleapis/google/api/annotations.proto"; which creates the problem that I saw initially.

Specifying the vendor directory separately via --protobuf-dir also doesn't work, because there are now two keys for protoparse to read from: [ vendor/..../annotations.proto and google/api/annotations.proto ] and thus for the second definition it complains that the protobuf Message Type is already defined.

So, I went ahead and modified the code to replace the keys of vendored files with their prefix removed so protoparse wouldn't complain. It's a dirty fix and all hardcoded, but it works ;)

I was mistaken earlier when I thought that the option to provide a separate directory would work for my problem.
It isn't exactly the same as -I flag for protoc as the compiler picks the first file matching the import path and doesn't complain of it being redefined. We shouldn't assume that as well since plumber uses protoparse.

The remapping of the import paths is provided via another flag -M<filename>=<import path> of protoc, but adding that to plumber wouldn't be straightforward IMO as it relies on protoparse and I couldn't find anything that can provide this behavior out of the box. Also, probably the feature itself is not worth the effort at this time ๐Ÿ˜…

Thanks again for the swift resolution of this case !

Even if the fix doesn't quite solve the initial issue, I think it is a good feature for others and would certainly be useful ๐Ÿ™‚

Hey Arunk! Thanks for the thorough reply and for the insight & research - super helpful.

Everything you found makes sense and it sounds like you found a workaround, even if it's not very elegant.

As for protoc-like module mapping - maybe we could introduce some method to strip part of the path for --protobuf-dir? I'm thinking something like --protobuf-dir-rewrite 'vendor/github.com/grpc-ecosystem/grpc-gateway/third_party/googleapis/google/api google/api'?

That way, plumber would still use the correct paths for searching for proto files and protoparse won't freak out.

That doesn't seem like a big lift.

Oh, yeah, that is an option as well and I think that should work. I didn't quite think about it.

It is a bit different protoc -M flag in that protoc uses it to remap the import paths in the generated code, but plumber isn't concerned with that and we would use it to better infer the import paths.

Maybe the option name like --protobuf-dir-remap would be better ?

Yep, totally, --protobuf-dir-remap makes sense to me. Let me give it a spin over the weekend - should be simple :)

Sorry it took so long to get around to this. @blinktag wrote a PR for this but I wasn't able to get it working. Can you confirm whether this is still of interest? I re-read the issue and it still makes sense but because of all the "relative-dir-hackery", it is a bit of a pain to work through. LMK.

No worries. If you think it's harder then I'll leave the decision to you :)
For now, I have found a workaround so I think if there are more people interested we can rethink later.

Thanks again ๐Ÿ™Œ

Got it - thanks @arunk-s. I'll close the issue and put a ticket in our internal icebox. Talk soon!