envoyproxy/go-control-plane

Multiple listeners is not working with Golang XDS Client

asishrs opened this issue ยท 17 comments

I was trying to use multiple listens with Go lang XDS client. According to the go documentation, gRPC client must be tolerant of servers that may ignore the resource name in the request. I wasn't able to make that work and while debugging, I found that the code is expecting clients to send all resources names in the request. The relevant code for that check is

if _, exists := names[resourceName]; !exists {

According to the Go doc, the resource name is the server name hence it is not possible to send all resources in the request.

From Go doc -

The gRPC client will typically start by sending an LDS request for a Listener resource whose name matches the server name from the target URI used to create the gRPC channel (including port suffix if present).

I modified the code as below and it worked as expected. I am open to submit a PR, let me know.

func superset(names map[string]bool, resources map[string]types.Resource) error {
	for resourceName := range resources {
		if _, exists := names[resourceName]; exists {
			return nil
		}
	}
	return fmt.Error("resource is not listed)
}

It should handle 0 resource names here

if len(request.ResourceNames) != 0 && cache.ads {
. LDS and CDS are always requested in bulk. I think we exercise that in "xds" integration tests.

LDS and CDS are always requested in bulk.

is this a design decision? It seems the gRPC xDS client doesn't consider that. Based on the documentation (noted below), the client uses gRPC server name in the connection string as LDS resource name.

The gRPC client will typically start by sending an LDS request for a Listener resource whose name matches the server name from the target URI used to create the gRPC channel

Also, it is not possible for each client to be aware of other gRPC services

Delta xDS is not yet implemented here, so CDS/LDS are state-of-the-world at the moment. We haven't validated that gRPC can actually work with this server implementation.

Using gRPC does not require using the incremental xDS protocol variants. It's just using a non-wildcard request for LDS and CDS, as per the xDS spec:

https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#how-the-client-specifies-what-resources-to-return

The logic needed in the xDS server is not hard to implement. Basically, if the initial request on the xDS stream for a given resource type does not set the resource_names field, then it's a wildcard request; otherwise, the server should return the specific set of reources that the client requested. The only tricky part is that wildcard requests can only be used on the first request on the stream for a given resource type; if the client initially sends a non-wildcard request and then later sends a request with no resource_names field, that means that the client is unsubscribing from the last resource.

That having been said, I will also point out that even if a given xDS server supports only wildcard mode with LDS, it should work fine with a gRPC client, as long as one of the returned Listener resources has the name that the client requested. So I think all that you really need here is to make sure that one of the Listener resources returned by the xDS server has the name that the client is interested in.

@dfawley and @menghanl can comment on the gRPC-Go xDS client implementation.

@dfawley and @menghanl can you help with the gRPC-Go xDS client implementation?

@asishrs I don't have much time to work on this soon (plus that I also need to study the go-control-plane code). But I can try to fix when I finish some of the work at hand now.

@kyessenov, @asishrs, I don't think this is just a small fix. I believe the problem here is that go-control-plane does not support gRPC xDS client. As @markdroth explained above, gRPC client asks for a specific resource name in the LDS request, which in this case is nothing but the hostname[:port] in the client channel URI. go-control-plane needs to be able to return this resource in the LDS response otherwise gRPC client will throw a resolver error. Also, note that the LDS resource is not the standard ip:port based network listener. The go-control-plane has to create an API listener resource based on the domains specified in VirtualHosts. The logic has to handle duplicate domains across virtual hosts and wildcard matching.

I think this is better handled by someone familiar with go-control-plane design and would require good interop tests to ensure things don't break as xDS APIs evolve.

not sure if this covers the same thing but i put together a trivial standalone xds server that works with grpc clients
here using
github.com/envoyproxy/go-control-plane v0.9.4
google.golang.org/grpc v1.33.2

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

If I didn't miss anything, this is still an issue, and needs a fix.

I stumbled upon this issue. I was trying to connect service A to service B using GRPC built-in xDS client. It works fine if the snapshot in xDS cache only contains service B. When it contains services A or C as well xDS cannot respond correctly to A.

The error in go-control-plane is that expects the client to request all the services at once (here) , while it is neither possible nor desirable (and GRPC does not do that). As far as I understood the go-control-plane code, it's sufficient to check whether the requested listener is listed in the snapshot.

I modified the patch proposed by @asishrs in the original post to do just that:

// superset checks that all resources are listed in the names set.
func superset(names map[string]bool, resources map[string]types.ResourceWithTtl) error {
	for reqName, _ := range names {
		if _, exists := resources[reqName]; !exists {
			return fmt.Errorf("%q not listed", reqName)
		}
	}
	return nil
}

Please comment on whether this patch fixes the problem and does not break anything else.
Thank you.

What is the status of this? Seems like a reasonable bug report and trivial fix (maybe I just don't understand the remarks against merging this).

@grobza was there a reason for closing that PR? It seems like the integration tests fail with that change so maybe someone can look into that?

@alecholmez the tests were failing, and I had no time to dig deeper with no commentary from the authors, nor do I have time now.
I forked go-control-plane and applied my patch to it, and it works in the system I work on, although I will not claim it's a production-ready fork.

Apparently, this fxies the issue. It worked for me, at least (grpc 1.38 + go-control-plane 0.9.9)
grpc/grpc-go#5131 (comment)

Apparently, this fxies the issue. It worked for me, at least (grpc 1.38 + go-control-plane 0.9.9) grpc/grpc-go#5131 (comment)

worked for me, too.

Yeah, it seems set ADS flag to false fixed the problem.