moleculerjs/moleculer-apollo-server

Using Moleculer Apollo Server in a NATS configured environment with multiple replicas.

Closed this issue · 7 comments

I stumbled upon this issue today and I did a bit of debugging.

I'll try to simplify the setup for you. I have three nodes: two run the apollo server and the last one runs just regular moleculerjs.

With one instance of apollo-server and moleculerjs everything runs just fine, but when I introduce another apollo-server instance into the pool, the last one added won't start up and starts throwing errors about not being able to compile the GraphQL schema. That is a bit weird at first, since one instance is already running fine and doing its job. After a bit of debugging I came across this line right here:

https://github.com/moleculerjs/moleculer-apollo-server/blob/master/src/service.js#L377

And from what I can see this means that generateGraphQLSchema would actually try to generate a schema based on the already running instance of apollo-server. Is that by design?

I tried modifying the guard to be:

if (processedServices.has(serviceName) || !service.local) return; so we only generate schemas based on local services, and that works for me.

Bug or did I do something weird here?

@shawnmcknight what do you think? Does it work for you with multiple apollo-server?

We run with multiple instances of moleculer-web (and thus, moleculer-apollo-server) in our deployed environments and use NATS for transport. @abdavid I'm guessing you either have something set up wrong or you have a valid case that we're not using. Can you make a sample repro?

@abdavid I'm trying to do a little guesswork based on your initial post. Are you trying to run replicas that contain both the moleculer-web gateway as well as other services? I'm thinking that might cause a problem if you have a service defined locally with the moleculer-web instance as well as a remote version of that same service. We run the gateway standalone and all services are remote to the gateway.

Sorry for the delay in response here -- times are hectic.

@shawnmcknight
While true, I think it is best if I draw you a diagram. You might also be right about something not being used as intended, since moleculer-apollo-server almost always starts by throwing errors about not being able to build the schema, and after a few seconds (it looks like once service dependencies are available) it recompiles and everything is fine.

Simplified diagram

While developing these new services I have chosen to use the monolithic approach to start off with, since this requires less configuration when deploying to k8s. What I want to achieve is to have two replicas of the moleculer-apollo-server so I can scale the monolith horizontally. What I experience when I try to do that locally is that the moleculer-apollo-server that is already running has no problems, but the newly created instance fails when trying to generate the schema. It identifies services that are non-local (to that instance) and that include the graphql configuration. When it parses that configuration, errors are thrown because the same service has already been processed and duplicate types then occur. The hack I did basically just ensures that the moleculer-apollo-server will not parse external services, only local services. Is this issue due to some services being available through the transporter?

So I think we can start with getting one thing out of the way:

The hack I did basically just ensures that the moleculer-apollo-server will not parse external services, only local services.

This would be counter to the goals of moleculer-apollo-server in that each remote service should have its schema stitched into one composite GraphQL schema available at the gateway. Therefore, we can't possibly skip remote services as those services would then become inaccessible in the schema.

In looking through the workflow of the schema generation, it's not really clear to me how a problem would be occurring. The new replica should start up, and each service should be discovered (local and remote). In generateGraphQLSchema, each service is processed one at a time, building the schema. When a service gets processed, its full name (version + service name) is added as a key to a Set. If the full service name is encountered any additional times, it should be skipped and nothing is added to the generated schema. Since the service's identifier doesn't take into account which node it's running on, it should be irrelevant whether that service is running locally or remotely to the moleculer-apollo-server instance.
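The de-duplication described above can be sketched roughly as follows. This is a simplified model, not the actual code from moleculer-apollo-server: the service shape (`name`, `version`, `nodeID`, `local`, `typeDefs`), the `v<version>.<name>` full-name format, and the helper names `getFullName` and `collectTypeDefs` are assumptions made for illustration.

```javascript
// Hypothetical, simplified model of skipping already-processed services.
// The service objects below are illustrative, not real registry entries.

// Build a full name from version + service name (format assumed here).
function getFullName(service) {
  return service.version != null
    ? `v${service.version}.${service.name}`
    : service.name;
}

// Collect type definitions once per full name, regardless of node.
function collectTypeDefs(services) {
  const processedServices = new Set();
  const typeDefs = [];
  for (const service of services) {
    const fullName = getFullName(service);
    // A replica on another node has the same full name and is skipped.
    if (processedServices.has(fullName)) continue;
    processedServices.add(fullName);
    if (service.typeDefs) typeDefs.push(service.typeDefs);
  }
  return typeDefs;
}

const services = [
  { name: "customers", version: 1, nodeID: "node-a", local: true, typeDefs: "type Customer { id: ID! }" },
  { name: "customers", version: 1, nodeID: "node-b", local: false, typeDefs: "type Customer { id: ID! }" },
  { name: "orders", nodeID: "node-b", local: false, typeDefs: "type Order { id: ID! }" },
];

const collected = collectTypeDefs(services);
console.log(collected.length); // 2: the remote replica of v1.customers is skipped
```

Note that the key does not include `nodeID` or `local`, which is why a local service and its remote replica are expected to count as one.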

None of this is to say that you aren't having a problem; I'm just not able to visualize how an issue could be cropping up. Without a reproduction I can make two guesses as to the cause, but I'm just spitballing here.

  • There is some duplication of a service's fullname with different GraphQL configurations. That is, two services with the same fullname exist somewhere in the group of nodes, but they have different GraphQL configurations from each other. Depending on the sequence of service processing, that could cause the GraphQL schema to be built differently on the new instance of moleculer-web than it was the previous time.
  • The second replica of moleculer-web isn't a true replica, but is being launched as a service with a fullname (name or version) that is different from the original, but the GraphQL configurations have overlap with each other. Therefore, what would happen is both the local services and the remote services would get processed, and effectively duplicate GraphQL types into the schema that is being generated, which would cause a failure. A GraphQL type must have a single representation or else the schema cannot be built. If the same GraphQL type is present multiple times, but each time it is present is within a replica of a service with the same name and version (i.e. fullname), then it should be skipped in the code I laid out earlier which would avoid the duplication.
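To illustrate the second guess, here is a rough sketch of how overlapping type definitions from services with different fullnames would surface as duplicates. The `findDuplicateTypes` helper and the sample definitions are hypothetical, and the regular expression is a deliberately naive stand-in for real GraphQL parsing (it would also match `extend type`, for example).

```javascript
// Hypothetical helper: report type names that are defined more than once
// across a list of GraphQL type definition strings.
function findDuplicateTypes(typeDefsList) {
  const counts = new Map(); // type name -> number of definitions seen
  for (const typeDefs of typeDefsList) {
    // Naive match on "type <Name>"; a real check would parse the SDL.
    for (const match of typeDefs.matchAll(/\btype\s+(\w+)/g)) {
      counts.set(match[1], (counts.get(match[1]) || 0) + 1);
    }
  }
  return [...counts].filter(([, count]) => count > 1).map(([name]) => name);
}

// Two services with different fullnames that both define Customer.
const fromGatewayA = "type Customer { id: ID! }";
const fromGatewayB = "type Customer { id: ID! } type Order { id: ID! }";

const duplicates = findDuplicateTypes([fromGatewayA, fromGatewayB]);
console.log(duplicates); // [ 'Customer' ]
```

Because the two services do not share a fullname, the name-based skip would not apply to them, and both `Customer` definitions would reach the schema builder.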

As I said before, this is just guessing. In our environment we definitely have multiple instances of moleculer-web/moleculer-apollo-server and they have their own locally defined schema definitions even if there aren't other services running on the same node. We don't have a problem with schema generation, as the remote services of the same name are skipped during schema generation.

Is there any way you can provide a stripped down repro that shows the problem? I'd like to help, but I'm not sure how far I can take this without seeing your actual setup.

No activity on this in several months. Closing for now.

Sorry for the seriously late reply - I have been hammered with project deadlines and life in general (like most of us).

I have some new insight that might shed some light and I am not certain that it can be resolved without me having to rewrite a bunch of resolvers.

So tonight I have been digging deeper -- I have been playing around locally with the issue. What I have done is split up the moleculer-apollo-server gateway so it runs in its own process. After that I started up the project without the gateway so my services with their respective graphql configuration then would register (using TCP transporter while testing).

My finding (which is also logical when I think about it) is that I have a problem with my resolvers. Some of my resolvers are so simple that I did not use the moleculer-apollo-server "norm" of mapping the resolver to a service. Instead, I wrote the resolver as an inline function, e.g.:

....
id: source => source.customerNumber,
....

And that of course is not serializable for the transporter, so when my moleculer-apollo-server parses the services in the registry to identify and configure the schema, resolvers which are inline functions do not exist. I also now see why my fix works: when I start the project and everything is running locally, the registry actually has the resolver functions.
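A minimal sketch of why the inline resolver disappears, assuming the service settings cross the transporter as JSON. The `roundTrip` helper only simulates that serialization, and the action name `orders.findByCustomer` and its `rootParams` mapping are hypothetical examples of an action-based resolver definition rather than anything from my project.

```javascript
// Simulate what a remote node would receive if settings are sent as JSON.
const roundTrip = value => JSON.parse(JSON.stringify(value));

// Inline function resolver, as in the snippet above.
const inlineResolvers = {
  Customer: {
    id: source => source.customerNumber,
  },
};

// Action-mapped resolver: plain data, so it survives serialization.
// The action name and parameter mapping are made up for illustration.
const actionResolvers = {
  Customer: {
    orders: {
      action: "orders.findByCustomer",
      rootParams: { customerNumber: "customerNumber" },
    },
  },
};

const remoteInline = roundTrip(inlineResolvers);
const remoteAction = roundTrip(actionResolvers);

// JSON.stringify drops function-valued properties entirely.
console.log(typeof remoteInline.Customer.id); // "undefined"
console.log(remoteAction.Customer.orders.action); // "orders.findByCustomer"
```

On the local node the original object is used directly, so the function is still there, which matches what I see when everything runs in one process.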