nicholasjackson/fake-service

Parallel requests causing service to hang.

Closed this issue · 5 comments

When running multiple parallel requests, the service hangs from time to time.
It seems to occur when the requests hit the service at the same time.

After importing and changing mux to gorilla/mux the issue disappeared.
We noticed it when trying to load the service while running health checks at the same time.

Hey @andreas-ahman, I need to look into this. The standard net/http mux should not have any locking on it; all it does is provide an http.Handler that is executed by the server thread.

gorilla/mux performs the same function but has more utility, such as the ability to filter by HTTP method. Under the hood they both take the given request and return an instance of http.Handler, which is then executed by the http.Server for the incoming request.
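To illustrate the routing step described above, here is a minimal sketch using only the standard library mux. The `/` and `/health` routes mirror the paths mentioned in this thread; the handler bodies and the helper names `newMux` and `matchedPattern` are made up for the example and are not taken from Fake Service.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux registers two illustrative routes on a standard library ServeMux.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "main")
	})
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "healthy")
	})
	return mux
}

// matchedPattern asks the mux which registered pattern it would select for
// the given path. The mux only resolves the request to an http.Handler;
// http.Server is what actually invokes that handler.
func matchedPattern(mux *http.ServeMux, path string) string {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	_, pattern := mux.Handler(req)
	return pattern
}

func main() {
	mux := newMux()
	fmt.Println(matchedPattern(mux, "/health"))
	fmt.Println(matchedPattern(mux, "/anything-else"))
}
```

Paths without a more specific registration fall through to the `/` pattern, which is why a health check against `/` shares a handler with regular traffic.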

Is it possible that your load testing was overloading the service instance and this was the reason that the health checks were not responding? The health check and the main request path share CPU, memory, and the connection limits of the HTTP server. If the server is busy with requests on the / path, it will also be unable to serve /health.

Hi,

I doubt that the load testing was overloading the service instance.
We're health checking towards / and the health check hangs at the same time as our transaction.

Might be worth mentioning that it hangs forever unless a timeout is specified.

After switching to gorilla/mux we do not see the same behavior, even with the same load, which should rule out that the service instance is overloaded.

We only see the timeouts when the requests arrive at the same time, according to logs.

Please let me know if you want me to do any testing.
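One way to test the scenario described above is to fire a batch of simultaneous requests, each with a client-side timeout so a hang surfaces as an error rather than blocking forever. The sketch below runs against a local stand-in server built on the standard mux, not against Fake Service itself; the function names, request count, and timeout value are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// countSuccesses fires n GET requests at url concurrently, each bounded by
// the client timeout, and returns how many completed with 200 OK.
func countSuccesses(url string, n int, timeout time.Duration) int {
	client := &http.Client{Timeout: timeout}
	var wg sync.WaitGroup
	var mu sync.Mutex
	ok := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get(url)
			if err != nil {
				return // timed out or failed; not counted
			}
			defer resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				mu.Lock()
				ok++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return ok
}

// runLocal starts a throwaway server using the standard library mux and
// reports how many of n parallel requests succeeded.
func runLocal(n int) int {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()
	return countSuccesses(srv.URL+"/", n, 2*time.Second)
}

func main() {
	fmt.Println("successful requests:", runLocal(20))
}
```

Pointing `countSuccesses` at a deployed instance instead of the local server would show whether fewer than `n` requests complete within the timeout.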

@andreas-ahman Can I ask if you were using Fake Service as an application in Consul service mesh when you were experiencing those problems?

I noticed something similar myself that was down to the Envoy proxy sometimes hanging when connecting to Fake Service. Googling around, there were articles that hinted that this was to do with HTTP_KEEP_ALIVES, but even with these disabled I still got the issue. Implementing saner timeouts for the HTTP server seemed to solve the problem. However, I am not completely convinced that it was not the upgrade to a later Go version, and thus a later net/http package, that actually solved the problem.

69bad0a
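For readers unfamiliar with server-side timeouts in net/http, the following is a rough sketch of what setting them looks like. It is not the content of the commit referenced above; the helper name `newServer`, the address, and all duration values are assumptions picked for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer builds an http.Server with explicit timeouts. The zero value of
// each field means "no timeout", so a stalled peer can otherwise hold a
// connection open indefinitely.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: 5 * time.Second,  // time allowed to read request headers
		ReadTimeout:       10 * time.Second, // time allowed to read the whole request
		WriteTimeout:      30 * time.Second, // time allowed to write the response
		IdleTimeout:       60 * time.Second, // how long an idle keep-alive connection may stay open
	}
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	srv := newServer(":9090", mux)
	// Keep-alives can also be switched off entirely if desired:
	// srv.SetKeepAlivesEnabled(false)
	fmt.Println("read timeout:", srv.ReadTimeout, "idle timeout:", srv.IdleTimeout)
	// srv.ListenAndServe() would start serving with these limits.
}
```

Suitable values depend on the workload; long-running responses in particular need a `WriteTimeout` larger than the slowest expected handler.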

Could you try the latest version of Fake Service to see if this helps? This version also has both HTTP and gRPC endpoints exposed on the same port, so there is no need to set the type of the service anymore.

Yes, we're using it as an application on Consul service mesh.

I've tried with the latest version and I'm no longer able to duplicate the issue.

Awesome @andreas-ahman, it sounds like you hit the same Envoy issue I did.