dart-lang/labs

`/readiness_check` occasionally returning 503 service unavailable

Opened this issue · 4 comments

My app is restarting with no apparent cause. Looking deeper into the logs, I see many requests for `/readiness_check` returning 503, which seems to eventually cause the container to kill the app:

A 2019-08-10T02:04:44.874530055Z Waiting for termination, 1 seconds remaining. 
A 2019-08-10T02:04:44.809396670Z Sending SIGKILL to app. 
A 2019-08-10T02:04:44.805334192Z 50b2e7dca1284e0309c09ec3e19d2de195c2d4b5b84456ddeebf5324cbb654d8 
A 2019-08-10T02:04:44.738550379Z <13>Aug 10 02:04:44 vm_runtime_init: Aug 10 02:04:44 Pure compute version. Should skip _ah/start or stop. 
A 2019-08-10T02:04:44.731097521Z Sending SIGTERM to app. 
A 2019-08-10T02:04:44.729567774Z Aug 10 02:04:44 Pure compute version. Should skip _ah/start or stop. 
A 2019-08-10T02:04:44.677616932Z Triggering app shutdown handlers. 
A 2019-08-10T02:04:44.543Z GET 503 10 B 0 ms GoogleHC/1.0 /readiness_check

@jonasfj @dnfield

I've seen a few notifications about health checks, which may be related. I think we're off the legacy checks, but maybe not.

The docs (https://cloud.google.com/appengine/docs/flexible/custom-runtimes/configuring-your-app-with-app-yaml) say that by default, readiness checks will not be forwarded from the container to your app, but they don't say what is responsible for responding in those cases.

I'm trying to update the service handlers to be aware of the new request paths (and to correspondingly update my app.yaml to manually handle the health checks).

https://github.com/dart-lang/appengine/blob/561deb1ee8fe1330fd7c42fbbde203acbdd54b03/lib/src/server/server.dart#L43-L47
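
For reference, a rough sketch of what handling the checks manually in app.yaml might look like, assuming the split health check settings described in the flexible environment docs. The paths and numbers here are illustrative only, not copied from my actual config, and the app itself would need to answer 200 on both paths:

```yaml
# Illustrative values only; tune the intervals and thresholds for your app.
readiness_check:
  path: "/readiness_check"      # the app must return 200 here once it can serve traffic
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300

liveness_check:
  path: "/liveness_check"       # repeated failures here get the instance restarted
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 4
  success_threshold: 2
  initial_delay_sec: 300
```

With paths set like this, the request handler has to recognize them before falling through to the normal application routes.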

You can run `gcloud app describe` to check whether you have split health checks enabled (you probably do, unless the app is old).

In pub.dev we handle the health checks manually and specify an endpoint for them to hit. But according to the docs, this shouldn't be necessary.

I tried updating my app.yaml to handle them manually, and the health checks seemed to be coming far more frequently than I had specified. But you're right, it shouldn't be necessary.