deis/router

nginx: [error] open() "/tmp/nginx.pid" failed (2: No such file or directory)

monaka opened this issue · 6 comments

I see an error like this on quay.io/deis/router:v2.4.0. Is this harmless?

2016-10-07 05:19:00.064200 I | INFO: Starting nginx...
2016-10-07 05:19:00.100071 I | INFO: nginx started.
2016-10-07 05:19:04.004764 I | INFO: Router configuration has changed in k8s.
2016-10-07 05:19:04.318212 I | INFO: Reloading nginx...
2016-10-07 05:19:04.324293 I | INFO: nginx reloaded.
nginx: [error] open() "/tmp/nginx.pid" failed (2: No such file or directory)

I've never seen that before and I have trouble imagining how it happened. Are there more log entries that follow that?

In the best case, this is fatal to Nginx, liveness probes begin failing, and k8s restarts the router pod. But I don't think a failed reload stops Nginx from serving requests using its previous configuration. (I'd have to research a bit to verify that.)

In the worst case (believe it or not), Nginx continues serving requests...

See the following lines:

router/router.go

Lines 53 to 54 in b33c18c

nginx.Reload()
known = routerConfig

Because we're not capturing whatever errors are returned from nginx.Reload(), the (new) computed config model becomes the "known" config model regardless of whether the reload worked or not. This would mean the router program and Nginx would no longer agree on the current state. On subsequent builds of the model (at least until something else changes), the router would believe the desired state is the current state, even though Nginx is still running old configuration. This is clearly a bug.

Note that the bug doesn't explain your problem, nor will fixing it fix your problem, but it will at least prevent the worst-case scenario. I will open an issue for this.
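A minimal sketch of the kind of fix described above, assuming the reload function is changed to return an error. The names `RouterConfig`, `reloadFunc`, `applyConfig`, and `errReloadFailed` are hypothetical stand-ins for illustration, not the router's actual types or API:

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// RouterConfig is a hypothetical stand-in for the router's computed config model.
type RouterConfig struct {
	Domains []string
}

// reloadFunc is a hypothetical stand-in for a reload call that reports failure.
type reloadFunc func() error

// errReloadFailed is an illustrative error used to simulate a failed reload.
var errReloadFailed = errors.New(`open() "/tmp/nginx.pid" failed`)

// applyConfig reloads Nginx when desired differs from known and returns the
// config that should be treated as known afterwards. On a failed reload the
// previous known config is kept, so the next pass will try again.
func applyConfig(known, desired *RouterConfig, reload reloadFunc) (*RouterConfig, error) {
	if reflect.DeepEqual(known, desired) {
		return known, nil // nothing changed, no reload needed
	}
	if err := reload(); err != nil {
		return known, fmt.Errorf("reload failed, keeping previous known config: %w", err)
	}
	return desired, nil
}

func main() {
	known := &RouterConfig{}
	desired := &RouterConfig{Domains: []string{"example.com"}}

	// Simulate a reload that fails the way the log above shows.
	next, err := applyConfig(known, desired, func() error { return errReloadFailed })
	fmt.Println(next == known, err != nil) // prints: true true
}
```

With this shape, the router and Nginx cannot silently disagree: the "known" model only advances after a reload that reported success.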

It can be reproduced on restart, when Router detects the k8s event before Nginx has booted up.
I suspect this is a rare case.
It differs from #272, but it is a similar issue around reload failure.

It can be reproduced on restart, when Router detects the k8s event before Nginx has booted up.

I'm not sure I understand here. The sequence of events is always the same at startup.

  1. Nginx starts with trivial configuration
  2. The control loop begins, which will:
    1. Build the model
    2. Use the model as input to a template; the output is new Nginx config
    3. Reload Nginx

There is no scenario where these things happen in a different sequence.
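The startup sequence above can be sketched roughly as follows. This is only an illustration of the ordering: `model`, `confTemplate`, `renderConfig`, and `runOnce` are hypothetical names, and the template is a toy, not the router's real one.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// model is a hypothetical, much-simplified stand-in for the router's config model.
type model struct {
	ServerNames []string
}

// confTemplate is an illustrative template, not the router's real one.
const confTemplate = `{{range .ServerNames}}server { server_name {{.}}; }
{{end}}`

// renderConfig uses the model as template input; the output is Nginx config text.
func renderConfig(m model) (string, error) {
	tmpl, err := template.New("nginx").Parse(confTemplate)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, m); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// runOnce performs one pass of the control loop (build, render, reload)
// and appends each step to trace so the order is visible.
func runOnce(build func() model, reload func(conf string) error, trace *[]string) error {
	*trace = append(*trace, "build")
	m := build()

	*trace = append(*trace, "render")
	conf, err := renderConfig(m)
	if err != nil {
		return err
	}

	*trace = append(*trace, "reload")
	return reload(conf)
}

func main() {
	// Step 1: Nginx is started (with trivial config) before the loop begins.
	trace := []string{"start"}

	build := func() model { return model{ServerNames: []string{"example.com"}} }
	reload := func(conf string) error { fmt.Print(conf); return nil }

	if err := runOnce(build, reload, &trace); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(trace) // prints: [start build render reload]
}
```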

I'm also not sure, but it may not always be reproducible (that is, it's a rare case).
This is just a guess, but I think the sequence is like this.

  1. Something was modified and a k8s event was fired.
  2. Router was restarted independently of 1 (possibly at almost the same time).
  3. Nginx was booting up.
  4. Router accepted the event while Nginx was still booting up.
  5. Router requested a reload, but Nginx couldn't find /tmp/nginx.pid because it was still booting up.
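For context on the last step: `nginx -s reload` reads the pid file to find the master process to signal, so it fails with this kind of error when the file does not exist yet. One way to guard against that, sketched here purely as an assumption and not as what the router does, is to wait for the pid file before attempting a reload. `waitForPidFile` and `errPidFileMissing` are hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// errPidFileMissing is returned when the pid file never appears.
var errPidFileMissing = errors.New("pid file not found")

// waitForPidFile polls until path exists or the attempts run out.
func waitForPidFile(path string, attempts int, delay time.Duration) error {
	for i := 0; i < attempts; i++ {
		if _, err := os.Stat(path); err == nil {
			return nil // the file exists, so a reload signal can be delivered
		}
		time.Sleep(delay)
	}
	return fmt.Errorf("%w: %s", errPidFileMissing, path)
}

func main() {
	// Use a temporary directory so the sketch does not touch a real /tmp/nginx.pid.
	dir, err := os.MkdirTemp("", "router-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	pidFile := filepath.Join(dir, "nginx.pid")

	// Before "Nginx" has written its pid file, waiting times out with an error.
	fmt.Println(waitForPidFile(pidFile, 2, 10*time.Millisecond) != nil) // prints: true

	// Simulate Nginx finishing startup by writing the pid file.
	if err := os.WriteFile(pidFile, []byte("1\n"), 0o644); err != nil {
		panic(err)
	}
	fmt.Println(waitForPidFile(pidFile, 2, 10*time.Millisecond) == nil) // prints: true
}
```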

@monaka your theory about what causes this relies on some false assumptions about how router works. Router does not (currently) watch the k8s event stream (although #274 proposes we start doing that).

Are you able to articulate precise steps for reproducing this issue? So far, I have not found this to be reproducible, which makes it really hard to troubleshoot.

I think it's too hard to reproduce, as this must be a timing issue.
It may already be fixed by #279.
(The issue may still occur, but Router will retry to recover.)

I'll close this issue for now and reopen it if my assumption about #279 turns out to be false.