opentracing-contrib/nginx-opentracing

Opentracing and proxy_cache_use_stale causing lots of log spam

jzyeezy opened this issue · 1 comments

Context

We are noticing a large amount of opentracing error logs for routes that have caching enabled. Example:

2020/12/15 22:39:01 [error] 13#13: *123456789 failed to expand opentracing_context_ for request REQUESTID123: lookup_span_context_value failed: could not find request trace, client: X.X.X.X, server: myserver.com, request: “GET /resource/id HTTP/1.1”, subrequest: “/resource/id”, host: “myserver.com”

which we noticed that these errors happen whenever NGINX is serving up STALE cache responses:

[15/Dec/2020:22:39:01 +0000]  “GET /resource/id HTTP/1.1" 200 1048 “App Client” <...> STALE

Here's what our NGINX config looks like and the specific route:

location / {
    opentracing_tag method $request_method;
    opentracing_tag uri $uri;
    opentracing_propagate_context;

    proxy_cache_background_update on; # this must be on in order to use proxy_cache_use_stale below
   
    proxy_pass http://backend;
  
    location ~ /resource/\w+$ {
      proxy_cache_methods GET;
      proxy_cache_bypass $bypass_cache;
      proxy_ignore_headers Cache-Control;
      proxy_cache_valid 200 60s;
      proxy_cache_use_stale error updating http_500 http_502 http_503 http_504;
      proxy_cache_lock on;
      proxy_pass http://backend;
    }
}

Furthermore, when we disabled proxy_cache_use_stale error updating http_500 http_502 http_503 http_504; from this route, we stopped seeing these opentracing errors. We need to leave the proxy_cache_use_stale directive enabled, so this is not an acceptable solution for us.

Questions

  • Our running theory is that it seems like an http request that nginx can find a stale hit in it's cache triggers "a background subrequest to update an expired cache item, while a stale cached response is returned to the client." We think this background subrequest is what is triggering the error. a stale trace is on the cached response and the opentracing controller cannot tie it back to the current request. Is that completely off base? 😅
  • Is there a way to disable or not attempt to expand the opentracing context when NGINX is issuing a background subrequest? is going to serve a stale cache hit?

On further investigation, it appears that the errors go away if I add a custom header that we add within the route. I can't speak at all as to why this seemed to work though...

location / { 
   # ... the same directives as my original description
   proxy_set_header My-Custom-Header $my_custom_variable;
   location ~ /resource/\w+$ {
      proxy_set_header My-Custom-Header $my_custom_variable;
      # ... the same directives as my original description
    } 
}