Zipkin stacktrace errors on proxy leveraging request termination plugin on Kong 1.0
jeremyjpj0916 opened this issue · 7 comments
Error is like so:
2018/12/25 04:45:56 [error] 38#0: *5243172 failed to run log_by_lua*: .../local/share/lua/5.1/kong/plugins/zipkin/opentracing.lua:40: attempt to index local 'addr' (a nil value)
stack traceback:
.../local/share/lua/5.1/kong/plugins/zipkin/opentracing.lua:40: in function 'ip_tag'
.../local/share/lua/5.1/kong/plugins/zipkin/opentracing.lua:202: in function 'log'
/usr/local/share/lua/5.1/kong/plugins/zipkin/handler.lua:31: in function 'log'
/usr/local/share/lua/5.1/kong/init.lua:751: in function 'log'
log_by_lua(nginx.conf:149):2: in function <log_by_lua(nginx.conf:149):1> while logging request, client: 10.129.52.1, server: kong, request: "GET /F5/status HTTP/1.1", host: "gateway-dev-core-datacenter.company.com"
A false assumption has been made that addr will always be non-nil for this code:
https://github.com/Kong/kong-plugin-zipkin/blob/master/kong/plugins/zipkin/opentracing.lua#L40
Looks like it's failing from the call-site for the load balancer data.
My guess is that a DNS lookup failed, so the ip
is nil
and it went to the next balancer candidate? Could you check if that was the case by seeing if you have a DNS failure earlier in your logs?
Either way it's a bug that needs fixing in this plugin, thankyou for the report!
@james-callahan the service this plugin errors on is set to proxy to http://localhost:8001/status (Kong admin api status endpoint), but it will never do so anyways because the request termination plugin is enabled on this particular kong route. I would think the DNS library would treat localhost resolution as ip 127.0.0.1 but somehow it must be nil(most likely because request termination short circuits everything so balancer was never involved, just a guess).
I would think the DNS library would treat localhost resolution as ip 127.0.0.1
It didn't until recently. See Kong/lua-resty-dns-client#50
Good idea on the request termination short circuiting, will investigate there...
@james-callahan I did test that PR just now too on 1.0 and the lua nil pointer exceptions are gone so that did do the trick for short circuit tx's(and also validated no further places needed it I missed
https://luarocks.org/modules/kong/kong-plugin-zipkin does not seem updated yet after the PR was merged. Guess yall are running that manually?
It's merged but not released.
The next plugin release will be done in sync with the next kong patch release.
All righty, sounds good to me I will just drop in the zipkin code manually for now as opposed to luarocks installing until the next official Kong release(hopefully this month to address some problems I see on 1.0 :P ).