Netflix/zuul

NullPointerException for some requests after 2.1.7 -> 2.1.9 upgrade

meetyourturik opened this issue · 2 comments

Hello,
after we've upgraded zuul version from 2.1.7 to 2.1.9 we've started having 502s returned for some requests

The main log message is the following:

logger_name: c.n.z.n.s.ClientResponseWriter
message: ClientResponseWriter caught exception in client connection pipeline: Channel: [id: 0x18e734fc, L:/10.255.0.12:8080 - R:/10.106.65.221:7742], active=true, open=true, registered=true, writable=true, id=18e734fc, Passport: CurrentPassport {start_ms=1665421851993, [+0=IN_REQ_HEADERS_RECEIVED, +25405=FILTERS_INBOUND_START, +605148=IN_REQ_LAST_CONTENT_RECEIVED, +8022162=FILTERS_INBOUND_END, +8472518=ORIGIN_CONN_ACQUIRE_START, +8474740=ORIGIN_CONN_ACQUIRE_END, +8501191=OUT_REQ_HEADERS_SENDING, +8525891=OUT_REQ_LAST_CONTENT_SENDING, +8581036=OUT_REQ_HEADERS_SENT, +8583789=OUT_REQ_LAST_CONTENT_SENT, +57036026=IN_RESP_HEADERS_RECEIVED, +57049009=FILTERS_OUTBOUND_START, +57203853=FILTERS_OUTBOUND_END, +57383477=NOW]}
stack_trace: j.l.NullPointerException: null
	at i.n.h.c.h.HttpUtil.isKeepAlive(HttpUtil.java:75)
	at c.n.z.n.s.ClientResponseWriter.buildHttpResponse(ClientResponseWriter.java:199)
	at c.n.z.n.s.ClientResponseWriter.channelRead(ClientResponseWriter.java:139)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at i.n.c.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at c.n.z.n.f.BaseZuulFilterRunner.invokeNextStage(BaseZuulFilterRunner.java:151)
	at c.n.z.n.f.ZuulFilterChainRunner.runFilters(ZuulFilterChainRunner.java:88)
	at c.n.z.n.f.ZuulFilterChainRunner.filter(ZuulFilterChainRunner.java:56)
	at c.n.z.f.e.ProxyEndpoint.filterResponse(ProxyEndpoint.java:327)
	... 64 frames truncated

It seems to code from this line because nativeReq is null.
It's being retrieved from context by the CommonContextKeys.NETTY_HTTP_REQUEST key, and the only place where it's set seems to be here
I'm not sure what to make out of these discoveries, unfortunately

Also, if it helps to debug, these are other log entries for the same channel id:

logger_name: c.n.z.n.i.PassportLoggingHandler
message: Request processing took longer than threshold! toplevelid = 76982df9-c56a-463b-b4fb-f00db9d21c7d, Channel: [id: 0x18e734fc, L:/10.255.0.12:8080 - R:/10.106.65.221:7742], active=true, open=true, registered=true, writable=true, id=18e734fc, Passport: CurrentPassport {start_ms=1665421755721, [+0=IN_REQ_HEADERS_RECEIVED, +42807=FILTERS_INBOUND_START, +6862128=FILTERS_INBOUND_END, +6911049=ORIGIN_CONN_ACQUIRE_START, +6912093=ORIGIN_CONN_ACQUIRE_END, +6935320=OUT_REQ_HEADERS_SENDING, +7014311=OUT_REQ_HEADERS_SENT, +2879404595=IN_REQ_LAST_CONTENT_RECEIVED, +2879487068=OUT_REQ_LAST_CONTENT_SENDING, +2879529603=OUT_REQ_LAST_CONTENT_SENT, +3537669603=IN_RESP_HEADERS_RECEIVED, +3537685802=FILTERS_OUTBOUND_START, +3537838120=FILTERS_OUTBOUND_END, +3537861921=OUT_RESP_HEADERS_SENDING, +3537900333=OUT_RESP_HEADERS_SENT, +3537906991=IN_RESP_LAST_CONTENT_RECEIVED, +3537937795=OUT_RESP_LAST_CONTENT_SENDING, +3537962376=OUT_RESP_LAST_CONTENT_SENT, +3538291470=NOW]}

(there are several entries like this ↑ followed by the original NPE entry)

logger_name: c.n.z.n.s.ClientResponseWriter
message: Received complete event while still handling the request. With reason: CLOSE -- Channel: [id: 0x18e734fc, L:/10.255.0.12:8080 - R:/10.106.65.221:7742], active=true, open=true, registered=true, writable=true, id=18e734fc, Passport: CurrentPassport {start_ms=1665421851993, [+0=IN_REQ_HEADERS_RECEIVED, +25405=FILTERS_INBOUND_START, +605148=IN_REQ_LAST_CONTENT_RECEIVED, +8022162=FILTERS_INBOUND_END, +8472518=ORIGIN_CONN_ACQUIRE_START, +8474740=ORIGIN_CONN_ACQUIRE_END, +8501191=OUT_REQ_HEADERS_SENDING, +8525891=OUT_REQ_LAST_CONTENT_SENDING, +8581036=OUT_REQ_HEADERS_SENT, +8583789=OUT_REQ_LAST_CONTENT_SENT, +57036026=IN_RESP_HEADERS_RECEIVED, +57049009=FILTERS_OUTBOUND_START, +57203853=FILTERS_OUTBOUND_END, +57568295=SERVER_CH_CLOSE, +57582248=IN_REQ_CANCELLED, +57733681=ORIGIN_CH_CLOSE, +57774299=ORIGIN_CH_CLOSE, +58059284=NOW]}
logger_name: c.n.z.n.i.PassportLoggingHandler
message: Incorrect final state! toplevelid = 6426fec6-e938-4bc4-9dd5-d1b959101cbf, Channel: [id: 0x18e734fc, L:/10.255.0.12:8080 ! R:/10.106.65.221:7742], active=false, open=false, registered=true, writable=false, id=18e734fc, Passport: CurrentPassport {start_ms=1665421851993, [+0=IN_REQ_HEADERS_RECEIVED, +25405=FILTERS_INBOUND_START, +605148=IN_REQ_LAST_CONTENT_RECEIVED, +8022162=FILTERS_INBOUND_END, +8472518=ORIGIN_CONN_ACQUIRE_START, +8474740=ORIGIN_CONN_ACQUIRE_END, +8501191=OUT_REQ_HEADERS_SENDING, +8525891=OUT_REQ_LAST_CONTENT_SENDING, +8581036=OUT_REQ_HEADERS_SENT, +8583789=OUT_REQ_LAST_CONTENT_SENT, +57036026=IN_RESP_HEADERS_RECEIVED, +57049009=FILTERS_OUTBOUND_START, +57203853=FILTERS_OUTBOUND_END, +57568295=SERVER_CH_CLOSE, +57582248=IN_REQ_CANCELLED, +57733681=ORIGIN_CH_CLOSE, +57774299=ORIGIN_CH_CLOSE, +58078446=SERVER_CH_CLOSE, +58082086=SERVER_CH_CLOSE, +58140741=NOW]}

Also, all of the problem requests seems to be GraphQL ones

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days.

This issue was closed because it has been stalled for 7 days with no activity.