pambrose/prometheus-proxy

Unable to handle metrics data larger than grpc default max message size

trew opened this issue · 5 comments

trew commented

I have many Docker containers running on a host, and the /metrics endpoint exposed by cAdvisor returns over 4 MB (the default max message size in gRPC). I would like to be able to configure the max message size on the proxy. (I'm guessing around here: https://github.com/pambrose/prometheus-proxy/blob/master/src/main/kotlin/io/prometheus/proxy/ProxyGrpcService.kt#L76)

Related issues: grpc/grpc-java#3996

Error on the agent

prometheus-agent_1  | 15:40:21.667 ERROR [AgentGrpcService.kt:249] - Error in writeResponsesToProxyUntilDisconnected(): CANCELLED HTTP/2 error code: CANCEL

Stacktrace on the proxy

prometheus-proxy_1  | WARNING: Exception processing message
prometheus-proxy_1  | io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 4194304: 4678937
prometheus-proxy_1  | 	at io.grpc.Status.asRuntimeException(Status.java:524)
prometheus-proxy_1  | 	at io.grpc.internal.MessageDeframer.processHeader(MessageDeframer.java:387)
prometheus-proxy_1  | 	at io.grpc.internal.MessageDeframer.deliver(MessageDeframer.java:267)
prometheus-proxy_1  | 	at io.grpc.internal.MessageDeframer.deframe(MessageDeframer.java:177)
prometheus-proxy_1  | 	at io.grpc.internal.AbstractStream$TransportState.deframe(AbstractStream.java:193)
prometheus-proxy_1  | 	at io.grpc.internal.AbstractServerStream$TransportState.inboundDataReceived(AbstractServerStream.java:266)
prometheus-proxy_1  | 	at io.grpc.netty.NettyServerStream$TransportState.inboundDataReceived(NettyServerStream.java:252)
prometheus-proxy_1  | 	at io.grpc.netty.NettyServerHandler.onDataRead(NettyServerHandler.java:478)
prometheus-proxy_1  | 	at io.grpc.netty.NettyServerHandler.access$800(NettyServerHandler.java:101)
prometheus-proxy_1  | 	at io.grpc.netty.NettyServerHandler$FrameListener.onDataRead(NettyServerHandler.java:787)
prometheus-proxy_1  | 	at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onDataRead(DefaultHttp2ConnectionDecoder.java:292)
prometheus-proxy_1  | 	at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onDataRead(Http2InboundFrameLogger.java:48)
prometheus-proxy_1  | 	at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readDataFrame(DefaultHttp2FrameReader.java:422)
...

Ah, cool! That is an interesting twist. I did not realize that Prometheus data sources could ingest such large payloads. I know how to deal with large messages in gRPC. Give me a few days to work on it and I will get back to you.

I added support for large messages in the 1.6.0 release. Give it a try and see if it fixes your problem.

I should also mention that the default streaming message size in the 1.6.0 chunking implementation is 32 KB, which is based on this thread: grpc/grpc.github.io#371
See what your performance is like and, if necessary, you can tweak the message size.
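
For readers unfamiliar with the idea, here is a minimal sketch of chunking a large payload into 32 KB pieces and reassembling it on the other side. It is not the prometheus-proxy implementation: the class `ChunkSketch` and its methods `split`, `join`, and `checksum` are hypothetical names chosen for illustration, and the CRC32 check is an assumption about how a receiver might verify the reassembled data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical illustration only; not taken from prometheus-proxy.
final class ChunkSketch {
    // 32 KB, matching the default chunk size mentioned above.
    static final int DEFAULT_CHUNK_SIZE = 32 * 1024;

    // Split a payload into consecutive chunks of at most chunkSize bytes.
    static List<byte[]> split(byte[] payload, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < payload.length) {
            int length = Math.min(chunkSize, payload.length - offset);
            chunks.add(Arrays.copyOfRange(payload, offset, offset + length));
            offset += length;
        }
        return chunks;
    }

    // Concatenate chunks back into a single payload, in order.
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, out, pos, chunk.length);
            pos += chunk.length;
        }
        return out;
    }

    // CRC32 of the data, so sender and receiver can compare results.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        // Same size as the message rejected in the stack trace above.
        byte[] payload = new byte[4678937];
        List<byte[]> chunks = split(payload, DEFAULT_CHUNK_SIZE);
        byte[] rebuilt = join(chunks);
        System.out.println("chunks: " + chunks.size());
        System.out.println("intact: " + (checksum(payload) == checksum(rebuilt)));
    }
}
```

Each chunk stays far below the 4194304-byte limit, so no single streamed message trips the RESOURCE_EXHAUSTED check. A smaller chunk size means more messages per scrape, which is the trade-off to watch when tuning.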

I added support for zipping chunked and non-chunked content in 1.6.1. Give that a try.
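
As a rough illustration of why zipping helps with scrape output, the sketch below gzips a byte array and restores it using only the Java standard library. It assumes gzip as the compression format and does not reflect how prometheus-proxy 1.6.1 implements it; `GzipSketch`, `gzip`, and `gunzip` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not taken from prometheus-proxy.
final class GzipSketch {
    // Compress raw bytes with gzip.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return buffer.toByteArray();
    }

    // Decompress gzip bytes back to the original content.
    static byte[] gunzip(byte[] zipped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(zipped))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up metric lines; text exposition output is usually this repetitive.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append("container_cpu_usage_seconds_total{id=\"").append(i).append("\"} 1.0\n");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] zipped = gzip(raw);
        System.out.println("raw bytes:    " + raw.length);
        System.out.println("zipped bytes: " + zipped.length);
    }
}
```

Metrics text repeats the same metric and label names on nearly every line, so it tends to compress well; the actual ratio depends on the data.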

trew commented

I just updated and it looks very promising. No issues so far and data is coming in as expected. Thanks for a great tool and quick response time!