plugin/ocgrpc incorrect metrics for bidirectional stream
pjanotti opened this issue · 4 comments
Describe the bug
Metrics for ocgrpc on bidirectional stream seem to be collected only for initialization and cancellation.
To Reproduce
Steps to reproduce the behavior:
- Clone the census-instrumentation/opencensus-service repo
- From the repo root launch:
go run ./cmd/occollector/main.go --debug-processor
this process will show Prometheus metrics at http://localhost:8888/metrics
- From the repo root launch:
go run ./example/main.go
- After 10 seconds, check http://localhost:8888/metrics. Metrics were expected for
grpc_server_method="opencensus.proto.agent.trace.v1.TraceService/Export"
but there are only metrics for
grpc_server_method="opencensus.proto.agent.trace.v1.TraceService/Config"
- Terminate the process started in step 3. Only then, metrics for
grpc_server_method="opencensus.proto.agent.trace.v1.TraceService/Export"
will show up (covering the cancellation event)
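The check in the last two steps can be scripted. The following is a minimal sketch and not part of the original report: it assumes the collector serves the Prometheus text format at the URL above, and the helper name `hasMethodMetrics` is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// hasMethodMetrics reports whether any sample line of a Prometheus text
// exposition carries the given grpc_server_method label value.
// Hypothetical helper, for illustration only.
func hasMethodMetrics(body, method string) bool {
	want := fmt.Sprintf("grpc_server_method=%q", method)
	for _, line := range strings.Split(body, "\n") {
		if strings.HasPrefix(line, "#") {
			continue // skip HELP and TYPE comment lines
		}
		if strings.Contains(line, want) {
			return true
		}
	}
	return false
}

func main() {
	resp, err := http.Get("http://localhost:8888/metrics")
	if err != nil {
		fmt.Println("collector not reachable:", err)
		return
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("could not read metrics:", err)
		return
	}

	for _, m := range []string{
		"opencensus.proto.agent.trace.v1.TraceService/Export",
		"opencensus.proto.agent.trace.v1.TraceService/Config",
	} {
		fmt.Printf("%s present: %v\n", m, hasMethodMetrics(string(raw), m))
	}
}
```

Run it once while the example process is alive and once after terminating it to compare which methods are reported.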
The relevant source code in the opencensus-service repo is: https://github.com/census-instrumentation/opencensus-service/blob/38c9550146b49e0bb95ef1784df56a187e912dab/internal/observability.go#L110-L113
Expected behavior
Metrics for both methods, and especially for the data sent via Export
Additional context
See census-instrumentation/opencensus-service#287
This sounds like a bug. The bidirectional metrics should be per call not just for init and cleanup. Otherwise, they are useless.
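To make the distinction concrete, here is a small self-contained model of the two recording strategies. It does not use the real ocgrpc or gRPC stats APIs: the types `endOnlyRecorder` and `perMessageRecorder` and the event names are hypothetical. It only assumes the behavior reported above, namely that per-message counts are accumulated on the stream and become visible to the exporter when the RPC ends.

```go
package main

import "fmt"

// event is a simplified stand-in for stream lifecycle callbacks.
type event int

const (
	begin event = iota
	message
	end
)

// recorder models what a metrics exporter would be able to see.
type recorder interface {
	handle(e event)
	exported() int // message count currently visible to the exporter
}

// endOnlyRecorder buffers counts and publishes them only when the RPC ends.
type endOnlyRecorder struct{ pending, visible int }

func (r *endOnlyRecorder) handle(e event) {
	switch e {
	case message:
		r.pending++
	case end:
		r.visible += r.pending
		r.pending = 0
	}
}

func (r *endOnlyRecorder) exported() int { return r.visible }

// perMessageRecorder publishes each message as soon as it is seen.
type perMessageRecorder struct{ visible int }

func (r *perMessageRecorder) handle(e event) {
	if e == message {
		r.visible++
	}
}

func (r *perMessageRecorder) exported() int { return r.visible }

// replay feeds the events to r and returns what the exporter can see.
func replay(r recorder, events []event) int {
	for _, e := range events {
		r.handle(e)
	}
	return r.exported()
}

func main() {
	open := []event{begin, message, message, message} // stream still open
	fmt.Println(replay(&endOnlyRecorder{}, open))     // prints 0
	fmt.Println(replay(&perMessageRecorder{}, open))  // prints 3

	closed := append(open, end) // stream cancelled or finished
	fmt.Println(replay(&endOnlyRecorder{}, closed)) // prints 3
}
```

For a long-lived bidirectional stream such as Export, the end-only strategy shows nothing until the stream is torn down, which matches the symptom in the report.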
In Java:
Previously, we did not have real-time metrics reporting for streaming RPCs. You could not see the metrics until the RPC finished (which is the same as the scenario @pjanotti described). However, gRPC recently added some additional real-time reporting measures (grpc/grpc-java#5099). Those measures are meant to be used for reporting metrics in real time for long-lived RPCs. Not sure whether this is available in Go.