Timeout in StackdriverSender kills the flushThread of AsyncReporter
pverkest opened this issue · 4 comments
We're seeing the following exception, which causes our zipkin-reporter-service to stop sending traces to Stackdriver:
2019-05-17 06:42:10.486 WARN ... z.r.AsyncReporter$BoundedAsyncReporter : Unexpected error flushing spans
java.lang.IllegalStateException: timeout waiting for onClose. timeoutMs=5000, resultSet=false
at zipkin2.reporter.stackdriver.internal.AwaitableUnaryClientCallListener.await(AwaitableUnaryClientCallListener.java:45)
at zipkin2.reporter.stackdriver.internal.UnaryClientCall.doExecute(UnaryClientCall.java:46)
at zipkin2.Call$Base.execute(Call.java:379)
at zipkin2.Call$Mapping.doExecute(Call.java:237)
at zipkin2.Call$Base.execute(Call.java:379)
at zipkin2.reporter.AsyncReporter$BoundedAsyncReporter.flush(AsyncReporter.java:286)
at zipkin2.reporter.AsyncReporter$Builder$1.run(AsyncReporter.java:190)
Exception in thread "AsyncReporter{StackdriverSender{parts-prod}}" java.lang.IllegalStateException: timeout waiting for onClose. timeoutMs=5000, resultSet=false
at zipkin2.reporter.stackdriver.internal.AwaitableUnaryClientCallListener.await(AwaitableUnaryClientCallListener.java:45)
at zipkin2.reporter.stackdriver.internal.UnaryClientCall.doExecute(UnaryClientCall.java:46)
at zipkin2.Call$Base.execute(Call.java:379)
at zipkin2.Call$Mapping.doExecute(Call.java:237)
at zipkin2.Call$Base.execute(Call.java:379)
at zipkin2.reporter.AsyncReporter$BoundedAsyncReporter.flush(AsyncReporter.java:286)
at zipkin2.reporter.AsyncReporter$Builder$1.run(AsyncReporter.java:190)
The timeout in AwaitableUnaryClientCallListener throws an IllegalStateException, which causes the AsyncReporter flushThread to stop sending spans.
Is it possible to use another exception type and to ignore the spans that cause the timeout instead of aborting the thread?
IOException is probably appropriate.
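To illustrate the suggestion, here is a minimal sketch of a listener whose await reports a timeout as a checked IOException rather than an IllegalStateException. AwaitSketch and everything in it are hypothetical names for illustration; this is not the actual AwaitableUnaryClientCallListener from zipkin-gcp.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical listener sketch; not the zipkin-gcp implementation. */
public class AwaitSketch {
  private final CountDownLatch closed = new CountDownLatch(1);
  private final long timeoutMs;

  public AwaitSketch(long timeoutMs) {
    this.timeoutMs = timeoutMs;
  }

  /** Called when the remote call completes. */
  public void onClose() {
    closed.countDown();
  }

  /** Blocks until onClose() or the timeout; a timeout surfaces as an IOException. */
  public void await() throws IOException {
    boolean done;
    try {
      done = closed.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt flag for the caller
      throw new InterruptedIOException("interrupted waiting for onClose");
    }
    if (!done) {
      throw new IOException("timeout waiting for onClose. timeoutMs=" + timeoutMs);
    }
  }
}
```

A caller that already treats IOException as a recoverable send failure could then drop the affected spans and continue with the next batch.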
We're also seeing this issue (Spring Boot 2.1.5; spring-cloud-gcp-starter-trace project; OpenJDK 11). Increasing the timeout to 60s has no effect.
Are there any known workarounds?
Thanks
@adriancole IllegalStateException is a runtime exception, though, and both IOException and RuntimeException seem to be handled in zipkin-reporter-java's AsyncReporter. But then IllegalStateException is handled specially and rethrown.
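The behavior described in this comment can be modeled with a toy flush loop. This is a sketch under assumptions, not the zipkin-reporter-java source: FlushLoopSketch, ToySender, and the log strings are all made up for illustration. It only shows why a rethrown IllegalStateException ends the loop while other failures merely drop one batch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Toy model of a flush loop; not the actual AsyncReporter code. */
public class FlushLoopSketch {
  /** Hypothetical stand-in for a span sender. */
  public interface ToySender {
    void send(int batch) throws IOException;
  }

  /** Logs and swallows most failures, but rethrows IllegalStateException. */
  public static void flush(ToySender sender, int batch, List<String> log) {
    try {
      sender.send(batch);
      log.add("sent " + batch);
    } catch (IOException | RuntimeException e) {
      log.add("dropped " + batch + ": " + e.getMessage());
      if (e instanceof IllegalStateException) throw (IllegalStateException) e;
    }
  }

  /** Flushes batches 1..batches the way a flush thread would and returns the log. */
  public static List<String> runLoop(ToySender sender, int batches) {
    List<String> log = new ArrayList<>();
    try {
      for (int i = 1; i <= batches; i++) {
        flush(sender, i, log);
      }
    } catch (IllegalStateException e) {
      // Models the uncaught exception that terminates the thread.
      log.add("flush thread exits");
    }
    return log;
  }
}
```

In this model, a sender that throws IOException on batch 2 still gets batch 3, whereas one that throws IllegalStateException on batch 2 never sees batch 3.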
Would you recommend special handling in zipkin-gcp to convert IllegalStateException to another type when sending spans? Alternatively, perhaps AsyncReporter does not need to rethrow every IllegalStateException?
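As a rough illustration of the first option, the conversion could be done by a wrapper around the sender. The sketch below uses a hypothetical ToySender interface rather than the real zipkin Sender API, and convertingIllegalState is an invented helper name.

```java
import java.io.IOException;

/** Hypothetical wrapper sketch; ToySender is not the real zipkin Sender API. */
public class ConvertingSenderSketch {
  /** Hypothetical stand-in for a span sender. */
  public interface ToySender {
    void send(String batch) throws IOException;
  }

  /** Wraps a sender so an IllegalStateException surfaces as an IOException. */
  public static ToySender convertingIllegalState(ToySender delegate) {
    return batch -> {
      try {
        delegate.send(batch);
      } catch (IllegalStateException e) {
        // Keep the original exception as the cause so the stack trace is not lost.
        throw new IOException(e.getMessage(), e);
      }
    };
  }
}
```

One trade-off of this approach is that it converts every IllegalStateException from the delegate, including any that were meant to signal a genuinely unusable sender, so a real implementation would probably need to be more selective.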
Agree that AsyncReporter shouldn't rethrow every IllegalStateException - sent openzipkin/zipkin-reporter-java#166 to fix