launchdarkly/java-server-sdk

Record exception in evaluation error

mightyguava opened this issue · 5 comments

Is your feature request related to a problem? Please describe.
When an evaluation fails due to an exception (like a bug in this library), the only thing surfaced to the caller via the *Detail methods is that an exception occurred. There is a separate log statement by the library that records the Java type of the exception, with no stacktrace. The stacktrace can be logged by enabling debug logs, which requires a server restart. Enabling debug logs isn't really feasible on production at our volume, but sometimes these bugs are only easily reproducible on production.

Describe the solution you'd like

EvaluationReason should contain the full exception that caused the failure. It can be in a new subclass of EvaluationReason

Describe alternatives you've considered
... or just another method on EvaluationReason.Error. Or, an option to invoke the get*() methods that allow the exception to propagate up rather than being swallowed. We prefer to handle the exceptions ourselves, and have been using the *Detail() methods to simulate it, but unfortunately they don't contain the stacktrace.

Additional context
We are seeing an NPE coming from this library in production, and debugging it has not been easy. (This is the real bug I'd like to fix, but until there's meaningful debugging information filing a bug probably won't be helpful).

This seems reasonable, but I'm slightly confused by the statement that it's not feasible to enable debug logging. What implementation of SLF4J are you using? In most logging frameworks it should be possible to turn on DEBUG level only for a single class (in this case LDClient) rather than across the entire SDK, and as far as I can tell, LDClient does not log any debug-level messages under any non-error conditions.

Not feasible as in the service under question is currently generating nearly 1TB of logs per day, and enabling debug logging would not go well with the folks who maintain our logging cluster. I can see about turning on DEBUG logs for this specific class, but with the layers of abstractions in our service, it's likely it'll be much faster and a lot less work to just make the change here.

Sorry for the delay - we had some release issues. The 4.11.0 release is out now which includes this feature (method getException() on EvaluationReason.Error).

Thank you! I will upgrade and try to capture the error to reproduce, and file a new bug later.

@mightyguava We are also still trying to track down what the NPE might have been about, based on what you said in the support ticket, but the stacktrace will probably be the biggest clue to that.