SAP/cf-java-logging-support

Add `exception_type` field to the logs

KaiHofstetter opened this issue · 4 comments

Feature Request

Please add an exception_type field to the logs that contains the Java exception type of the logged stack trace (e.g. java.lang.NullPointerException) whenever a stack trace is logged.
This would make it easy to monitor uncaught exception types independently of the monitoring/observability tool in use (e.g. OpenSearch).

Use Case

We monitor the uncaught exceptions raised in our project because we want to know whether any occurred and how many of each type, e.g. java.lang.NullPointerException.

An uncaught exception is logged in the stacktrace field, e.g.

"java.lang.NullPointerException: Cannot invoke \"String.replace(java.lang.CharSequence, java.lang.CharSequence)\" because \"value\" is null",
...
"\tat org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)",
"\tat org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:793)",
"\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)",
"\tat org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:763)",
"\tat org.springframework.aop.interceptor.AsyncExecutionInterceptor.lambda$invoke$0(AsyncExecutionInterceptor.java:115)",
"\tat java.base/java.util.concurrent.FutureTask.run(Unknown Source)",
"\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
"\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
"\tat java.base/java.lang.Thread.run(Unknown Source)"

The first line of the stacktrace field contains the exception type, e.g. java.lang.NullPointerException.
We monitor these exception types via OpenSearch dashboards.

Our current approach is to use a scripted field in OpenSearch to extract the exception type from the stack trace:

def stacktrace = params._source['stacktrace'];

if (stacktrace != null && stacktrace.size() > 0) {
    def m = /\s*((?:\w*\.)*\w*Exception).*/m.matcher(stacktrace[0]);
    if (m.matches()) {
        return m.group(1);
    }
}

The scripted field uses a regular expression to extract the exception type from the first line of the stacktrace field.
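To make it easier to check what this expression matches, here is a minimal Java sketch of the same extraction. The class and method names (StackTraceTypeExtractor, extract) are hypothetical and exist only for illustration. The pattern is copied from the scripted field above, without the /m flag, which makes no difference because the pattern contains no ^ or $ anchors.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the scripted field in plain Java.
public class StackTraceTypeExtractor {

    // Same expression as in the scripted field: an optional package prefix
    // followed by a simple name that ends in "Exception".
    private static final Pattern EXCEPTION_TYPE =
            Pattern.compile("\\s*((?:\\w*\\.)*\\w*Exception).*");

    // Returns the exception type found at the start of the given line, if any.
    static Optional<String> extract(String firstStackTraceLine) {
        if (firstStackTraceLine == null) {
            return Optional.empty();
        }
        Matcher m = EXCEPTION_TYPE.matcher(firstStackTraceLine);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // prints Optional[java.lang.NullPointerException]
        System.out.println(extract("java.lang.NullPointerException: \"value\" is null"));
        // prints Optional.empty, because the type name does not end in "Exception"
        System.out.println(extract("java.lang.OutOfMemoryError: Java heap space"));
    }
}
```

The second call shows a limitation of the text-based approach: throwables whose class name does not end in "Exception" are not picked up by this pattern.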

This approach has several disadvantages:

  • Everyone who wants to monitor exception types has to re-implement this kind of extraction in their own logging environment.

  • Scripted fields are evaluated on the fly at request time and are inherently slower than regular fields.
    Because a regular expression has to be evaluated here, requests time out when longer evaluation periods are selected in OpenSearch (e.g. > 7 days).

  • The scripted field has to use the params._source parameter in order to get the stack trace entries in their original order.
    The params._source parameter is not available for filters, so it is not possible to filter by exception_type.

Thanks for submitting this issue. As I understand it, you want to add throwable.getClass().getName() of the encountered throwable/exception, don't you? At least, this is what the OpenTelemetry Java Agent would do in its Logback Appender. Would it make sense to add the message of the throwable as well, while we are at it? With that approach, both the class name and the message would only be taken from the first exception of the stack trace. Would that be sufficient for you?

Yes, the field should contain throwable.getClass().getName().
Sorry, I forgot that the library has access to the throwable object and doesn't need to do any "regex magic".

Yes, it would actually make sense to add throwable.getMessage() in a separate field as well.
This would allow us to monitor/count unique exceptions (with different messages).
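As a rough illustration of what was agreed here, the following sketch derives both values directly from a Throwable. It is not the library's implementation: the class ExceptionFields, the method fieldsFor and the field name exception_message are assumptions made for this example; only exception_type is the name requested in this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of cf-java-logging-support.
public class ExceptionFields {

    // Field name requested in this issue.
    static final String EXCEPTION_TYPE = "exception_type";
    // Assumed name for the message field; the issue does not specify one.
    static final String EXCEPTION_MESSAGE = "exception_message";

    // Builds the extra log fields for the outermost throwable only.
    // Causes further down the chain are deliberately ignored.
    static Map<String, String> fieldsFor(Throwable throwable) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (throwable == null) {
            return fields;
        }
        fields.put(EXCEPTION_TYPE, throwable.getClass().getName());
        String message = throwable.getMessage();
        if (message != null) {
            fields.put(EXCEPTION_MESSAGE, message);
        }
        return fields;
    }

    public static void main(String[] args) {
        Throwable wrapped = new IllegalStateException("wrapper",
                new NullPointerException("value is null"));
        // prints {exception_type=java.lang.IllegalStateException, exception_message=wrapper}
        System.out.println(fieldsFor(wrapped));
    }
}
```

Because only the outermost throwable is inspected, a wrapped NullPointerException is reported under the type of its wrapper, which matches the "first exception of the stack trace" behavior described above.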

@KaiHofstetter: Please check whether #161 meets your requirements.