open-telemetry/opentelemetry-java

Lots of duplicate metric definitions for micrometer shim

jack-berg opened this issue · 5 comments

If you run a simple spring boot app with spring-boot-starter-actuator and with the micrometer1-shim, you get lots of duplicate metric definition warnings like:

14:57:23.351 [main] WARN  io.opentelemetry.sdk.metrics.internal.state.MetricStorageRegistry - Found duplicate metric definition: jvm.threads.states
        at unknown source
                To enable better debugging, run your JVM with -Dotel.experimental.sdk.metrics.debug=true
Causes
- Description [The current number of threads having RUNNABLE state] does not match [The current number of threads having NEW state]
- InstrumentDescription [The current number of threads having RUNNABLE state] does not match [The current number of threads having NEW state]
Original instrument registered with same name but is incompatible.
        at unknown source
                To enable better debugging, run your JVM with -Dotel.experimental.sdk.metrics.debug=true

14:57:23.352 [main] WARN  io.opentelemetry.sdk.metrics.internal.state.MetricStorageRegistry - Found duplicate metric definition: jvm.threads.states
        at unknown source
                To enable better debugging, run your JVM with -Dotel.experimental.sdk.metrics.debug=true
....

I enabled better debugging with -Dotel.experimental.sdk.metrics.debug=true and found that the issue seems to be an incompatibility between the micrometer definition of a metric description and the opentelemetry description. In micrometer it appears that each unique set of tags on an instrument can have its own description. Check out the JvmThreadMetrics for an example.

In opentelemetry an instrument is defined by its name, type, description, and unit. If you create an instrument with the same name but a different type, description, or unit than one previously created, you've created a semantic error and we log the warnings I've included on this ticket.

We should probably drop micrometer instrument descriptions to ensure we don't create these semantic errors. The unit can also differ for each set of tags, so we should consider dropping that too. However, in practice I didn't see any conflicts created by differing units.
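To make the conflict concrete, here is a minimal, self-contained sketch of the behavior described above. It is a toy model, not the SDK's actual MetricStorageRegistry: the `Descriptor` and `Registry` classes, the `"gauge"` type string, and the `"threads"` unit are all hypothetical placeholders. It registers one instrument per thread state under the same name, once with per-state descriptions and once with descriptions dropped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of instrument registration; NOT the OpenTelemetry SDK implementation. */
final class DuplicateDefinitionSketch {

    /** Hypothetical stand-in for an instrument identity: name, type, description, unit. */
    static final class Descriptor {
        final String name;
        final String type;
        final String description;
        final String unit;

        Descriptor(String name, String type, String description, String unit) {
            this.name = name;
            this.type = type;
            this.description = description;
            this.unit = unit;
        }

        /** Two descriptors with the same name are compatible only if every other field matches. */
        boolean compatibleWith(Descriptor other) {
            return type.equals(other.type)
                && description.equals(other.description)
                && unit.equals(other.unit);
        }
    }

    /** Hypothetical registry that records a warning for each conflicting registration. */
    static final class Registry {
        private final Map<String, Descriptor> byName = new HashMap<>();
        private final List<String> warnings = new ArrayList<>();

        void register(Descriptor candidate) {
            // Keep the first descriptor seen for a name; compare later ones against it.
            Descriptor existing = byName.putIfAbsent(candidate.name, candidate);
            if (existing != null && !existing.compatibleWith(candidate)) {
                warnings.add("Found duplicate metric definition: " + candidate.name);
            }
        }

        List<String> warnings() {
            return warnings;
        }
    }

    /** Registers one gauge per thread state under a single name, mimicking the reported case. */
    static List<String> registerThreadStates(boolean dropDescriptions) {
        Registry registry = new Registry();
        for (String state : new String[] {"NEW", "RUNNABLE", "BLOCKED"}) {
            String description = dropDescriptions
                ? ""
                : "The current number of threads having " + state + " state";
            registry.register(new Descriptor("jvm.threads.states", "gauge", description, "threads"));
        }
        return registry.warnings();
    }

    public static void main(String[] args) {
        System.out.println(registerThreadStates(false).size()); // prints 2
        System.out.println(registerThreadStates(true).size());  // prints 0
    }
}
```

With per-state descriptions, every registration after the first conflicts with the stored descriptor. With descriptions dropped, all three registrations share one identity and no warning is recorded.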

@jonatan-ivanov there's a mismatch between the otel metric data model and micrometer. In otel, a metric has a name, description, a unit, and aggregations for each distinct set of attributes. In micrometer, it appears that a metric has a name, and each set of tags (akin to attributes) may have its own description and unit.

We talked about this in the 4/14 Java SIG and were curious if there are any plans to change the 2.x micrometer API to associate description and unit with the instrument name, instead of allowing them to vary for each set of tags. If not, could it be considered?

Is there a short-term fix for this at the moment? Should I disable micrometer in the application's pom.xml, or enable anything special in the collector?

@jack-berg Based on user feedback, we decided to postpone 2.x indefinitely.
This can be "weird" for other registries as well, e.g.: Prometheus, where the description is for a metric group so if there are multiple descriptions (one for each time series), you will only see one (the first one I guess).

We can try to improve this by removing these dynamic parts from the micrometer instrumentation (e.g., the thread state metrics you mentioned), but this won't fix other instrumentation or cases where the user sets the description dynamically (though these cases should be rare).

@rafaribe @jack-berg A short term solution can be adding a MeterFilter that can change the description for the problematic Meters.

@jonatan-ivanov here's a list of all the duplicate metrics I found when running a spring boot application with the spring boot actuator and the micrometer shim:

Looks like it's not too bad after all! Just noisy logs.