avast/datadog4s

Timers are registread as a Counters in statsD

dfurmans opened this issue ยท 11 comments

Describe the bug
The timer is stored as a counter entity

To Reproduce
Steps to reproduce the behavior:

          val interceptResponseTimeExecution: IO[F[Tag]] = factoryResource.use { factory =>
            val timer = factory.timer.distribution("latency", timeUnit = TimeUnit.MILLISECONDS)
            val measureTime: IO[Response] = timer.time(
              IO.pure(generateResponse(req, matches)), Tag.of("CCCC", "BBBB")
            )
            for {
              executionTime <- measureTime           // intercept required time execution
            } yield {
              val aTag  = serializeResponse(executionTime) + "\n"
              aTag.pure[F]                      // outside effect change
            }

Expected behavior
Should be stored as a timer entry

Additional context
It is stored as a counter entity

 echo "counters" | nc localhost 8126
{
  'statsd.bad_lines_seen': 0,
  'statsd.packets_received': 0,
  'statsd.metrics_received': 0,
  'response.200': 0,
  'response.auth.200': 0,
  'my-app-name.hits': 0,
  'my-app-name.latency;success=true;CCCC=BBBB': 0
}
END

The same for Histogram

Hello, thanks for the report.

Are you sure this is the issue? I believe timers (regardless on whether based on timer or distribution or histogram contain multiple metrics:
my-app-name.latency which is indeed a counter, but it should also contain my-app-name.latency.95thpercentile for example

What issue is this causing you? This is handled by datadog java library so i suspect it should behave correctly

Hi @tomasherman !
Thanks for further documentation and references.

Looks like there is an issue for record any other types of metrics except counter.
Does not exist my-app-name.latency.95thpercentile

Another issue that I have found - there is a little mismatch in documentation about Timers

 Timers provide you with .time[A](fa: F[A]): F[A] method

Does not exist such a method. There is another one, where as a second parameter we have Tags* - and here is my question - how we should construct a Tag for such kind of timers โ“

Sorry it should be called 95percentile

Regarding the time method, you are correct that it contains a vararg tags section, but you should be able to use it as if it didn't. Basically, you should be able to call it without providing the tags, e.g. timer.time(F.delay(...)) should work fine

closing this, as this is unlikely an issue (that i can solve anyway) ... feel free to reopen for more discussions / ideas ;)

Thanks @tomasherman for all your time and input.
We found a bug that is related to TimeUnit.MILLISECONDS conversion in com.avast.datadog4s.api.metric.ElapsedTime.durationInstance

The conversions from NANOSECONDS to MILLISECONDS does not work correctly.
image

Please consider the example Scala REPL Snippet example:
scala> val s = java.time.Duration.ofNanos(194702) val s: java.time.Duration = PT0.000194702S
scala> s.toMillis val res9: Long = 0
Which results as 0.

I will take a look tomorrow, thanks for the report

@dfurmans Hey Dawid, please verify that i understand the issue you reported correctly. Your code ran for 194702 nano seconds, which is 194 microseconds, which is 0.19 milliseconds.

I'm not quite sure how to tackle this. What would be the expected behaviour for you? 0.19? I think what you should do in this case is to use TimeUnit.MICROSECONDS or TimeUnit.NANOSECONDS to indicate the code that you need finer precision than millies. Would that make sense for you?

Otherwise, i would need to update the code to handle doubles instead of longs. It might be possible but i will need some time to think about the consequences of that

Your code ran for 194702 nano seconds, which is 194 microseconds, which is 0.19 milliseconds.

That is correct!
We are strongly recommend to use a higher precision rate - here as you pointed our correctly, with the current implementation
we are loosing the precision after unit type change (in our case TimeUnit.MICROSECONDS)
With the TimeUnit.NANOSECONDS the problem does not exist.

It might be possible but i will need some time to think about the consequences of that

It is okey for us. Maybe this is related to the JVM version as well - not sure about that, just in case the version of Scala and Java that we found this issue is Scala 2.13.3 (OpenJDK 64-Bit Server VM, Java 11.0.2)

As a workaround we are gonna to use a manual time recording as

val (response, timeExecution) = ???  
timer.recordTime(timeExecution, TimeUnit.MILLISECONDS)

Thanks in advance!

It is okey for us. Maybe this is related to the JVM version as well - not sure about that, just in case the version of Scala and Java that we found this issue is Scala 2.13.3 (OpenJDK 64-Bit Server VM, Java 11.0.2)

I don't think it is - it's a result of using java Duration api which does return long and therefore Duration of 194702ns can not be represented as 0.19ms (it has to be long, not double)

One thing you could also do is when creating the timer using the TimerFactory forcing it to use NANOSECONDS as default time unit - see nanoTimer in the user guide example here: https://github.com/avast/datadog4s/blob/master/site/docs/userguide.md#example

This will cause all metrics being sent to be sent with nanosecond precision. Of course, you will need to update the metric settings in datadog app so that it represents the metric correctly (as nanoseconds, not miliseconds)

I am not sure im making myself clear, i hope i do :) I will try to put it a different way - when creating the timer, you can choose the precision with which it will report the durations. By default it's milliseconds. But if your operations are super fast and miliseconds are not finegrained enough, you can set it to micro or nano seconds

I'm going to close this issue. I don't think there are any actions that need to happen on my side. If that's not correct, please feel free to reopen or open a new one.