beam-telemetry/telemetry_metrics

Remove event/name/measurement inference

bryannaegele opened this issue · 9 comments

One of the biggest sticking points to adopting Metrics I have heard over time is understanding how it correlates to telemetry events when presented with the shorthand definition. Potential users have frequently told me that correlating how a metric definition that takes a string translates to an event (list of atoms) and a measurement (key in a map). When presented with a definition composing a string name for the metric, a list of atoms for the event it is triggered by, and an atom for which measurement key to use, they immediately connect the dots.

I propose that we deprecate and subsequently remove support for the inference mechanism and explicitly require name, event, and measurement. I believe this would reduce confusion to potential users, increase adoption, and remove some magic. We've gone by a general philosophy of erring on the side of requiring explicit definition by users in other libraries beam-telemetry/telemetry#61 (comment). I feel this library would benefit from erring on the side of explicit in this case.

There are two things to consider here:

  1. We have been explicit on the publishing side of things but I think we can take some liberties on the consuming side of things. It doesn’t mean the liberties we took today are good - I’m just pointing out there are two distinct sides

  2. I would propose we first discuss and introduce a new API before we consider deprecating the old one. I would also like a place where we can discuss this feedback directly with users because our current syntax is one of the first things we discuss in our docs but perhaps people are getting to learn about metrics through other means. For example, would a comment on the generated Phoenix telemetry file help?

I agree that discoverability might a problem here.

For example, would a comment on the generated Phoenix telemetry file help?

+1 to that!

And yes, the current API is not completely clear, so introducing a more explicit alternative sounds like a good idea. One thing to keep in mind, though, is that we might add extra confusion because it's not clear which version of the API is used.

We can consider an approach where we keep the current shape of the API, but log a warning if event_name and measurement options are not passed to the constructor. With time, we will make them required.

I think the option format would be too verbose. I would rather support counter([:my_app, :request], :system_time) or similar but I am not sure if that is any better.

Since event_name and measurement are required and principal arguments here, I think it's better to move it to the top level arg, instead of making them as overridable from options.

Also I think it would be better to not generate name - so move it to options , and reporter handle it as needed (e.g. rebuild "short name" from event name and measurement.

Here are example of my proposal:

# current doc
counter(
  "http.request.count",
  tags: [:controller, :action]
)

# proposal - a tuple of event name and measurement since they are together
counter(
  {[:my_app, :request], :system_time},
  tags: [:controller],
  name: "http.request.count"
)

# proposal - event name and measurement as separate args
counter(
  [:my_app, :request],
  :system_time,
  tags: [:controller],
  name: "http.request.count"
)

This is a little bit verbose, but at least it does not "auto transformation" of string into two things (event name and measurement)... and also it allows reporters to choose their own convention for building metric name, or users to specify it (see beam-telemetry/telemetry_metrics_prometheus_core#31)

I kind of like the 3-arity version. It is somewhat similar to the current implementation, but with two explicit arguments instead of one that's being split. But if the name is optional and reporter-specific, I'd argue to move it to :reporter_options.

Still, not clear to me how we would handle the transition. @josevalim what do you think would be the best approach, if we decide to make any change?

I like the three arity version too, clear enough for me. But I think that, if we go down this route, the name should be removed from metrics as it is not really needed, then it goes to reporter_options after all. So I don't think it would address the issue that brought @chulkilee here.

If telemetry metrics struct is to capture all info for reporter, then what about moving reporter options to the top level argument? Or we may extract metrics opts and then treat all remaining opts as reporter opts. The latter pattern is common in libs (e.g. ecto to get ecto opts and the pass to other deps such as db_connection or db adapter). I see there are pros and cons though for each approach 😉

Or if the struct is mainly to capture only metric definition, and all reporter specific option is "extra" info just happened to be embedded here - what about passing {metric_struct, reporter_opts} to reporters if additional opts is needed for reporter? This is somewhat similar to supervisor children list where its item is a value (module) or a tuple for additional side opt.

Just giving some food for thoughts 😃

Folks, shall we go ahead and close this? While the three arity version looks nice, the library has been out for so long, that I don't think we can deprecate the current syntax. So we either have two ways of doing it or we keep the current one, and I am more inclined towards the latter. Thoughts?

This was probably only actionable 3 years ago, to your point. We could revisit with a 2.0 if we wanted to proceed.