banzaicloud/spark-metrics

Filter metrics

dszakallas opened this issue · 10 comments

Is your feature request related to a problem? Please describe.

I would like to use the push-based metrics collection provided by this plugin alongside the direct scraping approach introduced in Spark 3.0, as the push-based version still provides benefits for some metrics. In order not to overload the push gateway, I want to filter for a select set of metrics (e.g. batch job metrics that have to be reliably scraped) instead of routing all of them.

Describe the solution you'd like to see

A new configuration parameter that can be used to filter metrics by namespace, or more generally, a regex pattern.
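For example, a single sink property along these lines (the property name here is hypothetical, just to illustrate the idea):

*.sink.prometheus.metrics-name-filter=^mycompany\.batch\..*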

Describe alternatives you've considered

  1. It would be better if this were a feature of Spark's MetricsSystem instead, so that arbitrary source-to-sink routing could be established in a plugin-agnostic manner. I've asked about this on the spark-users mailing list, but haven't received an answer so far. According to my investigation, Spark doesn't currently provide such a feature, and I don't know how long it would take to get this through.

  2. metrics-name-capture-regex could be altered so that it drops metrics that don't match the regex, while still fulfilling its original purpose of enabling rewriting of the contents of its capture groups. However, this would be a breaking change, as users currently expect non-matching metrics to be included too.

Additional context

@dszakallas IMHO the right approach is to leverage the metrics filtering support provided by Spark. Currently, all metrics are retained: https://github.com/banzaicloud/spark-metrics/blob/master/src/main/scala/com/banzaicloud/spark/metrics/sink/PrometheusSink.scala#L61

Users should be able to pass in the metric filter class implementation to be used (e.g. metrics-filter-class). This class would be instantiated and passed to ScheduledReporter's constructor.

This way the metric filtering becomes fully customizable.
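For reference, a metric filter is any class implementing the single-method com.codahale.metrics.MetricFilter interface. A minimal sketch (class name and prefix are illustrative only):

import com.codahale.metrics.{Metric, MetricFilter}

// Illustrative only: keep metrics under a fixed namespace prefix.
class NamespaceMetricsFilter extends MetricFilter {
  override def matches(name: String, metric: Metric): Boolean =
    name.startsWith("mycompany.")
}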

Seems like the way to go. Just to make sure I understood you correctly: the proposal is to instantiate a user-provided class via reflection and use it as the MetricFilter implementation?

Ideally we should be able to provide a way for the user to parametrize the class, e.g. in this particular case, different instances can provide different regexps. This would come in handy for me so that the same class could be used with the direct Prometheus sink with a different regexp, although I don't think that sink has any support for MetricFilters currently.

Spark has a general mechanism for instantiating metrics sinks (and in case of Spark 3, plugins) where each instance is named, and can be parametrized. Adding such a registry for metric filters is quite a disruptive change and should be a Spark feature.

A much more conservative way to go (it only affects this plugin, and doesn't require a registry) is to have a single unnamed metric filter locally for this plugin. In this case we look for a metrics-filter.class property and, if it is defined, search for properties matching metrics-filter.* and pass them to its constructor as Properties.

Of course, there's still the third option of leaving it unconfigurable and instantiating it with the default constructor. If the user wants different regexps, they need to find another means to do it, e.g. use different classes.

> Ideally we should be able to provide a way for the user to parametrize the class, e.g. in this particular case, different instances can provide different regexps. This would come in handy for me so that the same class could be used with the direct Prometheus sink with a different regexp, although I don't think that sink has any support for MetricFilters currently.

Yes, the Prometheus Sink currently doesn't have any support for MetricFilters.

> A much more conservative way to go (it only affects this plugin, and doesn't require a registry) is to have a single unnamed metric filter locally for this plugin. In this case we look for a metrics-filter.class property and, if it is defined, search for properties matching metrics-filter.* and pass them to its constructor as Properties.

The metric filter class should be a property of the sink, e.g.: *.sink.prometheus.metrics-filter-class. (Note: due to the way Spark resolves Sink properties, the last element of the config property key cannot contain a dot, hence metrics-filter-class.)

Metric filter implementations must implement the com.codahale.metrics.MetricFilter interface. The Prometheus Sink will instantiate the class specified in metrics-filter-class using reflection and pass it down to the reporter at https://github.com/banzaicloud/spark-metrics/blob/master/src/main/scala/com/banzaicloud/spark/metrics/sink/PrometheusSink.scala#L61

If no metrics-filter-class is provided in the config, then the MetricFilter.ALL filter should be used by default, as it is now.

Since metric filter implementations may require one or more configuration settings, we could pass the path to a config file into the constructor. This way different configuration formats can be supported (e.g. JSON, YAML, etc.).

If there is a *.sink.prometheus.metrics-filter-config-path property, then its value is passed into the constructor of the metrics filter; otherwise, the parameterless constructor is used.
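A rough sketch of that resolution logic inside the sink might look like the following (assuming the sink holds its configuration in a java.util.Properties named property, as Spark sinks receive it in their constructor; names are illustrative):

import com.codahale.metrics.MetricFilter

// Sketch: pick the filter class from the sink config, falling back to MetricFilter.ALL.
val filter: MetricFilter = Option(property.getProperty("metrics-filter-class")) match {
  case Some(className) =>
    val clazz = Class.forName(className)
    Option(property.getProperty("metrics-filter-config-path")) match {
      case Some(path) =>
        // Pass the config file path to a single-String-argument constructor.
        clazz.getConstructor(classOf[String]).newInstance(path).asInstanceOf[MetricFilter]
      case None =>
        clazz.getDeclaredConstructor().newInstance().asInstanceOf[MetricFilter]
    }
  case None => MetricFilter.ALL
}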

> Note: due to the way Spark resolves Sink properties, the last element of the config property key cannot contain a dot, hence metrics-filter-class.

Interesting, I didn't know that.

> Since metric filter implementations may require one or more configuration settings, we could pass the path to a config file into the constructor. This way different configuration formats can be supported (e.g. JSON, YAML, etc.).

It's a bit cumbersome to require a separate file to be deployed in the cluster for configuring the metric filter. I would prefer it to be passed as configuration options. How about passing properties matching s/metrics-filter-(.*)/\1/, i.e. using a leading dash instead of a dot?

Can you expand on how the Prometheus Sink configuration would look?

metrics-filter-class would be the implementing class's name. If provided, then properties matching metrics-filter-(.*) are passed to the constructor as a Map[String, String] with unqualified keys. For example:

*.sink.prometheus.metrics-filter-class=mycompany.RegexMetricsFilter
*.sink.prometheus.metrics-filter-regex=mycompany\.(.*)
*.sink.prometheus.metrics-filter-ignore-case=true

will result in something like this (using reflection, of course):

new RegexMetricsFilter(Map(
  "regex" -> "mycompany\\.(.*)",
  "ignore-case" -> "true"
))
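For completeness, the sink-side collection of these properties might be sketched like this (again assuming property is the sink's java.util.Properties, as in the reflection sketch above):

import scala.collection.JavaConverters._

// Sketch: gather metrics-filter-* properties, stripping the prefix from the keys.
val FilterPrefix = "metrics-filter-"
val filterConf: Map[String, String] = property.stringPropertyNames().asScala
  .filter(k => k.startsWith(FilterPrefix) && k != FilterPrefix + "class")
  .map(k => k.stripPrefix(FilterPrefix) -> property.getProperty(k))
  .toMap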

I was thinking something along the lines of:

*.sink.prometheus.metrics-filter-class=mycompany.RegexMetricsFilter
*.sink.prometheus.metrics-filter-arg-regex=mycompany\.(.*)
*.sink.prometheus.metrics-filter-arg-ignore-case=true

where the part after -arg- denotes constructor parameters (regex could be mapped to the constructor parameter called regex, while ignore-case is mapped to ignoreCase; but your initial proposal of passing all of these as a Map would work as well).

If we pass them as args, we have to match the types, names and order of the arguments, i.e. open up a great big can of worms. The Map approach would defer this task to the implementation. Spark uses the latter for configuring plugins, so we'd stay fairly conventional with this approach.

I also find the arg prefix superfluous, and the name itself is not very descriptive when using a property map.

Alright, let's go with the Map approach and drop the arg prefix.
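With that settled, a hypothetical user-side filter consuming the map (matching the example configuration above) could look like:

import java.util.regex.Pattern
import com.codahale.metrics.{Metric, MetricFilter}

// Hypothetical implementation: the "regex" and "ignore-case" keys arrive unqualified.
class RegexMetricsFilter(conf: Map[String, String]) extends MetricFilter {
  private val flags =
    if (conf.get("ignore-case").contains("true")) Pattern.CASE_INSENSITIVE else 0
  private val pattern = Pattern.compile(conf("regex"), flags)
  override def matches(name: String, metric: Metric): Boolean =
    pattern.matcher(name).matches()
}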

OK, I think everything is clear. I'll draft a PR real quick.