banzaicloud/spark-metrics

Metric Name RegEx Replacement doesn't work with JMX

julianblack opened this issue · 2 comments

My goal, like others before me, is to have each new run of the application overwrite its metrics on the pushgateway, i.e. the metric names should contain no reference to the application id and no randomly generated integers.

I achieved this by doing three things:

  1. Changing the spark.metrics.namespace to the app name.
  2. Using *.sink.prometheus.metrics-name-capture-regex and *.sink.prometheus.metrics-name-replacement to strip the random integer and the application id. FYI, the capture regex is: (application_.*?_.*?_|.*spark_streaming_.*?_.*?_.*?_.*?_.*?_.*?)(.+)
  3. Using the group-key to remove the instance label
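To illustrate what the regex in step 2 is meant to do, here is a minimal Python sketch. It is not the sink's code: the replacement value (keep only the second capture group), the function name, and the sample metric names are all assumptions made for illustration, and Python's `re` stands in for the JVM regex engine the sink would use.

```python
import re

# Capture regex quoted in step 2 of this issue.
CAPTURE_REGEX = re.compile(
    r"(application_.*?_.*?_|.*spark_streaming_.*?_.*?_.*?_.*?_.*?_.*?)(.+)"
)

# Assumed replacement: keep only the second capture group.
# (In a JVM properties file this would typically be written as $2.)
REPLACEMENT = r"\2"


def rewrite_metric_name(name: str) -> str:
    """Strip the application id / streaming prefix from a metric name."""
    return CAPTURE_REGEX.sub(REPLACEMENT, name)


# Hypothetical metric names, for illustration only.
print(rewrite_metric_name("application_1600000000000_0001_driver_jvm_heap_used"))
print(rewrite_metric_name("myapp_spark_streaming_a_b_c_d_e_lastCompletedBatch"))
print(rewrite_metric_name("jvm_heap_used"))
```

Names that match neither alternative are left unchanged, so the rewrite only touches metrics that carry the application id or the streaming prefix.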

The problem is that step 2 (*.sink.prometheus.metrics-name-capture-regex, *.sink.prometheus.metrics-name-replacement) only seems to work when JMX collection is disabled. Otherwise it is ignored.

JMX collection gives many valuable stats (for my application specifically, Hadoop S3A file system metrics and Kafka producer metrics).

Is this an oversight, a bug, or intended behavior?

Any replies are much appreciated.

That seems to be an oversight.

Have you tried to achieve the metric name replacement through the JMX exporter config you provided to the sink? (See, as an example, https://github.com/prometheus/jmx_exporter/blob/master/example_configs/spark-3-0.yml)
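A rough sketch of what such a rule could look like, in the style of the linked example. The pattern, the output name, and the assumption that the metrics are exposed as MBeans in the `metrics` domain with names of the form `<namespace>.driver.<metric>` are illustrative only; check the actual MBean names in your application before using anything like this.

```yaml
# Hypothetical fragment of the JMX collector config passed to the sink.
rules:
  # Assumed MBean layout: metrics<name=<namespace>.driver.<metric>><>Value
  # The namespace prefix is matched but not carried into the output name.
  - pattern: 'metrics<name=\S+\.driver\.(\S+)><>Value'
    name: spark_driver_$1
```

The rewrite then happens inside the JMX collector itself, so it does not depend on the sink's metrics-name-capture-regex setting.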

Thanks @stoader. This is a good workaround that covers all of the replacement needs.