Metric Name RegEx Replacement doesn't work with JMX
julianblack opened this issue · 2 comments
My goal, like others before me, is to replace all of the metrics on the Pushgateway each time a new instance of the application runs, i.e. with no reference to the application ID and no random integers generated in the metric names.
I achieved this by doing three things:
- Changing the `spark.metrics.namespace` to the app name.
- Using `*.sink.prometheus.metrics-name-capture-regex`, `*.sink.prometheus.metrics-name-replacement` to replace rand int and application id. FYI: `(application_.*?_.*?_|.*spark_streaming_.*?_.*?_.*?_.*?_.*?_.*?)(.+)`
- Using the `group-key` to remove the instance label.
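For readers who want to see what the capture regex in step 2 does to a name, here is a minimal Python sketch. It makes some assumptions: the replacement keeps only the second capture group (the issue does not quote the replacement value), the sample metric name is hypothetical, and Python's `re` stands in for whatever regex engine the sink actually uses.

```python
import re

# Capture regex quoted in the issue; group 1 matches the prefix to drop,
# group 2 matches the part of the metric name to keep.
CAPTURE_REGEX = re.compile(
    r"(application_.*?_.*?_|.*spark_streaming_.*?_.*?_.*?_.*?_.*?_.*?)(.+)"
)

# Assumption: the replacement keeps only group 2 (not stated in the issue).
REPLACEMENT = r"\2"


def rename_metric(name: str) -> str:
    """Return the name with the application id / streaming prefix removed.

    Names that do not match the capture regex are returned unchanged.
    """
    return CAPTURE_REGEX.sub(REPLACEMENT, name, count=1)


if __name__ == "__main__":
    # Hypothetical metric name, for illustration only.
    print(rename_metric("application_1589_0001_driver_jvm_heap_used"))
```

Because the lazy `.*?_` segments each stop at the next underscore, the first alternative strips exactly the `application_<x>_<y>_` prefix and leaves the rest of the name intact.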
The problem is that step 2 (`*.sink.prometheus.metrics-name-capture-regex`, `*.sink.prometheus.metrics-name-replacement`) only seems to work when JMX collection is disabled. Otherwise it is ignored.
JMX collection gives many valuable stats (for my application specifically, Hadoop S3A file system metrics and Kafka producer metrics).
Is this an oversight, a bug, or intended behavior?
I would much appreciate any replies.
That seems to be an oversight.
Have you tried to achieve the metric name replacement through the jmx exporter config you provided to the sink? (See, as an example, https://github.com/prometheus/jmx_exporter/blob/master/example_configs/spark-3-0.yml)
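To illustrate the idea behind that suggestion, the sketch below roughly emulates in Python how a jmx exporter style rule (a `pattern` with capture groups plus a `name` template using `$1`) could drop the application id from a name. It is not the exporter's code. The rule, the bean string, and the `apply_rule` helper are all hypothetical, and the final character clean-up is an assumption about how invalid characters are handled.

```python
import re
from typing import Optional

# Hypothetical rule, mirroring the `pattern` / `name` keys of a rule entry.
RULE = {
    "pattern": r"metrics<name=application_\d+_\d+\.(\S+)><>Value",
    "name": "spark_$1",
}


def apply_rule(bean: str, rule: dict) -> Optional[str]:
    """Return the renamed metric, or None if the rule does not match."""
    match = re.search(rule["pattern"], bean)
    if match is None:
        return None
    # Translate "$N" group references into Python's "\g<N>" template syntax.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", rule["name"])
    raw_name = match.expand(template)
    # Assumption: characters not valid in a metric name become underscores.
    return re.sub(r"[^a-zA-Z0-9_:]", "_", raw_name)


if __name__ == "__main__":
    # Hypothetical bean string, for illustration only.
    bean = "metrics<name=application_1589_0001.driver.jvm.heap.used><>Value"
    print(apply_rule(bean, RULE))
```

The point of the sketch is only that the application id is matched outside the capture group, so it never reaches the output name.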
Thanks @stoader. This is a good workaround that covers all of the replacement needs.