opstree/druid-exporter

Exporter overrides metric values when more than one task runs in parallel

Closed this issue · 3 comments

Currently the exporter exports metrics with these dimensions:
cluster, datasource, environment, host, instance, job, metric_name, service
However, some of the metrics coming from Druid contain additional dimensions that are being ignored.

For example, "ingest/events/processed" is emitted separately for each task and carries dimensions such as taskId and taskType.

In these cases, if the metrics received from Druid include multiple records with the same values for datasource, host, metric_name, and service but a different task ID (for example), only the last record is kept and the rest are ignored.

For example, given these metrics:
[
{
"feed": "metrics",
"timestamp": "2021-06-28T08:15:37.162Z",
"service": "druid/historical",
"host": "",
"version": "2021.05.1-iap",
"metric": "query/wait/time",
"value": 12,
"dataSource": "dataSource",
"duration": "PT7689600S",
"hasFilters": "true",
"id": "66dc5e27-8c0c-4666-8445-54b89313b9ce",
"numComplexMetrics": "0",
"numDimensions": "2",
"numMetrics": "3",
"sqlQueryId": "query1",
"type": "groupBy"
},
{
"feed": "metrics",
"timestamp": "2021-06-28T08:15:37.163Z",
"service": "druid/historical",
"host": "",
"version": "2021.05.1-iap",
"metric": "query/wait/time",
"value": 25,
"dataSource": "dataSource",
"duration": "PT7689600S",
"hasFilters": "true",
"id": "defb292c-f395-4e26-8fd6-00d8d99211d3",
"numComplexMetrics": "0",
"numDimensions": "2",
"numMetrics": "3",
"sqlQueryId": "query2",
"type": "groupBy"
}
]
The record from the exporter will look like this:
druid_emitted_metrics{cluster="cso", datasource="datasource", environment="2-development", host="", instance="druid-exporter", job="druid", metric_name="query-wait-time", service="druid-historical"} | 25

This is not correct: there should be a separate series for each metric record from Druid, or at least the value should be an aggregation of those records.
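
To make the collision concrete, here is a minimal Python sketch of the behaviour described above. It is not the exporter's actual code (the exporter is written in Go); the function names `collect` and `collect_sum` are hypothetical, and the two records are the ones from this issue reduced to the relevant fields. It assumes gauge-like semantics where the last value written for a label set wins.

```python
# The two records from this issue, reduced to the fields that matter here.
records = [
    {"service": "druid/historical", "host": "", "metric": "query/wait/time",
     "dataSource": "dataSource", "id": "66dc5e27-8c0c-4666-8445-54b89313b9ce",
     "value": 12},
    {"service": "druid/historical", "host": "", "metric": "query/wait/time",
     "dataSource": "dataSource", "id": "defb292c-f395-4e26-8fd6-00d8d99211d3",
     "value": 25},
]

# Dimensions that the exporter takes from the Druid event (per the issue).
BASE_LABELS = ("dataSource", "host", "metric", "service")


def collect(records, label_names):
    """Hypothetical model: the last value written for a label set wins."""
    series = {}
    for rec in records:
        key = tuple(rec[name] for name in label_names)
        series[key] = rec["value"]  # overwrites any earlier record with the same key
    return series


def collect_sum(records, label_names):
    """Hypothetical alternative: aggregate records that share a label set."""
    series = {}
    for rec in records:
        key = tuple(rec[name] for name in label_names)
        series[key] = series.get(key, 0) + rec["value"]
    return series


overwritten = collect(records, BASE_LABELS)          # one series, value 25
per_record = collect(records, BASE_LABELS + ("id",))  # two series, 12 and 25
summed = collect_sum(records, BASE_LABELS)           # one series, value 37

print(overwritten)
print(per_record)
print(summed)
```

With only the base labels, both records map to the same key and the first value (12) is lost. Adding the distinguishing dimension keeps both series, at the cost of more label combinations; summing keeps one series without dropping data.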

Note: you can get the aggregated value from the "druid_emitted_metrics_histogram_sum" metric by calculating the increase or rate of its value over time.
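
As a rough illustration of that workaround, the sketch below computes a simplified increase and per-second rate from scraped samples of a cumulative sum. The sample timestamps and values are made up for illustration, and the arithmetic is deliberately naive: Prometheus's own increase() and rate() functions also handle counter resets and extrapolate to the window boundaries, which this sketch does not.

```python
def increase(samples):
    """Simplified increase: last value minus first value in the window.

    samples: list of (unix_seconds, value) pairs for a cumulative series.
    Assumes no counter reset inside the window.
    """
    (_, first_value), (_, last_value) = samples[0], samples[-1]
    return last_value - first_value


def rate(samples):
    """Simplified per-second rate over the window covered by the samples."""
    (first_ts, _), (last_ts, _) = samples[0], samples[-1]
    return increase(samples) / (last_ts - first_ts)


# Hypothetical scrapes of the cumulative sum: the two observations from the
# issue (12 and 25) are added on top of an assumed starting value of 100.
samples = [(0, 100.0), (30, 112.0), (60, 137.0)]

window_increase = increase(samples)  # 12 + 25 = 37, nothing is lost
window_rate = rate(samples)          # 37 / 60 per second

print(window_increase, window_rate)
```

Because the sum accumulates every observed value, both records contribute to the result even though the plain metric only shows the last one.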

If we have multiple EC2 instances, do we need to run the Druid exporter on each machine and add each one to the Prometheus config?

Will be fixed in #120

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.