Metrics not available in Prometheus
domcar opened this issue · 4 comments
What did you do?
I deployed the CloudWatch exporter with Helm. Everything looks fine, and I also get the metrics:
# HELP aws_vpn_tunnel_state_average CloudWatch metric AWS/VPN TunnelState Dimensions: [VpnId] Statistic: Average Unit: None
# TYPE aws_vpn_tunnel_state_average gauge
aws_vpn_tunnel_state_average{job="aws_vpn",instance="",vpn_id="vpn-redacted",} 0.0 1675847100000
# HELP aws_rds_freeable_memory_average CloudWatch metric AWS/RDS FreeableMemory Dimensions: [DBInstanceIdentifier] Statistic: Average Unit: Bytes
# TYPE aws_rds_freeable_memory_average gauge
aws_rds_freeable_memory_average{job="aws_rds",instance="",dbinstance_identifier="redacted",} 1.23510784E8 1675847100000
aws_rds_freeable_memory_average{job="aws_rds",instance="",dbinstance_identifier="redacted",} 3.73026816E8 1675847100000
# HELP aws_rds_cpuutilization_average CloudWatch metric AWS/RDS CPUUtilization Dimensions: [DBInstanceIdentifier] Statistic: Average Unit: Percent
# TYPE aws_rds_cpuutilization_average gauge
aws_rds_cpuutilization_average{job="aws_rds",instance="",dbinstance_identifier="redacted",} 17.47470875485408 1675847100000
aws_rds_cpuutilization_average{job="aws_rds",instance="",dbinstance_identifier="redacted",} 17.02471625472909 1675847100000
The problem is that the metrics aren't shown in Prometheus. To be more precise, the metrics are available, meaning that Prometheus can scrape the target, but the query gives back "Empty results".
Other metrics, for example "cloudwatch_exporter_build_info", are shown correctly in Prometheus. Could the reason be that those values are gauges but there are actually two values instead of one? Can I change this?
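For example (an illustrative query, using one of the metric names above), this instant query returns an empty result even though the target shows as up:

aws_rds_cpuutilization_average{job="aws_rds"}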
Environment
- Exporter version: 0.15.0
Exporter configuration file
config: |-
  region: us-east-2
  period_seconds: 240
  metrics:
  - aws_dimensions: [VpnId]
    aws_metric_name: TunnelState
    aws_namespace: AWS/VPN
    aws_statistics: [Average]
  - aws_dimensions: [DBInstanceIdentifier]
    aws_metric_name: FreeableMemory
    aws_namespace: AWS/RDS
    aws_statistics: [Average]
  - aws_dimensions: [DBInstanceIdentifier]
    aws_metric_name: CPUUtilization
    aws_namespace: AWS/RDS
    aws_statistics: [Average]
The metrics will be visible in Prometheus if you look >10 minutes in the past (try the graph view).
This is an unfortunate result of a fundamental mismatch between CloudWatch and Prometheus. CloudWatch metrics converge over time: the value at time T can still change up to some later time T+dT. Prometheus, on the other hand, assumes that once it has scraped a sample, that is the truth, and the past does not change.
To compensate, by default the exporter delays fetching metrics: it only asks for data 10 minutes later, when almost all AWS services have converged, and it reports to Prometheus that the sample is from the past. Because Prometheus only looks back 5 minutes for an instant query, it never sees any data "now".
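You can see this with an instant query shifted into the past by more than the delay (illustrative; any of the metric names above works the same), so that the delayed samples fall inside Prometheus' 5-minute lookback window:

aws_rds_freeable_memory_average offset 15m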
@matthiasr I actually solved the problem by setting this to false in the configuration:
set_timestamp: false
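For anyone finding this later: if I read the exporter README correctly, set_timestamp can be set globally or per metric, so in the Helm values above it goes at the top level, next to region and period_seconds (a sketch of the relevant part):

config: |-
  region: us-east-2
  period_seconds: 240
  set_timestamp: false
  metrics:
  # ... metric blocks as above ...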
That works, but keep in mind that metrics will show up shifted by 10 minutes (or whatever you configured delay_seconds to be). That is, if your database CPU utilization spikes at 11:40, the spike will show up around 11:50 in Prometheus, which can make debugging difficult. Depending on the particular metrics you collect, you may be able to get away with a lower delay; this depends on the concrete AWS service and even the specific metric.
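For example, the delay can be lowered per metric rather than globally (a sketch; the 120-second value is an assumption you would have to validate against how quickly your RDS data actually converges):

metrics:
- aws_dimensions: [DBInstanceIdentifier]
  aws_metric_name: CPUUtilization
  aws_namespace: AWS/RDS
  aws_statistics: [Average]
  delay_seconds: 120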
Thanks a lot for the tip.