signalfx/splunk-otel-collector

cpu scraper metrics and attributes missing

neilharris123 opened this issue · 3 comments

I've been reading this documentation regarding the cpu scraper, the available metrics, and those metrics' attributes here.

I have the cpu scraper configured in our agent config:

receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
      disk:
      filesystem:
      memory:
      network:
      load:
      paging:
      processes:
...

processors:
  resourcedetection:
    detectors: [gcp, ecs, ec2, azure, system]
    override: true
...

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, otlp, signalfx, smartagent/signalfx-forwarder]
...

When I check the available metrics in SignalFX, I see system.cpu.time, but not system.cpu.utilization.
Also, when I check the state attribute of system.cpu.time, I only see idle, not any of the other 8 state values suggested in the documentation.
We have 300+ instances in our infra, so I find it very unlikely that every CPU is only reporting an idle state, and not others:

image

Is this a bug, or is there something missing from the suggested config in the documentation?

Which metric exporter are you using, and would you be able to share its redacted config? If you are using the signalfx exporter, there are default translations and exclusions that convert host metrics to Splunk IM conventions to support existing user and built-in content. We are working on documenting this better in our product docs. Unfortunately, there was an issue with system.cpu.time filtering that was fixed in the latest release.

Hi @rmfitzpatrick

Here is our metric exporter config:

exporters:
  # Metrics + Events
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    api_url: "${SPLUNK_API_URL}"
    ingest_url: "${SPLUNK_INGEST_URL}"
    sync_host_metadata: true
    correlation:

Please let me know if you need anything else, thanks!

Looks like many of the metrics I required were on the excluded list, as you suggested (cpu.wait, cpu.system, etc.), so I can configure the agent to include them as an alternative to relying on the attributes of system.cpu.time. I presume this would generate the same data. Thanks for your help; I probably would never have found the doc on the exclusions on my own.
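
For anyone landing here with the same problem, below is a minimal sketch of the two changes discussed in this thread, not a verified configuration. It assumes a recent collector version in which the signalfx exporter accepts an `include_metrics` list to override its default exclusions, and in which the hostmetrics cpu scraper treats `system.cpu.utilization` as an optional metric that has to be enabled explicitly. The metric names and states listed are only the ones mentioned above; whether `system.cpu.utilization` passes the exporter's default filtering unchanged has not been confirmed in this thread, so check the default exclusion list for your version.

```yaml
receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
        metrics:
          # Assumption: optional metric, disabled by default in recent versions.
          system.cpu.utilization:
            enabled: true

exporters:
  signalfx:
    access_token: "${SPLUNK_ACCESS_TOKEN}"
    api_url: "${SPLUNK_API_URL}"
    ingest_url: "${SPLUNK_INGEST_URL}"
    sync_host_metadata: true
    # Assumption: include_metrics re-enables metrics that the exporter
    # excludes by default. Extend the lists with whatever you need.
    include_metrics:
      # Translated Splunk IM style metrics mentioned in this issue.
      - metric_names: [cpu.wait, cpu.system]
      # Alternatively, keep selected states of the original OTel metric.
      - metric_name: system.cpu.time
        dimensions:
          state: [wait, system]
```

Including the translated metrics (cpu.wait, cpu.system) and including extra `state` values of system.cpu.time are two routes to similar data; picking one of them avoids sending the same information twice.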