Issues
dcgm-exporter missing many metrics after upgrade
#143 opened by huww98 - 10
Error watching fields: Profiling is not supported for this group of GPUs or GPU
#119 opened by motionlife - 9
- 6
dcgm-exporter high cpu usage
#133 opened by ysshaoxiao - 12
- 1
- 1
No labels with GPU-Card Name in dcgm-exporter
#91 opened by vizdrag - 12
Pods with dcgm-exporter fail to start
#120 opened by timClicks - 12
how prometheus get dcgm-exporter metrics?
#106 opened by Leteong - 3
dcgm-exporter doesn't start on Docker
#134 opened by gurapomu - 5
Grafana: GPU power total gauge, sum not useful
#109 opened by mjpieters - 4
Custom metrics issue
#122 opened by PaulYuanJ - 0
- 2
dcgm-exporter: DCP metrics not enabled
#140 opened by jelmd - 1
Erro start dcgm-exporter pod - module of DCGM that is not currently loaded
#141 opened by josericardomcastro - 1
dcgm-exporter crashes after MIG reconfiguration
#142 opened by kpouget - 0
nvmlShutdown dlcloses all handles every time
#139 opened by robertdavidsmith - 0
Allow helm chart to customize kubelet path
#137 opened by vdebergue - 0
Error checking GPU health: API version mismatch
#135 opened by noliaoliao - 4
K8s Pod/namespace information in exported fields
#129 opened by geoberle - 1
Failed to install gpu-helm-charts/dcgm-exporter
#132 opened by jasperzhong - 0
- 1
whether A100 mig is supported
#127 opened by zhcf - 0
FR: Expose option to use JSON logs
#128 opened by etherandrius - 1
Consider adding a 'nodename' label
#110 opened by mjpieters - 0
Exposed metrics don't follow Prometheus spec
#126 opened by etherandrius - 0
- 0
Error getting process info: Setting not configured
#116 opened by CermakM - 1
Unable to start according to the instructions
#112 opened by zkbutt - 0
- 0
DCGM Python3 bindings
#114 opened by Tabrizian - 0
dcgm-exporter POD CrashLoopBackOff or Error
#111 opened by Leteong - 1
Failed to initialize NVML
#90 opened by guleng - 2
Broken link in README
#101 opened by kshcherban - 1
- 0
dcgm-exporter POD CrashLoopBackOff or Error
#105 opened by Leteong - 0
nvml.h | Request for Fan Speed RPM (not percent) | NV_CTRL_THERMAL_COOLER_SPEED
#107 opened by berglh - 0
all my data is '0'
#104 opened by darkamumu - 0
Bare Metal | /run/prometheus/dcgm.prom Not Present
#103 opened by atulyadavtech - 2
- 0
- 1
Make the helm chart available via hub.helm.sh
#98 opened by patrungel - 0
dcgm-exporter POD cannot be running
#94 opened by nakkoh - 3
Option to pass hostname/ip along with port
#85 opened by bbelgodere - 0
method is not support test suit
#88 opened by cuisongliu - 1
TLS Support
#80 opened by RenaudWasTaken - 2
Fan status and card count requirements
#84 opened by guleng - 3
unknown flag
#81 opened by guleng - 7
dcgm exporter produces 404 page not found
#78 opened by bbelgodere