NVIDIA/gpu-monitoring-tools

DCGM_FI_DEV_GPU_UTIL Abnormal Output

Jea-Eok-Kim opened this issue · 0 comments

Hello
I want to set up GPU monitoring in Grafana using dcgm-exporter and Prometheus, but the GPU utilization value is reported incorrectly.

The Grafana panel shows the following:

[Screenshot, 2021-01-25 9:52:46 AM]

As the screenshot shows, GPU utilization is reported as 9223372036854776000%.
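As a quick sanity check on that number (a sketch, not a diagnosis): 9223372036854776000 is how the largest signed 64-bit integer, 2^63 - 1, is typically displayed after it has been rounded to a 64-bit float. That would be consistent with the exporter emitting a placeholder "no data" value rather than a real measurement, but that interpretation is an assumption on my part. The helper name `looks_like_int64_max` is made up for illustration.

```python
# Value shown in the Grafana panel, copied by hand.
reported = 9223372036854776000

# Largest signed 64-bit integer.
INT64_MAX = 2**63 - 1


def looks_like_int64_max(value: float) -> bool:
    """Return True if value, stored as a 64-bit float, equals INT64_MAX
    after INT64_MAX is rounded to a 64-bit float.

    Hypothetical helper for illustration only; it says nothing about
    why such a value was emitted.
    """
    return float(value) == float(INT64_MAX)


# INT64_MAX is not exactly representable as a double; it rounds up to 2**63.
print(float(INT64_MAX) == 2.0**63)     # True
print(looks_like_int64_max(reported))  # True
print(looks_like_int64_max(97))        # False: a plausible utilization reading
```

If this check holds for the samples in Prometheus, the panel is most likely plotting a sentinel rather than a utilization percentage.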

The output of the dcgmi dmon -e 203 command (field 203 is DCGM_FI_DEV_GPU_UTIL) is as follows:

[Screenshot, 2021-01-25 10:03:10 AM]

The MIG enablement status is as follows:

[Screenshot, 2021-01-25 9:49:55 AM]

Please let me know if there is a way to obtain GPU utilization when MIG is enabled.

Please help me.