utkuozdemir/nvidia_gpu_exporter

[Feature Request] Memory Usage per GPU Process

Opened this issue · 4 comments

We recommend adding RAM usage visualization by GPU processes to your product, to help users monitor the performance of processes more completely and accurately.

Can I contribute this feature to the project?

Thank you for the recommendation and the offer. Can you elaborate on what you mean by adding RAM usage? Are you talking about the GPU memory usage, or the system memory? And do you mean only the processes that are using the GPU (like using nvidia-smi --query-compute-apps)?

The reason I'm asking is that I want to keep this project focused on monitoring Nvidia GPU-related stuff only. For everything else there are much better solutions, like the node exporter.

I am only referring to the GPU RAM. I want to add a metric to monitor the processes using GPU RAM, like the command:
nvidia-smi --query-compute-apps=pid,process_name,used_gpu_memory --format=csv
That would make it possible to monitor GPU memory usage in more detail.
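To make the idea concrete, here is a rough sketch of how the CSV output of that command could be turned into a per-process gauge in the Prometheus text format. It is only an illustration in Python, not the exporter's code. The sample output is made up, the metric name nvidia_smi_process_used_gpu_memory_bytes and the helper names are hypothetical, and it assumes the memory column looks like "1024 MiB" (rows with any other value, such as "[N/A]", are skipped).

```python
import csv
import io

# Made-up sample in the shape printed by:
#   nvidia-smi --query-compute-apps=pid,process_name,used_gpu_memory --format=csv
# It was not captured from a real machine.
SAMPLE = """pid, process_name, used_gpu_memory [MiB]
1234, python, 1024 MiB
5678, /usr/bin/blender, 512 MiB
"""

MIB = 1024 * 1024


def parse_compute_apps(text):
    """Parse the CSV text into a list of dicts, one per GPU process."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = [[cell.strip() for cell in row] for row in reader if row]
    procs = []
    for row in rows[1:]:  # rows[0] is the header line
        if len(row) != 3:
            continue
        pid, name, mem = row
        try:
            # Assumption: the value looks like "1024 MiB".
            used_bytes = int(mem.split()[0]) * MIB
        except (ValueError, IndexError):
            continue  # e.g. "[N/A]" when the driver reports no value
        procs.append({"pid": int(pid), "process_name": name,
                      "used_gpu_memory_bytes": used_bytes})
    return procs


def escape_label(value):
    """Escape backslashes and double quotes for a Prometheus label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_prometheus(procs, metric="nvidia_smi_process_used_gpu_memory_bytes"):
    """Render the parsed processes as gauge samples (hypothetical metric name)."""
    lines = ["# TYPE " + metric + " gauge"]
    for p in procs:
        labels = 'pid="{}",process_name="{}"'.format(
            p["pid"], escape_label(p["process_name"]))
        lines.append("{}{{{}}} {}".format(
            metric, labels, p["used_gpu_memory_bytes"]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_prometheus(parse_compute_apps(SAMPLE)))
```

One thing to keep in mind with this approach is that using the PID as a label creates a new time series for every process that ever touches the GPU, so the label set may need some thought before it goes into the exporter.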

I see, thank you, you are very welcome to contribute. Just please keep in mind that it might take quite some time for me to review/give feedback on it, so it might need some patience (see the maintenance status warning on the README).